Generating graphs consumes too much memory on large datasets


#1

I’m trying to set up a Directions API using openrouteservice following the Docker instructions. Everything seems to work fine when using a relatively small dataset, like https://download.geofabrik.de/europe/germany/berlin.html (around 54MB).

However, I’m always getting an OutOfMemoryError when using a larger dataset, e.g. https://download.geofabrik.de/europe/dach.html (3.8GB):

java.lang.OutOfMemoryError: Java heap space - problem when allocating new memory. Old capacity: 1048576, new bytes:105795008, segmentSizeIntsPower:20, new segments:101, existing:1

I’m working on a machine with 16GB RAM. I followed the readme instructions and modified the -Xmx parameter of JAVA_OPTS in docker-compose.yml:

JAVA_OPTS=-Djava.awt.headless=true -server -XX:TargetSurvivorRatio=75 -XX:SurvivorRatio=64 -XX:MaxTenuringThreshold=3 -XX:+UseConcMarkSweepGC -XX:+UseParNewGC -XX:ParallelGCThreads=4 -Xms16g -Xmx16g

Is there something else I can do to avoid running into this memory issue with large datasets? Or is it simply not possible to run datasets this large on a 16GB RAM machine?

Thanks in advance!


#2

Huiii, no, that’s way too little RAM for Europe. Also, never set the Java heap settings to all your available RAM.

You can run single countries or even multiple countries on that hardware, depending on the countries. A good estimate is: PBF size * 1.5-2 for RAM. And don’t use more than 70-80% of your available RAM if it’s a laptop that also runs your OS with a GUI and all.
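
To make the rule of thumb above concrete, here is a minimal Python sketch of the arithmetic. The function names are made up for illustration, and the factors are just the 1.5-2 and 70-80% figures from this post. Actual memory needs will vary with the openrouteservice configuration, so treat the output as a rough guide only.

```python
def estimate_ram_gb(pbf_size_gb, low_factor=1.5, high_factor=2.0):
    """Rough RAM range in GB: PBF size times 1.5 to 2 (rule of thumb from the thread)."""
    return pbf_size_gb * low_factor, pbf_size_gb * high_factor


def ram_budget_gb(total_ram_gb, low_share=0.7, high_share=0.8):
    """RAM range in GB to hand out on a machine that also runs an OS with a GUI (70-80%)."""
    return total_ram_gb * low_share, total_ram_gb * high_share


if __name__ == "__main__":
    # Figures taken from the thread: a 3.8GB PBF and a 16GB machine.
    need_low, need_high = estimate_ram_gb(3.8)
    budget_low, budget_high = ram_budget_gb(16)
    print(f"estimated need: {need_low:.1f}-{need_high:.1f} GB")
    print(f"usable budget:  {budget_low:.1f}-{budget_high:.1f} GB")
```

Whatever range comes out, the -Xms/-Xmx values in JAVA_OPTS should stay below the usable budget rather than match the machine's total RAM.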


#3

Thanks for the estimation on RAM usage - that’s helpful! Despite setting Java heap to 16GB, the running Docker container never exceeded ~4GB. Maybe there is some OS-enforced limit? By default, Docker does not have any resource limits (https://docs.docker.com/config/containers/resource_constraints/) and I didn’t specify otherwise. Just wondering if there might be another issue since I tried it out with a smaller dataset (< 1GB) and still there was the same OutOfMemoryError.

Maybe a little bit more background information: I don’t actually need all of the D-A-CH dataset, just around 5 specific cities in that area. Plus, directions should only be calculated within a specific city, i.e. calculating directions from city A to city B is not required.

-> Is it possible to get a separate dataset for each city, merge them (maybe with https://osmcode.org/osmium-tool/) and use them all together for the use case described above? Is that a valid approach, or will I run into other problems with it?

Thanks again.


#4

Do the merge with osmium, you won’t run into trouble with that. Get your PBFs from geofabrik for the cities you need (might involve some extracting by bounding box).

Of course you won’t be able to route from city to city with that approach, but routing within each city won’t be a problem.
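
As a rough sketch of the extract-and-merge workflow described above, the Python snippet below only builds and prints the osmium command lines; it does not run them. It uses `osmium extract --bbox` and `osmium merge` from osmium-tool. The file names and bounding boxes are hypothetical placeholders (the coordinates are only approximate), so replace them with the cities you actually need.

```python
import shlex


def extract_cmd(source_pbf, bbox, out_pbf):
    """Build an `osmium extract` command; bbox is (left, bottom, right, top) in degrees."""
    left, bottom, right, top = bbox
    return ["osmium", "extract", "--bbox", f"{left},{bottom},{right},{top}",
            source_pbf, "-o", out_pbf]


def merge_cmd(input_pbfs, out_pbf):
    """Build an `osmium merge` command that combines several PBF files into one."""
    return ["osmium", "merge", *input_pbfs, "-o", out_pbf]


if __name__ == "__main__":
    # Hypothetical inputs: source file names and approximate bounding boxes.
    cities = {
        "berlin": ("germany-latest.osm.pbf", (13.08, 52.33, 13.77, 52.68)),
        "vienna": ("austria-latest.osm.pbf", (16.18, 48.11, 16.58, 48.33)),
    }
    outputs = []
    for name, (source, bbox) in cities.items():
        out = f"{name}.osm.pbf"
        outputs.append(out)
        print(shlex.join(extract_cmd(source, bbox, out)))
    print(shlex.join(merge_cmd(outputs, "cities-merged.osm.pbf")))
```

The printed lines can be pasted into a shell, or each list can be passed to `subprocess.run(cmd, check=True)` once osmium-tool is installed. The merged file is then the PBF you point openrouteservice at.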