[gdal-dev] OSM Driver and World Planet file (pbf format)

Rahkonen Jukka Jukka.Rahkonen at mmmtike.fi
Wed Aug 1 07:38:49 PDT 2012


Hi,

Right, the temporary DB was written on the system disk, so I repeated the test. Now everything was certainly done on the same USB 2.0 disk (reading the pbf, writing the results to Spatialite, and handling the temporary DB). It took a bit longer, but the difference was not very big: 26 minutes vs. 19 minutes when the temporary DB was on the system disk.

CPU stays close to 100% throughout the conversion on the single-processor XP machine. The same thing happens on a laptop with a two-core processor running 32-bit Vista: both cores seem to burn at full power.

I made a rough parallelization test by making 4 copies of finland.osm.pbf and running ogr2ogr in four separate windows. This way the total CPU load of the 8 cores stayed around 50%.
Result: all four conversions were ready after 3 minutes (45 seconds per conversion), while a single conversion takes 2 minutes.
Conclusion: 4 parallel conversions in 3 minutes is much faster than the 8 minutes the same work would take as serial runs. The 50% CPU load may mean that the speed of the SATA disk is now the limiting factor. A test with an SSD drive should give more information about this.
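
For reference, the parallel runs were launched from separate Windows command windows; a batch file along these lines would do the same (the file and database names are illustrative, and each run writes its own output database):

start "run1" ogr2ogr -f SQLite -dsco spatialite=yes finland1.sqlite finland1.osm.pbf -gt 20000 -progress --config OGR_SQLITE_SYNCHRONOUS OFF
start "run2" ogr2ogr -f SQLite -dsco spatialite=yes finland2.sqlite finland2.osm.pbf -gt 20000 -progress --config OGR_SQLITE_SYNCHRONOUS OFF
start "run3" ogr2ogr -f SQLite -dsco spatialite=yes finland3.sqlite finland3.osm.pbf -gt 20000 -progress --config OGR_SQLITE_SYNCHRONOUS OFF
start "run4" ogr2ogr -f SQLite -dsco spatialite=yes finland4.sqlite finland4.osm.pbf -gt 20000 -progress --config OGR_SQLITE_SYNCHRONOUS OFF

Each start command opens its own window and returns immediately, so the four conversions run concurrently.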

I also had a try with 6 parallel runs, but that was slower: all runs were ready in 6 minutes, which makes 60 seconds per conversion. With 8 runs the computer jammed when all the progress bars were at 95%.

The result was not a surprise, because my experience of doing image processing with gdalwarp and gdal_translate on an 8-core server is that I get the maximum throughput from our hardware by running processes in 4-6 windows. If too many image conversions are going on, they start to disturb each other because the disks cannot serve them all properly. However, that computer behaves better when running conversions in 6 or more windows than this laptop does. Somehow it feels like the laptop has only 4 real processors/cores even though the resource manager shows eight, which would match a four-core CPU with Hyper-Threading.

I believe it is hard for a parallelized conversion program to take the juice from all the cores as effectively as separate processes do.

It may be difficult to feed the rendering chain from a bunch of source databases, but it strongly looks like splitting Germany into four distinct OSM source files would make it possible to import the whole country in 15 minutes with a good laptop. The size of the OSM planet file is under 20 GB, about 15 times the German extract, so a simple calculation (15 x 15 minutes, plus some margin) suggests that importing the whole planet might be possible in 5 hours. With a laptop. Who will give it a try?

-Jukka-

 Even Rouault wrote:
 
> According to Rahkonen Jukka <Jukka.Rahkonen at mmmtike.fi>:
> 
> Interesting results. I'll wait a bit for your tests with SSD before turning
> OSM_COMPRESS_NODES to YES by default. Even if it doesn't bring clear
> advantages, I don't think it would hurt a lot, because the extra CPU load
> introduced by the compression/decompression shouldn't be that high (the
> compression algorithm used is just Protocol Buffer encoding of the
> differences in longitude/latitude between consecutive nodes, in chunks of
> 64 nodes).
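> 
> As a rough illustration of the idea (the numbers are invented for the
> example): three consecutive nodes at longitudes 24.1234567, 24.1234570
> and 24.1234580 can be stored as one full value followed by the deltas
> +3 and +10 in units of 1e-7 degree, and such small signed integers
> encode very compactly as Protocol Buffer varints.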
> 
> Just a word of caution to remind you that the temporary node DB will be
> written in the directory pointed to by the CPL_TMPDIR config option / env.
> variable if it is defined; if not, in TMPDIR; if not, in TEMP; and if none
> of them is defined, in the current directory from which ogr2ogr is started.
> On Windows systems the TEMP env. variable is generally defined, so when you
> test with your USB external drive, it is very likely that the node DB is
> written in the temporary directory associated with your Windows account.
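> 
> If you want to force the node DB onto the external drive, you can point
> CPL_TMPDIR there explicitly before the conversion, for example (the E:\tmp
> path is only an illustration):
> 
> set CPL_TMPDIR=E:\tmp
> ogr2ogr -f SQLite -dsco spatialite=yes finland.sqlite finland.osm.pbf
> 
> or, equivalently, pass --config CPL_TMPDIR E:\tmp on the ogr2ogr command
> line.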
> 
> As far as CPU load is concerned, the conversion is a single-threaded
> process, so on an 8-core system it is expected to top out at 100 / 8 = 12.5%
> of the global CPU power. With which hardware configuration and input PBF
> file do you manage to reach 100% CPU? Is that load constant during the
> process? I imagine that it could change according to the stage of the
> conversion.
> There might be some potential for parallelizing things. What comes to mind
> for now would be PBF decoding (when profiling only the PBF decoding, the
> gzip decompression is the major CPU user, but I am not sure it matters that
> much in a real-life ogr2ogr job) or way resolution (currently, we group
> ways into a batch until 1 million nodes or 75 000 ways have to be resolved,
> which leads to more efficient searches in the node DB since we can sort the
> nodes by increasing ids and avoid useless seeks). But it is not obvious
> that this would be straightforward or would lead to increased efficiency.
> Parallelization also generally requires more RAM if you need work buffers
> for each thread.
> 
> > Even Rouault wrote:
> > >
> > >
> > > > Another set of tests with a brand new and quite powerful laptop.
> > > >  Specs for the
> > > > computer:
> > > > Intel i7-2760QM @ 2.4 GHz processor (8 threads)
> > > > Hitachi Travelstar Z7K320 7200 rpm SATA disk
> > > > 8 GB of memory
> > > > Windows 7, 64-bit
> > > >
> > > > GDAL-version r24717, Win64 build from gisinternals.com
> > > >
> > > > Timings for germany.osm.pbf (1.3 GB)
> > > > ====================================
> > > >
> > > > A) Default settings with command
> > > > ogr2ogr -f sqlite -dsco spatialite=yes germany.sqlite
> > > > germany.osm.pbf -gt 20000 -progress --config
> > > > OGR_SQLITE_SYNCHRONOUS OFF
> > > >
> > > > - reading the data               67 minutes
> > > > - creating spatial indexes       38 minutes
> > > > - total                         105 minutes
> > > >
> > > > B) Using an in-memory temporary DB for the first step, enabled with
> > > > SET OSM_MAX_TMPFILE_SIZE=7000
> > > >
> > > > - reading the data              16 minutes
> > > > - creating spatial indexes      38 minutes
> > > > - total                         54 minutes
> > > >
> > > > Peak memory usage during this conversion was 4.4 GB.
> > > >
> > > > Conclusions
> > > > ===========
> > > > * The initial reading of the data is heavily i/o bound. This phase
> > > > is really fast if there is enough memory for keeping the OSM
> > > > tempfile in memory, but an SSD disk seems to offer equally good
> > > > performance.
> > > > * Creating the spatial indexes for the Spatialite tables is also
> > > > i/o bound. The hardware sets the speed limit and there are no other
> > > > tricks for improving the performance. The multi-core CPU is quite
> > > > idle during this phase, with a 10-15% load.
> > > > * If the user does not plan to do spatial queries, it may be handy
> > > > to save some time and create the Spatialite db without spatial
> > > > indexes by using the -lco SPATIAL_INDEX=NO option (see the example
> > > > command after this list).
> > > > * Windows disk i/o may be a limiting factor.
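> > > >
> > > > As an example, command A above without spatial indexes would look
> > > > like this (the only change is the added layer creation option):
> > > > ogr2ogr -f sqlite -dsco spatialite=yes germany.sqlite
> > > > germany.osm.pbf -gt 20000 -progress --config
> > > > OGR_SQLITE_SYNCHRONOUS OFF -lco SPATIAL_INDEX=NO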
> > > >
> > > > I consider that for small OSM datasets the speed starts to be good
> > > > enough. For me it makes little difference whether converting the
> > > > Finnish OSM data (137 MB in .pbf format) takes 160 or 140 seconds,
> > > > which is what the default settings and the in-memory temporary
> > > > database give, respectively.
> > >
> > > Interesting findings.
> > >
> > > A SSD is of course the ideal hardware to get efficient random access
> > > to the nodes.
> > >
> > > I've just introduced in r24719 a new config. option
> > > OSM_COMPRESS_NODES that can be set to YES. The effect is to use a
> > > compression algorithm while storing the temporary node DB. This can
> > > compress by a factor of 3 or 4, and it helps keep the node DB below
> > > the size of the RAM, so that the OS can dramatically cache it (at
> > > least on Linux). This can be efficient for OSM extracts of the size
> > > of a country, but probably not for a planet file. In the case of
> > > Germany and France, here's the effect on my PC (SATA disk) :
> > >
> > > $ time ogr2ogr -f null null
> > > /home/even/gdal/data/osm/france_new.osm.pbf
> > > -progress --config OSM_COMPRESS_NODES YES [...]
> > > real    25m34.029s
> > > user    15m11.530s
> > > sys 0m36.470s
> > >
> > > $ time ogr2ogr -f null null
> > > /home/even/gdal/data/osm/france_new.osm.pbf
> > > -progress --config OSM_COMPRESS_NODES NO [...]
> > > real    74m33.077s
> > > user    15m38.570s
> > > sys 1m31.720s
> > >
> > > $ time ogr2ogr -f null null /home/even/gdal/data/osm/germany.osm.pbf
> > > -progress --config OSM_COMPRESS_NODES YES [...]
> > > real    7m46.594s
> > > user    7m24.990s
> > > sys 0m11.880s
> > >
> > > $ time ogr2ogr -f null null /home/even/gdal/data/osm/germany.osm.pbf
> > > -progress --config OSM_COMPRESS_NODES NO [...]
> > > real    108m48.967s
> > > user    7m47.970s
> > > sys 2m9.310s
> > >
> > > I didn't turn it to YES by default, because I'm unsure of the
> > > performance impact on SSD. Perhaps you have a chance to test.
> >
> > I cannot test with an SSD before the weekend, but otherwise the new
> > configuration option really makes a difference in some circumstances.
> >
> > I have ended up using the following base command in the speed tests:
> > ogr2ogr -f SQLite -dsco spatialite=yes germany.sqlite germany.osm.pbf
> > -gt 20000 -progress --config OGR_SQLITE_SYNCHRONOUS OFF
> > -lco SPATIAL_INDEX=NO
> >
> > Writing into Spatialite is pretty fast with these options, and even
> > your null driver does not seem to be much faster. What happens after
> > this step (like creating the indexes) has nothing to do with the OSM
> > driver.
> >
> > Test with an Intel i7-2760QM @ 2.4 GHz processor, a 7200 rpm SATA disk
> > and the 1.3 GB input file 'germany.osm.pbf':
> > --config OSM_COMPRESS_NODES NO
> > 67 minutes
> > --config OSM_COMPRESS_NODES YES
> > 15 minutes
> >
> > That means 52 minutes less time, or about 4.5 times the speed.
> > Out of curiosity I tried what happens if I do the whole file
> > input/output on a 2.5" external USB 2.0 drive:
> > 19 minutes!
> >
> > I also made a few tests with an old and much slower Windows computer.
> > Running osm2pgsql with the Finnish OSM data on that machine nowadays
> > takes about 3 hours.
> >
> > Test with a single Intel Xeon @ 2.4 GHz processor and the same external
> > USB 2.0 disk as in the previous test:
> > Input file 'finland.osm.pbf', 122 MB
> > Result: 7 minutes with both OSM_COMPRESS_NODES NO and
> > OSM_COMPRESS_NODES YES
> > Input file 'germany.osm.pbf', 1.3 GB
> > Result: 112 minutes with OSM_COMPRESS_NODES YES
> >
> > Conclusions:
> > * When the input file grows past the point where the OS can no longer
> > cache the node DB properly, the compress_nodes setting can have a huge
> > effect.
> > * When i/o works well, the temporary database storage does not have a
> > great effect on the overall speed. An in-memory db, an SSD drive and a
> > fast SATA drive are all equally fast, and even an external USB 2.0
> > drive is almost as fast.
> > * With a small input file the compress_nodes setting does not seem to
> > have much effect on speed.
> > * Obviously the CPU is now the limiting factor.
> > * The OGR OSM driver also runs pretty well on older computers.
> > * While the single-processor PC runs at 100% CPU load, the 8-core PC
> > shows only 12%. There seems to be 88% of the computing power left free
> > for saturating the memory and SSD buses sometime in the future...
> > * Instead of buying expensive hardware like an SSD, the lucky ones may
> > meet someone like Even who wants to, and can, write fast programs.
> >
> > I found my first notes from less than two weeks ago, when I believed
> > that the OSM driver was fast. Now I repeated those tests with the same
> > data and the same computers, and the progress I can see is
> > non-negligible. The result in these tests is a Spatialite db with
> > spatial indexes.
> >
> > finland.osm.pbf      40 minutes -> 7 minutes
> > germany.osm.pbf      15 hours   -> 50 minutes
> >
> > -Jukka Rahkonen-
> >
> >


