[gdal-dev] OSM driver with z_order values

Even Rouault even.rouault at spatialys.com
Mon Nov 10 07:51:42 PST 2014

On Monday 10 November 2014 16:35:04, Jukka Rahkonen wrote:
> Even Rouault <even.rouault <at> spatialys.com> writes:
> > Jeff,
> > 
> > I've implemented the above idea in latest trunk. I've translated the
> > osm2pgsql rules (*) for the lines layer. Let me know if that works OK. The
> > implementation of the mechanism is rather generic, so it could
> > potentially be used to do many other things.
> Hi,
> I ran some quick tests to see the effects of disabling insert triggers
> and of creating the new derived z_order field.
> Environment: Windows 7 64-bit, Intel Core i7-3770 3.4 GHz, rotational disks
> Test data: Finland-latest.osm.pbf 2014-11-10 (184 MB)
> Command:
> ogr2ogr -f sqlite -dsco spatialite=yes finland.sqlite
> finland-latest.osm.pbf --config OGR_SQLITE_SYNCHRONOUS OFF --config
> GDAL before r27936: 99 seconds
> GDAL with r27936: 74 seconds
> GDAL with r27936 + create z_order: 86 seconds
> Same as above + SPATIAL_INDEX=YES: 197 seconds
> Extra test:
> GDAL with r27936, no z_order, OSM_MAX_TMPFILE_SIZE 2500 MB: 66 seconds
> Heap memory usage with default OSM_MAX_TMPFILE_SIZE: 280 MB
> CPU usage: Task Manager showed 12 % across 8 threads, but the i7
> "hyperthreads" all seemed to be idle. That is still only 25% CPU load if
> counted by the 4 real cores.

Yes, all the processing is single-threaded. Making the OSM driver
multi-threaded would be a non-trivial effort, and only worthwhile with very
fast storage. A less complicated possibility could be to move the reading part
and the writing part into dedicated threads in ogr2ogr.
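
To make the reader/writer split concrete, here is a minimal sketch of the pattern in plain Python. It is not ogr2ogr code: the names reader, writer and translate are hypothetical, "features" are just arbitrary Python objects, and the bounded queue size is an assumption chosen for illustration.

```python
import queue
import threading

SENTINEL = object()  # marks the end of the feature stream


def reader(source, q):
    """Producer: stands in for the thread that would decode the input."""
    for feature in source:
        q.put(feature)  # blocks while the queue is full
    q.put(SENTINEL)


def writer(q, sink):
    """Consumer: stands in for the thread that would write the output."""
    while True:
        feature = q.get()
        if feature is SENTINEL:
            break
        sink.append(feature)


def translate(source, max_pending=1000):
    """Pass every item of `source` through a bounded queue, keeping order."""
    q = queue.Queue(maxsize=max_pending)
    sink = []
    threads = [
        threading.Thread(target=reader, args=(source, q)),
        threading.Thread(target=writer, args=(q, sink)),
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return sink
```

The bounded queue keeps memory use flat when the writer is the slower side, which is the situation the timings in this thread suggest.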

> Conclusions:
> - Disabling the insert triggers was a good thing to do
> - Creating z_order for linestrings is pretty fast
> - 56% of the total time with Spatialite is spent creating spatial
> indexes if they are needed (as they usually are)
> - It is hard to believe in huge improvements in the speed of
> PBF->SQLite/Spatialite/GeoPackage conversion with GDAL any more because
> SQLite begins to be the slowest part.
> - Compared to the speed of what GDAL is doing, the creation of the spatial
> index inside the Spatialite db starts to feel, if not sluggish, at least like
> something that could perhaps be faster. However, indexes are created only
> once, and spending 10 seconds more or less on that is not of big
> importance.
> Ideas:
> - It is irritating to know that the CPU runs 75% idle. I must try to find a
> computer with an SSD for some further tests.
> - For saving a few more seconds when there is free RAM available, could it
> be possible to tell GDAL to use SQLite memory db as a target and move it on
> disk as the very last step after running the slow CreateSpatialIndex
> requests which probably need to do both reads and writes from disk? I could
> save 8 seconds by increasing the size of the internal SQLite db from the
> default 100 MB to 2500 MB and I somehow feel that there is more to save
> from the 111 seconds which are now spent for creating spatial indexes.
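
For reference, the differences mentioned above follow directly from the reported timings. The short calculation below only restates the numbers from the test runs; the variable names are mine, and it assumes the SPATIAL_INDEX=YES run differs from the z_order run only by the index creation.

```python
# Wall-clock seconds reported for each test run.
before_r27936 = 99
with_r27936 = 74
with_z_order = 86
with_spatial_index = 197
with_big_tmpfile = 66

trigger_saving = before_r27936 - with_r27936     # gain from disabling insert triggers
z_order_cost = with_z_order - with_r27936        # extra time for the z_order field
index_cost = with_spatial_index - with_z_order   # time spent creating spatial indexes
index_share = index_cost / with_spatial_index    # fraction of the total run
tmpfile_saving = with_r27936 - with_big_tmpfile  # gain from OSM_MAX_TMPFILE_SIZE 2500 MB

print(trigger_saving, z_order_cost, index_cost, tmpfile_saving)
print(f"{index_share:.0%}")
```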

Instead of outputting to finland.sqlite, try /vsimem/finland.sqlite. Of course 
you will have no file at the end, but it would be a good simulation of the 
potential performance gains (you should add to that number the time to copy a 
file from memory to disk).
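
Outside of GDAL, the "build in memory, then move to disk as the last step" idea can be tried with Python's standard sqlite3 module. This is only a sketch under assumptions: the points table, the index name and the function name are hypothetical, and an ordinary B-tree index stands in for the Spatialite spatial index, so it says nothing about what CreateSpatialIndex would actually gain.

```python
import sqlite3


def build_in_memory_then_save(rows, path):
    """Fill and index a table in RAM, then copy the whole database to `path`."""
    mem = sqlite3.connect(":memory:")
    mem.execute("CREATE TABLE points (id INTEGER PRIMARY KEY, x REAL, y REAL)")
    mem.executemany("INSERT INTO points (x, y) VALUES (?, ?)", rows)
    # Ordinary B-tree index used as a stand-in for the spatial index step.
    mem.execute("CREATE INDEX idx_points_xy ON points (x, y)")
    mem.commit()

    disk = sqlite3.connect(path)
    mem.backup(disk)  # copy the in-memory database into the file (Python 3.7+)
    disk.close()
    mem.close()
```

Timing the function as a whole, and the backup call on its own, gives the two numbers to add together: the in-memory build time and the memory-to-disk copy time.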

> -Jukka Rahkonen-
> _______________________________________________
> gdal-dev mailing list
> gdal-dev at lists.osgeo.org
> http://lists.osgeo.org/mailman/listinfo/gdal-dev

Spatialys - Geospatial professional services
