[gdal-dev] New OGR driver to read OpenStreetMap .osm / .pbf files

Even Rouault even.rouault at mines-paris.org
Mon Jul 23 03:27:27 PDT 2012

On Monday 23 July 2012 00:09:05, Jukka Rahkonen wrote:
> Even Rouault <even.rouault <at> mines-paris.org> writes:
> > > > However, select with SQL feels sub-optimal.
> > > 
> > > Yes, when you use ogr2ogr with explicit layer names, there are
> > > optimizations. For example, when you only specify the layer 'points',
> > > the OSM driver will not even try to index the nodes into the temporary
> > > database because it is not needed. However, as you noticed, there is
> > > not yet any optimization when a SQL request is specified.
> > 
> > Optimization for SQL request added in r24690
> I had a try with r24696 today, downloaded from
> http://gisinternals.com/sdk/Download.aspx?file=release-1500-gdal-mapserver.zip
> Filtered commands give me errors. An example:
> ogr2ogr -f "ESRI Shapefile" test germany.osm.pbf multipolygons -gt 20000
> -progress --config OGR_SQLITE_SYNCHRONOUS OFF -where "natural='forest'"
> ERROR 1: Failed inserting node 420797898: database schema has changed

I could reproduce this too, and I suppose that the sequence of errors began 
with: "ERROR 1: Failed inserting node XXXX: I/O error"

It turned out that the mechanism that transfers the temporary in-memory DB 
file to disk when it gets too big had been broken by a previous commit. The DB 
therefore stayed in RAM until it no longer fit, hence the error. Should be 
fixed now with r24699.
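
To illustrate the kind of mechanism described above, here is a minimal Python 
sketch of an in-memory SQLite database that is moved to a disk file once it 
grows past a threshold. This is not the driver's actual code: the class name, 
the table layout and the row-count threshold are all made up for the example.

```python
import os
import sqlite3
import tempfile

MAX_ROWS_IN_MEMORY = 1000  # hypothetical threshold, for illustration only


class SpillingNodeStore:
    """Keeps nodes in an in-memory SQLite DB and moves it to disk once too big."""

    def __init__(self, max_rows=MAX_ROWS_IN_MEMORY):
        self.max_rows = max_rows
        self.rows = 0
        self.path = None  # stays None while the DB lives in RAM
        self.conn = sqlite3.connect(":memory:")
        self.conn.execute(
            "CREATE TABLE nodes (id INTEGER PRIMARY KEY, lon REAL, lat REAL)"
        )

    def insert(self, node_id, lon, lat):
        self.conn.execute("INSERT INTO nodes VALUES (?, ?, ?)", (node_id, lon, lat))
        self.rows += 1
        if self.path is None and self.rows >= self.max_rows:
            self._spill_to_disk()

    def _spill_to_disk(self):
        # Create an empty temporary file and copy the in-memory DB into it.
        fd, self.path = tempfile.mkstemp(suffix=".sqlite")
        os.close(fd)
        disk = sqlite3.connect(self.path)
        self.conn.commit()      # finish the pending transaction first
        self.conn.backup(disk)  # copies every page of the in-memory DB
        self.conn.close()
        self.conn = disk        # later inserts and lookups go to the disk file

    def get(self, node_id):
        return self.conn.execute(
            "SELECT lon, lat FROM nodes WHERE id = ?", (node_id,)
        ).fetchone()

    def close(self):
        self.conn.close()
        if self.path is not None:
            os.remove(self.path)
```

If the switch to disk never happens, every insert keeps landing in RAM, which 
is the failure mode described above.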

I'm experimenting with removing the internal use of SQLite for the temporary 
database and replacing it with something custom. Actually, it won't replace it 
completely in all cases, but it could definitely be used in well-behaved cases 
where the elements in the .osm/.pbf are listed in increasing id order, which 
is the case for Geofabrik files, for example. The first results seem to show 
increased performance.
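
To show why increasing id order helps, here is a rough Python sketch of one 
possible approach: append nodes to flat arrays and look them up by binary 
search, with no index to maintain. It is only an illustration, not the 
design being experimented with in the driver; the class and method names are 
hypothetical.

```python
from array import array
from bisect import bisect_left


class SortedNodeIndex:
    """Append-only node store that relies on ids arriving in increasing order."""

    def __init__(self):
        self.ids = array("q")   # 64-bit signed node ids
        self.lons = array("d")
        self.lats = array("d")

    def append(self, node_id, lon, lat):
        """Store a node; return False if the id breaks the increasing order."""
        if self.ids and node_id <= self.ids[-1]:
            # Not a well-behaved file: the caller would have to fall back
            # to a general-purpose store (for example SQLite).
            return False
        self.ids.append(node_id)
        self.lons.append(lon)
        self.lats.append(lat)
        return True

    def get(self, node_id):
        """Binary search over the sorted ids; returns (lon, lat) or None."""
        i = bisect_left(self.ids, node_id)
        if i < len(self.ids) and self.ids[i] == node_id:
            return (self.lons[i], self.lats[i])
        return None
```

An out-of-order id makes append() return False, which corresponds to the 
cases where such a structure could not replace the general database.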

Note: in your example above, you don't need to specify -gt and --config 
OGR_SQLITE_SYNCHRONOUS OFF when the output format is not SQLite/SpatiaLite. 
The internal use of SQLite in the OSM driver already sets the corresponding 
parameters to the values that give the best performance.
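
For reference, the command from the example with those two options dropped 
could be driven from a script roughly like this. The file and layer names are 
the ones from the quoted command; the Python names OGR2OGR_ARGS and 
run_conversion are only illustrative, and ogr2ogr is assumed to be on PATH.

```python
import subprocess

# Same conversion as in the quoted command, minus -gt and
# --config OGR_SQLITE_SYNCHRONOUS OFF, which are not needed for
# shapefile output.
OGR2OGR_ARGS = [
    "ogr2ogr",
    "-f", "ESRI Shapefile",
    "test",
    "germany.osm.pbf",
    "multipolygons",
    "-progress",
    "-where", "natural='forest'",
]


def run_conversion():
    """Run ogr2ogr (assumed to be on PATH); raises if it exits with an error."""
    subprocess.run(OGR2OGR_ARGS, check=True)
```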

> Same error with Spatialite output.
> -Jukka Rahkonen-
