[gdal-dev] Reading the same OSM data in multiple threads

Damian Dixon damian.dixon at gmail.com
Thu Jun 2 03:58:18 PDT 2016


Thanks for the replies.

We are doing a lot of processing of the data and need to retain that data
in a vector format.

For now we are disabling multi-threading for OSM data and significantly
increasing the amount of memory that may be allocated.

We will probably go with converting OSM to SpatiaLite when the data is
over a certain size.
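
A minimal sketch of how that size check and conversion could look, assuming
the ogr2ogr utility is on the PATH and GDAL was built with SpatiaLite
support. The 500 MB threshold, the function names and the file names are
hypothetical placeholders, not values from this thread.

```python
import os
import subprocess

# Hypothetical threshold: the message only says "over a certain size".
SIZE_THRESHOLD_BYTES = 500 * 1024 * 1024


def needs_conversion(osm_path, threshold=SIZE_THRESHOLD_BYTES):
    """Return True when the OSM file is larger than the threshold."""
    return os.path.getsize(osm_path) > threshold


def build_ogr2ogr_command(osm_path, sqlite_path):
    """Build an ogr2ogr command line that writes a SpatiaLite database.

    -f SQLite with the dataset creation option SPATIALITE=YES asks the
    OGR SQLite driver to create a SpatiaLite-flavoured database.
    """
    return [
        "ogr2ogr",
        "-f", "SQLite",
        "-dsco", "SPATIALITE=YES",
        sqlite_path,  # destination comes first on the ogr2ogr command line
        osm_path,     # source OSM file (.osm or .osm.pbf)
    ]


def convert_if_large(osm_path, sqlite_path):
    """Convert once up front; return the path the worker threads should open."""
    if not needs_conversion(osm_path):
        return osm_path
    subprocess.run(build_ogr2ogr_command(osm_path, sqlite_path), check=True)
    return sqlite_path
```

Each worker thread would then open the returned path as its own dataset and
restrict its reads to its tile with a spatial filter, so that the converted
database's spatial index is used instead of a full scan of the OSM file.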

Thanks
Damian


On Wed, Jun 1, 2016 at 6:25 PM Even Rouault <even.rouault at spatialys.com>
wrote:

> Damian,
>
> >
> > I'm trying to speed up processing of OSM data by opening an OSM file into
> > multiple datasets in multiple threads. One dataset per thread. Each
> > thread is processing a separate section of data, basically tiling the
> > data.
> >
> > I've however run into a scaling issue with the amount of memory allocated
> > per dataset.
> >
> > The Open in the OSM driver seems to allocate a lot of memory for buffers
> > for processing regardless of the size of the data loaded.
> >
> > So I have a couple of questions:
> >
> > 1. Is there a way of reducing the memory load when reading OSM in
> > multiple threads?
>
> You may play with the OSM_MAX_TMPFILE_SIZE config option that defaults to
> 100 (MB) / dataset.
> If you are brave enough, you can edit
> ogr/ogrsf_frmts/osm/ogrosmdatasource.cpp and reduce the values of the
> #define MAX_DELAYED_FEATURES, MAX_ACCUMULATED_NODES and
> HASHED_INDEXES_ARRAY_SIZE (and possibly disabling
> ENABLE_NODE_LOOKUP_BY_HASHING in ogr_osm.h)
>
> >
> > 2. Could I convert the OSM data into a different format that can be read
> > efficiently from multiple threads? If so, what would that format be?
> > My thought for (2) would be to load the data into a database and read
> > from the database using OGR. If this is the correct way forward, which
> > database would be recommended (PostGIS, SpatiaLite, ...)?
>
> Reading the same OSM file from multiple threads is indeed probably an
> inefficient approach: OSM files don't have spatial indices, so you'll end
> up reading the whole file completely for each tile. So prior conversion
> would probably be better for later scaling. SpatiaLite/GPKG are probably
> good choices.
>
> Even
>
> --
> Spatialys - Geospatial professional services
> http://www.spatialys.com
>