[gdal-dev] Dissolve large amount of geometries

Andreas Oxenstierna ao at t-kartor.se
Mon Jul 16 02:30:35 PDT 2018


ST_Union in PostGIS should scale better than in SQLite.
ST_Dump gives you single-part geometries.
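
A minimal sketch of how the commands quoted below might look against PostGIS, written as a small Python helper that only builds the ogr2ogr argument lists. The connection string "PG:dbname=gis", the geometry column name "geom", and the helper name build_commands are assumptions for illustration, not something from this thread, and the sketch assumes the randField column has already been added and filled as in the quoted steps.

```python
import shlex

# Assumption: a PostGIS database reachable through this OGR connection string.
PG_CONN = "PG:dbname=gis"

# ST_Union dissolves per randField value; ST_Dump splits each dissolved
# result back into single-part polygons. "geom" is an assumed column name.
DISSOLVE_SQL = (
    "SELECT randField, (ST_Dump(ST_Union(geom))).geom AS geom "
    "FROM fishnet GROUP BY randField"
)


def build_commands(src="Fishnet.shp", dst="taskmap.shp", conn=PG_CONN):
    """Return the two ogr2ogr invocations (load, then dissolve and export)."""
    load = [
        "ogr2ogr", "-f", "PostgreSQL", conn, src,
        "-nln", "fishnet", "-nlt", "POLYGON",
        "-lco", "GEOMETRY_NAME=geom",
    ]
    export = [
        "ogr2ogr", "-f", "ESRI Shapefile", "-overwrite", dst, conn,
        "-sql", DISSOLVE_SQL, "-nlt", "POLYGON",
    ]
    return [load, export]


if __name__ == "__main__":
    # Print the commands instead of running them, so they can be reviewed first.
    for cmd in build_commands():
        print(shlex.join(cmd))
```

Printing rather than executing keeps the sketch free of side effects; each list can be passed to subprocess.run once the connection details are adjusted.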

Best Regards

Andreas Oxenstierna



> On 16 July 2018 at 10:53, Paul Meems <bontepaarden at gmail.com> wrote:
> 
> Thanks, Jon for your suggestion of GeoPandas.
> Unfortunately, I'm not allowed to use new external dependencies.
> 
> I tried doing all the steps in a single SQLite file instead of using several intermediate shapefiles. I had some good results, so I created a script that dissolves an increasingly large number of shapes.
> Later I realized the performance increase came from forgetting '-explodecollections' in the new script. This makes a huge difference. For now, I'll keep the multipart polygons.
> 
> I converted these commands to a C# unit test:
> // Convert fishnet shapefile to SQLite:
> ogr2ogr -f SQLite taskmap.sqlite "Fishnet.shp" -nln fishnet -nlt POLYGON -dsco SPATIALITE=YES -lco SPATIAL_INDEX=NO -gt unlimited --config OGR_SQLITE_CACHE 4096 --config OGR_SQLITE_SYNCHRONOUS OFF 
> // Add field:
> ogrinfo taskmap.sqlite -sql "ALTER TABLE fishnet ADD COLUMN randField real"
> // Fill random values:
> ogrinfo taskmap.sqlite -sql "UPDATE fishnet SET randField = ABS(RANDOM() % 10)"
> // Create index:
> ogrinfo taskmap.sqlite -sql "CREATE INDEX randfield_idx ON fishnet (randField)"
> // Combined dissolve and export:
> ogr2ogr -f "ESRI Shapefile" -overwrite taskmap.shp taskmap.sqlite -sql "SELECT ST_Union(geometry) as geom, randField FROM fishnet GROUP BY randField" -gt unlimited --config OGR_SQLITE_CACHE 4096 --config OGR_SQLITE_SYNCHRONOUS OFF
> 
> Some timings:
> 1,677 shapes --> 0.3s
> 4,810 shapes --> 1.8s
> 18,415 shapes --> 21.4s
> 72,288 shapes --> 5m 54s
> 285,927 shapes --> 25m
> 1,139,424 shapes --> 6h 47m
> 4,557,696 shapes --> still running after 34h
> 
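
The timings quoted above can be turned into a rough growth estimate. The sketch below assumes the run time follows a power law, time ~ n**k, which is only an approximation, and uses just the six completed runs. Under that assumption the end-to-end exponent comes out at roughly 1.7, with individual steps ranging from about linear to about quadratic, and the extrapolation for the 4,557,696-shape run is on the order of days. The function names are illustrative.

```python
import math

# (shape count, seconds) for the completed runs quoted above.
TIMINGS = [
    (1_677, 0.3),
    (4_810, 1.8),
    (18_415, 21.4),
    (72_288, 5 * 60 + 54),
    (285_927, 25 * 60),
    (1_139_424, 6 * 3600 + 47 * 60),
]


def exponent(a, b):
    """Return k such that time grows as n**k between two (n, seconds) points."""
    (n1, t1), (n2, t2) = a, b
    return math.log(t2 / t1) / math.log(n2 / n1)


def step_exponents(timings):
    """Growth exponent between each pair of consecutive measurements."""
    return [exponent(a, b) for a, b in zip(timings, timings[1:])]


def extrapolate_seconds(timings, n_target):
    """Extend the first-to-last power-law trend to n_target shapes."""
    k = exponent(timings[0], timings[-1])
    n_last, t_last = timings[-1]
    return t_last * (n_target / n_last) ** k


if __name__ == "__main__":
    print("step exponents:", [round(k, 2) for k in step_exponents(TIMINGS)])
    print("end-to-end exponent:", round(exponent(TIMINGS[0], TIMINGS[-1]), 2))
    hours = extrapolate_seconds(TIMINGS, 4_557_696) / 3600
    print("extrapolated hours for 4,557,696 shapes:", round(hours))
```
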
> 4 million shapes is the amount my application needs to handle, but running for days is not an option.
> 
> I noticed my script is using only a fraction of my resources: 30% RAM (of 12GB), 22-28% CPU (on 8 cores).
> How can I let GDAL use more resources? Would that speed up the process?
> 
> I also read about CascadedUnion in GEOS. Can I use it with GDAL/OGR? If so, how?
> And would it help to enable the GPU? If so, do I need a special build? I'm currently using the 64-bit Windows build from gisinternals.com.
> 
> Thanks again for any pointers and/or suggestions.
> 
> Paul
> _______________________________________________
> gdal-dev mailing list
> gdal-dev at lists.osgeo.org
> https://lists.osgeo.org/mailman/listinfo/gdal-dev