[gdal-dev] Does writing GeoJSON need to be so slow?

Even Rouault even.rouault at spatialys.com
Tue Dec 3 10:21:46 PST 2024


Jukka,

Well, up to now we have used the same trick as a famous vendor did with 
their flagship text processing editor for Mac decades ago: add an explicit 
sleep() to make the process slower, to discourage users from creating 
GeoJSON files that are too large and therefore difficult to read.

More seriously, there are some modest enhancements for GML and GeoJSON in 
https://github.com/OSGeo/gdal/pull/11428

With them, I get 1m56s for the whole-file GeoJSON conversion (2m20s before) 
and 1m36s for GML (1m45s before).
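
For a sense of scale, the relative gain behind those timings can be worked out with a few lines of Python. This is only arithmetic on the four numbers quoted above; the helper names to_seconds() and reduction_percent() are made up for illustration.

```python
def to_seconds(minutes, seconds):
    """Convert a 'XmYYs' style timing to plain seconds."""
    return minutes * 60 + seconds


def reduction_percent(before, after):
    """Percentage of wall-clock time saved, relative to the old timing."""
    return 100.0 * (before - after) / before


# Timings quoted in this message (Linux, whole-file conversion).
geojson = reduction_percent(to_seconds(2, 20), to_seconds(1, 56))
gml = reduction_percent(to_seconds(1, 45), to_seconds(1, 36))

print(f"GeoJSON: {geojson:.1f}% less time")  # 24 s saved out of 140 s
print(f"GML: {gml:.1f}% less time")          # 9 s saved out of 105 s
```

So roughly a sixth of the GeoJSON time and a bit under a tenth of the GML time is saved, which is why "modest" is the right word.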

I found on my Linux system that MIF export was the fastest of the four text 
formats; I am not sure why that isn't the case on Windows.

Why is ExportGeoJSON so fast? It is completely hand-written, whereas the 
OGR GeoJSON driver constructs a json_object* hierarchical 
representation of each feature before serializing it to a string. On top 
of that, the OGR GeoJSON driver implements "smart" rounding/truncation 
logic, and possibly (I didn't check) the sqlite3_mprintf() 
routine is faster than the standard library printf().
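
To illustrate the first point, here is a minimal Python sketch of the two strategies: building an in-memory tree per feature and then serializing it, versus writing the text directly. It is not the code of either the OGR driver or SpatiaLite; the function names feature_via_tree() and feature_direct() and the sample data are hypothetical, and Python's json module merely stands in for the json_object* library.

```python
import io
import json


def feature_via_tree(fid, props, ring):
    """Build a nested dict/list tree for the feature, then serialize it.

    Similar in spirit to constructing a hierarchical object per feature:
    every coordinate pair becomes its own small object before output.
    """
    tree = {
        "type": "Feature",
        "id": fid,
        "properties": props,
        "geometry": {
            "type": "Polygon",
            "coordinates": [[list(pt) for pt in ring]],
        },
    }
    return json.dumps(tree, separators=(",", ":"))


def feature_direct(fid, props, ring, out):
    """Write the same feature straight to a text buffer.

    No intermediate tree is built for the geometry; coordinates are
    formatted and appended as they are iterated.
    """
    out.write('{"type":"Feature","id":%d,"properties":' % fid)
    out.write(json.dumps(props, separators=(",", ":")))
    out.write(',"geometry":{"type":"Polygon","coordinates":[[')
    out.write(",".join("[%r,%r]" % (x, y) for x, y in ring))
    out.write("]]}}")


# Hypothetical sample feature: one small closed ring.
ring = [(25.0, 62.0), (25.1, 62.0), (25.1, 62.1), (25.0, 62.0)]
buf = io.StringIO()
feature_direct(1, {"name": "Lake"}, ring, buf)

# Both strategies describe the same feature once parsed back.
same = json.loads(buf.getvalue()) == json.loads(
    feature_via_tree(1, {"name": "Lake"}, ring)
)
print(same)
```

The direct variant trades flexibility for speed: it avoids allocating an object per coordinate, but any rounding or escaping logic has to be handled by hand.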

Even

On 28/11/2024 at 14:43, Rahkonen Jukka via gdal-dev wrote:
>
> Hi,
>
> I was comparing some alternative scenarios for data exports, and I was 
> a bit surprised when I noticed that GeoJSON output from ogr2ogr is 
> really slow.
>
> I used these lake polygons as test data 
> https://wwwd3.ymparisto.fi/d3/gis_data/spesific/ranta10jarvet.zip and 
> I tested on Windows with GDAL 3.11.0dev-181b6b9991, released 2024/11/21.
>
> I was thinking that maybe writing JSON is slow just because it is a 
> text-based format, so I also ran tests with other text formats (GML, 
> MapInfo MIF, and CSV). My commands and timings:
>
> ogr2ogr -f geojson lakes.json jarvi10.shp --config cpl_debug on 
> --config cpl_timestamp on
>
> 220 sec - 1000 features/sec
>
> ogr2ogr -f "mapinfo file" lakes.mif jarvi10.shp --config cpl_debug on 
> --config cpl_timestamp on
>
> 110 sec - 2000 features/sec
>
> ogr2ogr -f gml lakes.gml jarvi10.shp --config cpl_debug on --config 
> cpl_timestamp on
>
> 92 sec - 2300 features/sec
>
> ogr2ogr -f csv lakes.csv jarvi10.shp -lco geometry=as_wkt --config 
> cpl_debug on --config cpl_timestamp on
>
> 77 sec - 2800 features/sec
>
> Then I wondered whether I knew any other tools for exporting GeoJSON, and 
> SpatiaLite came to mind. ExportGeoJSON 
> https://www.gaia-gis.it/gaia-sins/spatialite-sql-5.1.0.html from 
> GeoPackage into a GeoJSON file was 4 times faster than ogr2ogr.
>
> select 
> exportgeojson('vgpkg_jarvi10','geom','c:\data\jarvet\fromspatialite.json');
>
> 54 sec - 4000 features/sec
>
> For calibrating the speedometer, I also converted the data from shapefile 
> into GeoPackage
>
> ogr2ogr -f gpkg lakes.gpkg jarvi10.shp --config cpl_debug on --config 
> cpl_timestamp on
>
> 12 sec - 18000 features/sec
>
> I also made a couple of tests with geojsonseq output but I did not 
> notice much difference. Does writing GeoJSON require some tricks that 
> other formats do not require, or why is it so slow?
>
> -Jukka Rahkonen-

-- 
http://www.spatialys.com
My software is free, but my time generally not.
Butcher of all kinds of standards, open or closed formats. At the end, this is just about bytes.