<!DOCTYPE html>
<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
</head>
<body>
<p>Jukka,</p>
<p>Well, until now we have used the same trick that a famous vendor
used in its flagship word processor for Mac decades ago: add
explicit sleep() calls to slow the process down, to discourage
users from creating GeoJSON files that are too large to read
comfortably.</p>
<p>More seriously, there are some modest enhancements for GML and GeoJSON in
<a class="moz-txt-link-freetext" href="https://github.com/OSGeo/gdal/pull/11428">https://github.com/OSGeo/gdal/pull/11428</a>.</p>
<p>With them, whole-file conversion takes 1m56s for GeoJSON (2m20s
before) and 1m36s for GML (1m45s before).<br>
</p>
<p>On my Linux system, MIF export was the fastest of the four text
formats. I am not sure why that isn't the case on Windows.<br>
</p>
<p>Why is ExportGeoJSON so fast? Probably for a combination of
reasons: it is completely hand-written, whereas the OGR GeoJSON
driver constructs a json_object* hierarchical representation of
each feature before serializing it to a string; the OGR GeoJSON
driver implements "smart" rounding/truncation logic; and possibly
(I didn't check) the sqlite3_mprintf() routine is faster than the
standard library printf().<br>
</p>
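<p>To illustrate the first point only, here is a minimal sketch in
Python. It is not taken from GDAL or SpatiaLite: the function names
feature_via_tree() and feature_direct() and the sample feature are
made up for the example. One function builds an intermediate object
tree and then serializes it, the other formats the same feature
straight into a string.</p>
<pre>
```python
import json


def feature_via_tree(fid, name, ring):
    """Build a nested dict/list tree for the feature, then serialize it.

    This mirrors the general idea of constructing an intermediate
    object hierarchy before writing; it is not the OGR driver's code.
    """
    tree = {
        "type": "Feature",
        "id": fid,
        "properties": {"name": name},
        "geometry": {
            "type": "Polygon",
            "coordinates": [[[x, y] for x, y in ring]],
        },
    }
    return json.dumps(tree)


def feature_direct(fid, name, ring):
    """Format the same feature directly into a string, with no tree."""
    # repr() of a finite float or an int is valid JSON number syntax.
    coords = ",".join("[%r,%r]" % (x, y) for x, y in ring)
    return (
        '{"type":"Feature","id":%d,"properties":{"name":%s},'
        '"geometry":{"type":"Polygon","coordinates":[[%s]]}}'
        % (fid, json.dumps(name), coords)
    )


if __name__ == "__main__":
    # Hypothetical sample ring, closed as GeoJSON polygons require.
    ring = [(25.0, 60.0), (25.5, 60.0), (25.5, 60.5), (25.0, 60.0)]
    print(feature_via_tree(1, "Lake", ring))
    print(feature_direct(1, "Lake", ring))
```
</pre>
<p>Both functions produce equivalent JSON. The sketch only shows the
structural difference (allocating a tree per feature versus writing
text directly); timings in Python say nothing about the relative cost
in the C/C++ implementations.</p>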
<p>Even<br>
</p>
<div class="moz-cite-prefix">On 28/11/2024 at 14:43, Rahkonen Jukka
via gdal-dev wrote:<br>
</div>
<blockquote type="cite"
cite="mid:DB9PR09MB688781DA1B7635789F11A853FD292@DB9PR09MB6887.eurprd09.prod.outlook.com">
<div class="WordSection1">
<p class="MsoNormal">Hi,<o:p></o:p></p>
<p class="MsoNormal"><o:p> </o:p></p>
<p class="MsoNormal"><span lang="EN-US">I was comparing some
alternative scenarios for data exports, and I was a bit
surprised when I noticed that GeoJSON output from ogr2ogr is
really slow.<o:p></o:p></span></p>
<p class="MsoNormal"><span lang="EN-US">I used these lake
polygons as test data <a
href="https://wwwd3.ymparisto.fi/d3/gis_data/spesific/ranta10jarvet.zip"
moz-do-not-send="true" class="moz-txt-link-freetext">
https://wwwd3.ymparisto.fi/d3/gis_data/spesific/ranta10jarvet.zip</a>
and I tested on Windows with GDAL 3.11.0dev-181b6b9991,
released 2024/11/21.<o:p></o:p></span></p>
<p class="MsoNormal"><span lang="EN-US"><o:p> </o:p></span></p>
<p class="MsoNormal"><span lang="EN-US">I thought that maybe
writing JSON is slow simply because it is a text-based
format, so I also tested other text formats (GML,
MapInfo MIF, and CSV). My commands and timings:<o:p></o:p></span></p>
<p class="MsoNormal"><span lang="EN-US"><o:p> </o:p></span></p>
<p class="MsoNormal"><span lang="EN-US">ogr2ogr -f geojson
lakes.json jarvi10.shp --config cpl_debug on --config
cpl_timestamp on<o:p></o:p></span></p>
<p class="MsoNormal"><span lang="EN-US">220 sec - 1000
features/sec<o:p></o:p></span></p>
<p class="MsoNormal"><span lang="EN-US">ogr2ogr -f "mapinfo
file" lakes.mif jarvi10.shp --config cpl_debug on --config
cpl_timestamp on<o:p></o:p></span></p>
<p class="MsoNormal"><span lang="EN-US">110 sec – 2000
features/sec<o:p></o:p></span></p>
<p class="MsoNormal"><span lang="EN-US">ogr2ogr -f gml lakes.gml
jarvi10.shp --config cpl_debug on --config cpl_timestamp on<o:p></o:p></span></p>
<p class="MsoNormal"><span lang="EN-US">92 sec - 2300
features/sec<o:p></o:p></span></p>
<p class="MsoNormal"><span lang="EN-US">ogr2ogr -f csv lakes.csv
jarvi10.shp -lco geometry=as_wkt --config cpl_debug on
--config cpl_timestamp on<o:p></o:p></span></p>
<p class="MsoNormal"><span lang="EN-US">77 sec - 2800
features/sec<o:p></o:p></span></p>
<p class="MsoNormal"><span lang="EN-US"><o:p> </o:p></span></p>
<p class="MsoNormal"><span lang="EN-US">Then I wondered whether I
knew any other tools for exporting GeoJSON, and SpatiaLite
came to mind. ExportGeoJSON
<a
href="https://www.gaia-gis.it/gaia-sins/spatialite-sql-5.1.0.html"
moz-do-not-send="true" class="moz-txt-link-freetext">https://www.gaia-gis.it/gaia-sins/spatialite-sql-5.1.0.html</a>
from GeoPackage into a GeoJSON file was 4 times faster than
ogr2ogr.<o:p></o:p></span></p>
<p class="MsoNormal"><span lang="EN-US">select
exportgeojson('vgpkg_jarvi10','geom','c:\data\jarvet\fromspatialite.json');<o:p></o:p></span></p>
<p class="MsoNormal"><span lang="EN-US">54 sec - 4000
features/sec<o:p></o:p></span></p>
<p class="MsoNormal"><span lang="EN-US"><o:p> </o:p></span></p>
<p class="MsoNormal"><span lang="EN-US">To calibrate the
speedometer, I also converted the data from shapefile into
GeoPackage:<o:p></o:p></span></p>
<p class="MsoNormal"><span lang="EN-US">ogr2ogr -f gpkg
lakes.gpkg jarvi10.shp --config cpl_debug on --config
cpl_timestamp on<o:p></o:p></span></p>
<p class="MsoNormal"><span lang="EN-US">12 sec - 18000
features/sec<o:p></o:p></span></p>
<p class="MsoNormal"><span lang="EN-US"><o:p> </o:p></span></p>
<p class="MsoNormal"><span lang="EN-US">I also made a couple of
tests with geojsonseq output but did not notice much
difference. Does writing GeoJSON require some tricks that
other formats do not, or why is it so slow?<o:p></o:p></span></p>
<p class="MsoNormal"><span lang="EN-US"><o:p> </o:p></span></p>
<p class="MsoNormal"><span lang="EN-US">-Jukka Rahkonen-<o:p></o:p></span></p>
</div>
</blockquote>
<pre class="moz-signature" cols="72">--
<a class="moz-txt-link-freetext" href="http://www.spatialys.com">http://www.spatialys.com</a>
My software is free, but my time generally not.
Butcher of all kinds of standards, open or closed formats. At the end, this is just about bytes.</pre>
</body>
</html>