<html xmlns:v="urn:schemas-microsoft-com:vml" xmlns:o="urn:schemas-microsoft-com:office:office" xmlns:w="urn:schemas-microsoft-com:office:word" xmlns:m="http://schemas.microsoft.com/office/2004/12/omml" xmlns="http://www.w3.org/TR/REC-html40">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1">
<meta name="Generator" content="Microsoft Word 15 (filtered medium)">
<style><!--
/* Font Definitions */
@font-face
{font-family:"Cambria Math";
panose-1:2 4 5 3 5 4 6 3 2 4;}
@font-face
{font-family:Calibri;
panose-1:2 15 5 2 2 2 4 3 2 4;}
@font-face
{font-family:Consolas;
panose-1:2 11 6 9 2 2 4 3 2 4;}
/* Style Definitions */
p.MsoNormal, li.MsoNormal, div.MsoNormal
{margin:0cm;
font-size:11.0pt;
font-family:"Calibri",sans-serif;
mso-fareast-language:EN-US;}
a:link, span.MsoHyperlink
{mso-style-priority:99;
color:#0563C1;
text-decoration:underline;}
pre
{mso-style-priority:99;
mso-style-link:"HTML-esimuotoiltu Char";
margin:0cm;
font-size:10.0pt;
font-family:"Courier New";
mso-fareast-language:FI;}
span.HTML-esimuotoiltuChar
{mso-style-name:"HTML-esimuotoiltu Char";
mso-style-priority:99;
mso-style-link:HTML-esimuotoiltu;
font-family:Consolas;
mso-fareast-language:EN-US;}
span.Shkpostityyli22
{mso-style-type:personal-reply;
font-family:"Calibri",sans-serif;
color:windowtext;}
.MsoChpDefault
{mso-style-type:export-only;
font-size:10.0pt;
mso-ligatures:none;}
@page WordSection1
{size:612.0pt 792.0pt;
margin:70.85pt 2.0cm 70.85pt 2.0cm;}
div.WordSection1
{page:WordSection1;}
--></style><!--[if gte mso 9]><xml>
<o:shapedefaults v:ext="edit" spidmax="1026" />
</xml><![endif]--><!--[if gte mso 9]><xml>
<o:shapelayout v:ext="edit">
<o:idmap v:ext="edit" data="1" />
</o:shapelayout></xml><![endif]-->
</head>
<body lang="FI" link="#0563C1" vlink="purple" style="word-wrap:break-word">
<div class="WordSection1">
<p class="MsoNormal">Hi<o:p></o:p></p>
<p class="MsoNormal"><o:p> </o:p></p>
<p class="MsoNormal"><span lang="EN-US">A 20% speed-up in GeoJSON writing is notable. I do not use GeoJSON much myself, but other people in the community may be happy. Thanks for having a look.<o:p></o:p></span></p>
<p class="MsoNormal"><span lang="EN-US"><o:p> </o:p></span></p>
<p class="MsoNormal"><span lang="EN-US">-Jukka-<o:p></o:p></span></p>
<p class="MsoNormal"><span lang="EN-US"><o:p> </o:p></span></p>
<div>
<div style="border:none;border-top:solid #E1E1E1 1.0pt;padding:3.0pt 0cm 0cm 0cm">
<p class="MsoNormal"><b><span style="mso-fareast-language:FI">From:</span></b><span style="mso-fareast-language:FI"> Even Rouault &lt;even.rouault@spatialys.com&gt;
<br>
<b>Sent:</b> Tuesday, 3 December 2024 20:22<br>
<b>To:</b> Rahkonen Jukka &lt;jukka.rahkonen@maanmittauslaitos.fi&gt;; 'gdal-dev@lists.osgeo.org' (gdal-dev@lists.osgeo.org) &lt;gdal-dev@lists.osgeo.org&gt;<br>
<b>Subject:</b> Re: [gdal-dev] Does writing GeoJSON need to be so slow?<o:p></o:p></span></p>
</div>
</div>
<p class="MsoNormal"><o:p> </o:p></p>
<p>Jukka,<span style="mso-fareast-language:FI"><o:p></o:p></span></p>
<p>Well, up to now we have used the same trick that a famous vendor used decades ago in its flagship word processor for Mac: adding an explicit sleep() to slow the process down and discourage users from creating overly large GeoJSON files, which are difficult
to read.<o:p></o:p></p>
<p>More seriously, some modest enhancements for GML and GeoJSON in <a href="https://github.com/OSGeo/gdal/pull/11428">
https://github.com/OSGeo/gdal/pull/11428</a><o:p></o:p></p>
<p>With them, I get 1m56s for whole file GeoJSON conversion (2m20s before) and 1m36s for GML (1m45s before).<o:p></o:p></p>
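<p>As a quick check of what those timings mean, here is a minimal Python sketch of the arithmetic. The helper names are made up for illustration; only the four timings come from the measurements above. It works out to roughly 17% less time (about 21% more throughput) for GeoJSON, and roughly 9% for GML on either measure.</p>
<pre>
```python
def to_seconds(minutes, seconds):
    """Convert a minutes/seconds timing to plain seconds."""
    return 60 * minutes + seconds


def compare(before_s, after_s):
    """Return (percent less wall-clock time, percent more throughput)."""
    time_saved_pct = 100.0 * (before_s - after_s) / before_s
    speed_gain_pct = 100.0 * (before_s / after_s - 1.0)
    return time_saved_pct, speed_gain_pct


# Timings quoted above: GeoJSON 2m20s to 1m56s, GML 1m45s to 1m36s
geojson = compare(to_seconds(2, 20), to_seconds(1, 56))
gml = compare(to_seconds(1, 45), to_seconds(1, 36))

print("GeoJSON: %.1f%% less time, %.1f%% more throughput" % geojson)
print("GML:     %.1f%% less time, %.1f%% more throughput" % gml)
```
</pre>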
<p>I found on my Linux system that MIF export was the fastest of the 4 text formats; I am not sure why that isn't the case on Windows.<o:p></o:p></p>
<p>Why is ExportGeoJSON so fast? Several reasons: it is completely hand-written, whereas the OGR GeoJSON driver constructs a json_object* hierarchical representation of each feature before serializing it to a string; the OGR GeoJSON driver also implements "smart"
rounding/truncation logic; and possibly (I didn't check) the sqlite3_mprintf() routine is faster than the standard library printf().<o:p></o:p></p>
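<p>To illustrate the structural difference only, here is a small Python sketch. It is neither GDAL nor SpatiaLite code: the function names, the fixed <code>ndigits</code> rounding and the sample ring are all assumptions for the example. The first function builds a nested object for the feature and then serializes it; the second writes the same text directly. Both produce identical output here, and the sketch makes no claim about relative speed in Python.</p>
<pre>
```python
import json


def feature_via_tree(fid, ring, ndigits=7):
    """Build a nested dict for the feature first, then serialize it."""
    feature = {
        "type": "Feature",
        "id": fid,
        "properties": {},
        "geometry": {
            "type": "Polygon",
            "coordinates": [
                [[round(x, ndigits), round(y, ndigits)] for x, y in ring]
            ],
        },
    }
    return json.dumps(feature, separators=(",", ":"))


def feature_direct(fid, ring, ndigits=7):
    """Write the same feature straight into a string, with no intermediate tree."""
    coords = ",".join(
        "[%s,%s]" % (repr(round(x, ndigits)), repr(round(y, ndigits)))
        for x, y in ring
    )
    return (
        '{"type":"Feature","id":%d,"properties":{},'
        '"geometry":{"type":"Polygon","coordinates":[[%s]]}}' % (fid, coords)
    )


# Hypothetical closed ring, used only to show that both paths agree
ring = [(25.0, 62.0), (25.5, 62.0), (25.5, 62.25), (25.0, 62.0)]
print(feature_via_tree(1, ring) == feature_direct(1, ring))
```
</pre>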
<p>Even<o:p></o:p></p>
<div>
<p class="MsoNormal">On 28/11/2024 at 14:43, Rahkonen Jukka via gdal-dev wrote:<o:p></o:p></p>
</div>
<blockquote style="margin-top:5.0pt;margin-bottom:5.0pt">
<p class="MsoNormal">Hi,<o:p></o:p></p>
<p class="MsoNormal"> <o:p></o:p></p>
<p class="MsoNormal"><span lang="EN-US">I was comparing some alternative scenarios for data exports, and I was a bit surprised when I noticed that GeoJSON output from ogr2ogr is really slow.</span><o:p></o:p></p>
<p class="MsoNormal"><span lang="EN-US">I used these lake polygons as test data <a href="https://wwwd3.ymparisto.fi/d3/gis_data/spesific/ranta10jarvet.zip">
https://wwwd3.ymparisto.fi/d3/gis_data/spesific/ranta10jarvet.zip</a> and I tested on Windows with GDAL 3.11.0dev-181b6b9991, released 2024/11/21.</span><o:p></o:p></p>
<p class="MsoNormal"><span lang="EN-US"> </span><o:p></o:p></p>
<p class="MsoNormal"><span lang="EN-US">I thought that writing JSON might be slow simply because it is a text-based format, so I also ran tests with other text formats (GML, MapInfo MIF, and CSV). My commands and timings:</span><o:p></o:p></p>
<p class="MsoNormal"><span lang="EN-US"> </span><o:p></o:p></p>
<p class="MsoNormal"><span lang="EN-US">ogr2ogr -f geojson lakes.json jarvi10.shp --config cpl_debug on --config cpl_timestamp on</span><o:p></o:p></p>
<p class="MsoNormal"><span lang="EN-US">220 sec - 1000 features/sec</span><o:p></o:p></p>
<p class="MsoNormal"><span lang="EN-US">ogr2ogr -f "mapinfo file" lakes.mif jarvi10.shp --config cpl_debug on --config cpl_timestamp on</span><o:p></o:p></p>
<p class="MsoNormal"><span lang="EN-US">110 sec - 2000 features/sec</span><o:p></o:p></p>
<p class="MsoNormal"><span lang="EN-US">ogr2ogr -f gml lakes.gml jarvi10.shp --config cpl_debug on --config cpl_timestamp on</span><o:p></o:p></p>
<p class="MsoNormal"><span lang="EN-US">92 sec - 2300 features/sec</span><o:p></o:p></p>
<p class="MsoNormal"><span lang="EN-US">ogr2ogr -f csv lakes.csv jarvi10.shp -lco geometry=as_wkt --config cpl_debug on --config cpl_timestamp on</span><o:p></o:p></p>
<p class="MsoNormal"><span lang="EN-US">77 sec - 2800 features/sec</span><o:p></o:p></p>
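<p class="MsoNormal"><span lang="EN-US">Timings like these could be collected with a small wrapper script. The following is only a sketch: the function names are hypothetical, the feature count has to be supplied by the user, and the demo times a stand-in command so that it runs without GDAL installed.</span></p>
<pre>
```python
import subprocess
import sys
import time


def time_command(args):
    """Run a command to completion and return the elapsed wall-clock seconds."""
    start = time.perf_counter()
    subprocess.run(args, check=True)
    return time.perf_counter() - start


def features_per_second(feature_count, seconds):
    """Average throughput over the whole conversion."""
    return feature_count / seconds


def ogr2ogr_args(fmt, dst, src, extra=()):
    """Assemble an ogr2ogr command line shaped like the ones above."""
    return ["ogr2ogr", "-f", fmt, dst, src, *extra]


if __name__ == "__main__":
    # Stand-in command; replace with ogr2ogr_args(...) on a machine with GDAL
    elapsed = time_command([sys.executable, "-c", "pass"])
    print("elapsed: %.3f s" % elapsed)
    print(ogr2ogr_args("csv", "lakes.csv", "jarvi10.shp", ["-lco", "geometry=as_wkt"]))
```
</pre>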
<p class="MsoNormal"><span lang="EN-US"> </span><o:p></o:p></p>
<p class="MsoNormal"><span lang="EN-US">Then I wondered whether I knew of any other tools for exporting GeoJSON, and SpatiaLite came to mind. ExportGeoJSON
<a href="https://www.gaia-gis.it/gaia-sins/spatialite-sql-5.1.0.html">https://www.gaia-gis.it/gaia-sins/spatialite-sql-5.1.0.html</a> from GeoPackage into a GeoJSON file was 4 times faster than ogr2ogr.</span><o:p></o:p></p>
<p class="MsoNormal"><span lang="EN-US">select exportgeojson('vgpkg_jarvi10','geom','c:\data\jarvet\fromspatialite.json');</span><o:p></o:p></p>
<p class="MsoNormal"><span lang="EN-US">54 sec - 4000 features/sec</span><o:p></o:p></p>
<p class="MsoNormal"><span lang="EN-US"> </span><o:p></o:p></p>
<p class="MsoNormal"><span lang="EN-US">To calibrate the speedometer, I also converted the data from shapefile into GeoPackage:</span><o:p></o:p></p>
<p class="MsoNormal"><span lang="EN-US">ogr2ogr -f gpkg lakes.gpkg jarvi10.shp --config cpl_debug on --config cpl_timestamp on</span><o:p></o:p></p>
<p class="MsoNormal"><span lang="EN-US">12 sec - 18000 features/sec</span><o:p></o:p></p>
<p class="MsoNormal"><span lang="EN-US"> </span><o:p></o:p></p>
<p class="MsoNormal"><span lang="EN-US">I also ran a couple of tests with geojsonseq output but did not notice much difference. Does writing GeoJSON require some tricks that other formats do not, or why is it so slow?</span><o:p></o:p></p>
<p class="MsoNormal"><span lang="EN-US"> </span><o:p></o:p></p>
<p class="MsoNormal"><span lang="EN-US">-Jukka Rahkonen-</span><o:p></o:p></p>
<p class="MsoNormal"><span style="mso-fareast-language:FI"><br>
<br>
<o:p></o:p></span></p>
</blockquote>
<pre>-- <o:p></o:p></pre>
<pre><a href="http://www.spatialys.com/">http://www.spatialys.com</a><o:p></o:p></pre>
<pre>My software is free, but my time generally not.<o:p></o:p></pre>
<pre>Butcher of all kinds of standards, open or closed formats. At the end, this is just about bytes.<o:p></o:p></pre>
</div>
</body>
</html>