<!DOCTYPE html>
<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
</head>
<body>
<p>Hi,</p>
<p><a class="moz-txt-link-freetext" href="https://github.com/OSGeo/gdal/pull/13254">https://github.com/OSGeo/gdal/pull/13254</a> will dramatically
improve the performance for OGRVRTUnionLayer + OGR SQL with ORDER.
The reason for the current slowness is that OGR SQL with ORDER
does a full scan to collect the column value to be sorted and the
feature ID, then sort, and then use GetFeature() over the sorted
list to retrieve features by ids. In the case of a
OGRVRTUnionLayer before above PR, the implementation was
inefficient and would cause a linear scan from FID 0 to be done
for each queried feature. In the SQLite case, I believe that
SQLite must create some sort of temporary table with the content
of collected features, which doesn't require reading them back
from the source.<br>
</p>
<p>Even<br>
</p>
<div class="moz-cite-prefix">Le 20/10/2025 à 10:47, Andrey VI via
gdal-dev a écrit :<br>
</div>
<blockquote type="cite"
cite="mid:1760950050.767475941@f542.i.mail.ru">
<meta http-equiv="content-type" content="text/html; charset=UTF-8">
<div>Hi all.</div>
<div>
<div><br>
<span lang="en"><span><span>I have found a significant
performance drop when using OGR SQL with VRT as a
source. Test results with sample data (<a
href="https://drive.google.com/file/d/1pmwTACFtxjbNDpwuqQJXl0pbtGO6RqeJ/view?usp=sharing"
moz-do-not-send="true">link</a>, 6,8 MB).</span></span></span></div>
<div><span lang="en"><span><span>VRT as a source:</span></span></span></div>
<div>
<div> </div>
<div>$ time ogr2ogr -dialect sqlite -sql "SELECT geometry,
CAST(ABS(MAX) AS INT) AS depth FROM merged ORDER BY MAX
DESC" sample_vrt-sqlite.geojsons sample.vrt</div>
<div>real 0m3,960s</div>
<div>user 0m0,037s</div>
<div>sys 0m0,022s
<div>
<div>
<div>$ time ogr2ogr -sql "SELECT CAST(MAX * -1 AS
INTEGER) AS depth FROM merged ORDER BY MAX DESC"
sample_vrt-ogr.geojsons sample.vrt</div>
<div>real 3m4,108s</div>
<div>user 0m0,057s</div>
<div>sys 0m0,016s</div>
<div> </div>
</div>
</div>
</div>
Shapefile as a source:
<div>$ time ogr2ogr -dialect sqlite -sql "SELECT geometry,
CAST(ABS(MAX) AS INT) AS depth FROM sample ORDER BY MAX
DESC" sample_shp-sqlite.geojsons sample.shp</div>
<div>real 0m3,627s</div>
<div>user 0m0,036s</div>
<div>sys 0m0,025s</div>
<div>$ time ogr2ogr -sql "SELECT CAST(MAX * -1 AS INTEGER) AS
depth FROM sample ORDER BY MAX DESC" sample_shp-ogr.geojsons
sample.shp</div>
<div>real 0m3,731s</div>
<div>user 0m0,052s</div>
<div>sys 0m0,025s</div>
<div> </div>
<div><span lang="en"><span><span>Since my VRTs consist of tens
and hundreds of Shapefiles, OGR SQL becomes unusable
for the specific task given as an example above.</span></span></span></div>
<div><span lang="en"><span><span>So I wonder what could be the
reason for such a significant drop in performance in
this particular case? Or maybe I’m doing something
wrong?</span></span></span></div>
</div>
</div>
<div>--<br>
Andrey</div>
<br>
<fieldset class="moz-mime-attachment-header"></fieldset>
<pre class="moz-quote-pre" wrap="">_______________________________________________
gdal-dev mailing list
<a class="moz-txt-link-abbreviated" href="mailto:gdal-dev@lists.osgeo.org">gdal-dev@lists.osgeo.org</a>
<a class="moz-txt-link-freetext" href="https://lists.osgeo.org/mailman/listinfo/gdal-dev">https://lists.osgeo.org/mailman/listinfo/gdal-dev</a>
</pre>
</blockquote>
<pre class="moz-signature" cols="72">--
<a class="moz-txt-link-freetext" href="http://www.spatialys.com">http://www.spatialys.com</a>
My software is free, but my time generally not.</pre>
</body>
</html>