[gdal-dev] Large GPKG files in S3

Tim Harris trharris78 at gmail.com
Wed Sep 24 16:38:58 PDT 2025


Hi, trying to do some queries of fairly large GPKG files sitting in S3 in
the hopes that spatial indexes and S3 range requests make it fast and
efficient. But it seems that's not the case and I'm wondering if I'm doing
something wrong.

The examples below are just looking for a geometry at a point. The filename
is fake but the geoms.gpkg contains a layer "geoms" with a bunch of
polygons. It's about 1.7 GB in size.

Something like this runs relatively fast, it makes about 17 HTTP requests
and outputs the feature:

---
ogrinfo -ro -json -features /vsis3/bucket/geoms.gpkg geoms -spat -105 40
-105 40
---

But this, using -sql instead of -spat, sits forever without finishing. With
CPL_CURL_VERBOSE it sort of looks like it's just scanning through the file
a range request at a time:

---
ogrinfo -ro -json -features /vsis3/bucket/geoms.gpkg geoms -sql "select *
from geoms where ST_Intersects(geom, ST_GeomFromText('POINT(-105 40)',
4326)) = 1"
---

I also tried replacing the ST_GeomFromText with other things
like MakePoint(-105, 40, 4326) and ST_EnvIntersects(geom, -105, 40, -105,
40), just in case. It seems like -spat uses the spatial index efficiently
but -sql doesn't. I also tried using them both, but having the -sql there
seems to make it slow.

I also tried to add "explain query plan" to the sql statement:

---
ogrinfo -ro /vsis3/prod-tilerepo-va2/metadata/20250920T041443/geoms.gpkg
geoms -sql "EXPLAIN QUERY PLAN select * from geoms where
ST_Intersects(geom, ST_GeomFromText('POINT(-105 40)', 4326)) = 1"
---

That only sort of confirms the scan, and reports "SCAN geoms" in the
output. This page, https://sqlite.org/eqp.html, explains the "explain query
plan" output and I was hoping to see a "SEARCH geoms USING INDEX ___" of
some sort.

Is there something about the use of -sql that precludes it from using the
spatial index? Maybe we need to do an ogr2ogr with just -spat to narrow it
down, then apply a -sql to that result if we need additional filtering?

Thanks
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.osgeo.org/pipermail/gdal-dev/attachments/20250924/f22d61d8/attachment.htm>


More information about the gdal-dev mailing list