[gdal-dev] Geometry box columns and ADBC vs PARQUET

Michael Smith michael.smith.erdc at gmail.com
Sun Jul 27 14:29:34 PDT 2025


Is there a reason that the geometry bbox columns are exposed as fields by the ADBC driver but not by the PARQUET driver?

ADBC:
Geometry Column = geometry
geometry_bbox.xmin: Real(Float32) (0.0)
geometry_bbox.ymin: Real(Float32) (0.0)
geometry_bbox.xmax: Real(Float32) (0.0)
geometry_bbox.ymax: Real(Float32) (0.0)

PARQUET:
Geometry Column = geometry

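Adding an attribute filter on the bbox columns in addition to the spatial filter makes queries much faster (5.35 s wall time with PARQUET and a spatial filter only, versus under 1 s in total with ADBC and the extra bbox filter):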

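from osgeo import gdal, ogr

# aoi.clip_geometry is the area-of-interest geometry (exposes .wkb and .wkt)
gf = gdal.OpenEx("PARQUET:/vsis3/bucket/stac/mds/rasters/")
layer = gf.GetLayer()
layer.SetSpatialFilter(ogr.CreateGeometryFromWkb(aoi.clip_geometry.wkb))
%time feats = [feat for feat in layer]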
CPU times: user 1.74 s, sys: 3.44 s, total: 5.18 s 
Wall time: 5.35 s


gf = gdal.OpenEx("ADBC:", open_options=[
    'ADBC_DRIVER=libduckdb',
    'PRELUDE_STATEMENTS=LOAD SPATIAL',
    'PRELUDE_STATEMENTS=load httpfs',
    'PRELUDE_STATEMENTS=load aws',
    'PRELUDE_STATEMENTS=CREATE SECRET (TYPE S3,PROVIDER CREDENTIAL_CHAIN)',
])
%time layer = gf.ExecuteSQL(f"select * from read_parquet('s3://grid-dev-publiclidar/stac/mds/rasters/*') where st_intersects(geometry, ST_GeomFromText('{aoi.clip_geometry.wkt}')) and bbox.xmin between -69 and -64 and bbox.ymin between 17 and 19")
CPU times: user 898 ms, sys: 154 ms, total: 1.05 s 
Wall time: 730 ms
%time feats = [feat for feat in layer]
CPU times: user 192 ms, sys: 43.4 ms, total: 235 ms
Wall time: 237 ms
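
The bbox ranges in the query above are hard-coded. As a rough sketch, the same kind of prefilter could be derived from the AOI extent. The helper name bbox_overlap_predicate is hypothetical, and it assumes the bbox struct has xmin/ymin/xmax/ymax members as in the listing above. Note that it uses a general overlap test rather than the "between" ranges in my query, so it may select slightly different rows. st_intersects still does the exact test.

```python
def bbox_overlap_predicate(minx, miny, maxx, maxy, column="bbox"):
    """Build a SQL predicate selecting rows whose bbox overlaps the extent.

    Hypothetical helper, not part of GDAL or DuckDB. `column` is assumed to
    be a struct with xmin/ymin/xmax/ymax members.
    """
    # float() rejects anything that is not a number before it reaches the SQL.
    minx, miny, maxx, maxy = (float(v) for v in (minx, miny, maxx, maxy))
    # Two boxes overlap when each one starts before the other one ends,
    # on both axes.
    return (
        f"{column}.xmin <= {maxx!r} AND {column}.xmax >= {minx!r} "
        f"AND {column}.ymin <= {maxy!r} AND {column}.ymax >= {miny!r}"
    )


# Extent roughly matching the ranges used in the query above.
predicate = bbox_overlap_predicate(-69, 17, -64, 19)
print(predicate)
```

If aoi.clip_geometry is a shapely geometry (an assumption on my side), the extent could come from bbox_overlap_predicate(*aoi.clip_geometry.bounds), and the result would be appended to the where clause with "and".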


-- 

Michael Smith 
RSGIS Center – ERDC CRREL NH 
US Army Corps 







