[gdal-dev] Reading from (geo)parquet using mixed spatial and non-spatial filters

Michael Smith michael.smith.erdc at gmail.com
Fri Jan 23 06:14:12 PST 2026


When you access by ID, are you also specifying the spatial/attribute filters you used previously? Keeping those in, especially the spatial filter, will make it faster. 
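
A minimal sketch of what keeping both filters in place could look like with the GDAL Python bindings. The column name "ID" comes from the thread; the integer type of that column, the bounding box, and the helper names id_filter and read_by_id are assumptions made for illustration only.

```python
def id_filter(feature_id):
    """Build the attribute filter string, assuming an integer column named ID."""
    return f"ID = {int(feature_id)}"


def read_by_id(path, bbox, feature_id):
    """Return the attributes of features matching the ID inside bbox.

    bbox is (minx, miny, maxx, maxy) in the layer's coordinate system.
    """
    # Imported lazily so id_filter() can be used without GDAL installed.
    from osgeo import ogr

    ds = ogr.Open(path)
    if ds is None:
        raise RuntimeError(f"cannot open {path}")
    layer = ds.GetLayer(0)

    # Keep the spatial filter from the earlier query in place...
    layer.SetSpatialFilterRect(*bbox)
    # ...and add the ID lookup on top of it as an attribute filter.
    layer.SetAttributeFilter(id_filter(feature_id))

    # Copy the field values out while the dataset is still open.
    return [feature.items() for feature in layer]
```

Something like read_by_id("your.parquet", (10.0, 50.0, 11.0, 51.0), 42) would then only need to consider features inside that (hypothetical) box.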

Mike


-- 

Michael Smith 
RSGIS Center – ERDC CRREL NH 
US Army Corps 





On 1/23/26, 9:09 AM, "Even Rouault" <even.rouault at spatialys.com> wrote:




On 23/01/2026 at 08:32, Ari Jolma wrote:
> Thanks Even,
>
> Attribute filter fid = <fid> seems fast but ID = <ID> is not fast. 


Probably because your features appear in random ID order, so the 
target ID you're looking for falls within the range of ID values of most 
row groups, which leads to loading everything (whereas the OGR "fid" is 
sequential, so it is easy to load only the row group containing 
it). You could try using the "PARQUET:your.parquet" connection string, 
which goes through the Arrow dataset API, whose filtering logic 
possibly uses page statistics (which the "your.parquet" file syntax 
doesn't do), but this might not help a lot. There are no indexes in Parquet.
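
The effect described above can be illustrated with a toy model in plain Python. This is not GDAL or Arrow code: the functions row_groups and groups_to_read, the row-group size, and the data are all made up for the illustration, and a "row group" is reduced to the (min, max) pair a reader could use to skip it.

```python
import random


def row_groups(values, size):
    """Split values into consecutive groups and keep (min, max) per group."""
    groups = [values[i:i + size] for i in range(0, len(values), size)]
    return [(min(g), max(g)) for g in groups]


def groups_to_read(stats, target):
    """Count the groups whose [min, max] range could contain target."""
    return sum(1 for lo, hi in stats if lo <= target <= hi)


n, size = 100_000, 10_000

# Sequential values, like the OGR "fid": each group covers a disjoint range.
sequential = list(range(n))

# The same values in random order, like an ID column written unsorted.
shuffled = sequential[:]
random.Random(0).shuffle(shuffled)

target = 54_321
seq_hits = groups_to_read(row_groups(sequential, size), target)
rand_hits = groups_to_read(row_groups(shuffled, size), target)

print("row groups to read, sequential:", seq_hits)
print("row groups to read, shuffled:  ", rand_hits)
```

With sequential values only one group can contain the target, while with shuffled values nearly every group's range spans it, so statistics-based skipping excludes almost nothing. Under this model, sorting the file by ID before writing would be one way to make the ID ranges disjoint again.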


-- 
http://www.spatialys.com
My software is free, but my time generally not.







