[gdal-dev] gdal parquet and hive partitioning

Michael Smith michael.smith.erdc at gmail.com
Sun Dec 28 04:36:47 PST 2025


Meant to include: 
gdal --version
GDAL 3.12.0 "Chicoutimi", released 2025/11/03



On 12/28/25, 7:26 AM, "Michael Smith" <michael.smith.erdc at gmail.com> wrote:


I know that GDAL can write Parquet data with Hive partitioning using gdal vector partition. After doing so, can GDAL perform partition elimination on read when a where clause (attribute filter) is specified on the partition key?


I was trying to do a pipeline with:
gdal vector pipeline ! read "/vsis3/bucket/overture/20251217/overture-buildings/" ! filter --bbox -117.486117584442,33.9156194185775,-117.333055544584,33.9745995301481 --where "country='US'" ! write -f parquet /tmp/test1.parquet --progress --overwrite


but in the CPL_DEBUG output I see it scanning all the partitions rather than querying only the country=US partition.


S3: Downloading 0-1605631 (https://bucket.s3.us-east-1.amazonaws.com/overture/20251217/overture-buildings/country%3DAI/data_0.parquet)...
S3: Got response_code=206
S3: Downloading 0-16383999 (https://bucket.s3.us-east-1.amazonaws.com/overture/20251217/overture-buildings/country%3DAL/data_2.parquet)...
S3: Got response_code=206
S3: Downloading 0-16383999 (https://bucket.s3.us-east-1.amazonaws.com/overture/20251217/overture-buildings/country%3DAL/data_3.parquet)...
S3: Got response_code=206
S3: Downloading 16384000-32767999 (https://bucket.s3.us-east-1.amazonaws.com/overture/20251217/overture-buildings/country%3DAL/data_2.parquet)...
S3: Got response_code=206
S3: Downloading 16384000-29741378 (https://bucket.s3.us-east-1.amazonaws.com/overture/20251217/overture-buildings/country%3DAL/data_3.parquet)...
....
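
For illustration only, here is a minimal Python sketch of what partition elimination would mean for the layout seen in the log above: decode the Hive-style key=value directory segments and keep only the files whose partition value matches the filter. This is not how GDAL or Arrow implements it. The helper names (partition_values, prune) are hypothetical, and the country=US path in the sample list is an assumed example that does not appear in the log excerpt.

```python
from urllib.parse import unquote


def partition_values(path):
    """Return the Hive-style key=value pairs found in the directory
    segments of a path (the final segment is treated as the file name)."""
    values = {}
    for segment in path.split("/")[:-1]:
        decoded = unquote(segment)  # "country%3DAI" -> "country=AI"
        if "=" in decoded:
            key, _, value = decoded.partition("=")
            values[key] = value
    return values


def prune(paths, key, wanted):
    """Keep only the paths whose partition `key` equals `wanted`."""
    return [p for p in paths if partition_values(p).get(key) == wanted]


# Relative paths modelled on the log; the country=US entry is assumed.
paths = [
    "overture-buildings/country%3DAI/data_0.parquet",
    "overture-buildings/country%3DAL/data_2.parquet",
    "overture-buildings/country%3DAL/data_3.parquet",
    "overture-buildings/country%3DUS/data_0.parquet",
]

# With pruning, only files under country=US would need to be read.
print(prune(paths, "country", "US"))
```

With this kind of pruning, a filter on country='US' would never touch the country=AI or country=AL files that show up in the debug output.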






-- 


Michael Smith 
RSGIS Center – ERDC CRREL NH 
US Army Corps 













