[gdal-dev] Odd issue with parquet using gdal python opening inside Django code
Even Rouault
even.rouault at spatialys.com
Tue Jan 13 16:33:24 PST 2026
Mike,
any chance you can provide such a sample file? From the logs, it uses
geoarrow.polygon encoding, which should be fine in theory. My hypothesis
is that, in the Django context, something (pyarrow maybe) registers the
geoarrow.polygon extension with libarrow. That said, the Parquet driver
is supposed to be robust to that, so there must be some subtlety. When
rewriting the file, WKB encoding is selected, which works around any
potential conflict related to a geoarrow.polygon extension being loaded.
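
For reference, a minimal sketch of such a rewrite with the GDAL Python
bindings. GEOMETRY_ENCODING is a layer creation option of the Parquet
driver; the function names are hypothetical, and this assumes it is run
in a process where the file opens correctly (the plain shell, for
instance).

```python
def wkb_rewrite_options():
    """Keyword arguments for gdal.VectorTranslate() requesting WKB geometries."""
    return {
        "format": "Parquet",
        # Layer creation option of the Parquet driver; WKB avoids relying on
        # the geoarrow.polygon extension type when the file is read back.
        "layerCreationOptions": ["GEOMETRY_ENCODING=WKB"],
    }


def rewrite_as_wkb(src_path, dst_path):
    """Copy src_path to dst_path as Parquet with WKB-encoded geometries."""
    from osgeo import gdal  # imported lazily so the helper above needs no GDAL

    gdal.UseExceptions()
    src_ds = gdal.OpenEx(src_path, gdal.OF_VECTOR)
    out_ds = gdal.VectorTranslate(dst_path, src_ds, **wkb_rewrite_options())
    # Dropping the references closes and flushes the datasets.
    out_ds = None
    src_ds = None


# Example call (hypothetical paths):
# rewrite_as_wkb("input.parquet", "input_wkb.parquet")
```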
Even
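
One way to test the hypothesis above is to look at which Arrow-related
Python packages are already imported in each context, just before
calling gdal.OpenEx(). The sketch below only inspects sys.modules; the
list of package names is an assumption and should be adjusted to the
actual environment. It does not prove that an extension type was
registered, only that a package able to do so has been loaded.

```python
import sys

# Assumed list of top-level packages that may load libarrow and register
# geoarrow extension types; adjust as needed.
SUSPECT_PACKAGES = ("pyarrow", "geoarrow")


def loaded_suspects(modules=None, suspects=SUSPECT_PACKAGES):
    """Return the sorted names of imported modules under the suspect packages."""
    if modules is None:
        modules = sys.modules
    # list() guards against the mapping changing size while iterating.
    return sorted(
        name for name in list(modules)
        if name.split(".")[0] in suspects
    )


# Run in both the Django code path and the Django shell, then compare:
# print(loaded_suspects())
```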
On 14/01/2026 at 01:03, Michael Smith via gdal-dev wrote:
> We use gdal with Django and build gdal from conda and install the parquet driver that way. But we have just found an odd issue when a parquet file is opened in our Django code vs in a Django shell.
>
>
> In the shell we get
>
>
> gf = gdal.OpenEx('/tmp/stac_48069_usgs-3dep_rasters.parquet')
> GDAL: On-demand registering /home/gridusr/gridpixi/.pixi/envs/default/lib/gdalplugins/ogr_Parquet.so using RegisterOGRParquet.
> GDAL: GDALOpen(/tmp/stac_48069_usgs-3dep_rasters.parquet, this=0x5626a2e395a0) succeeds as Parquet.
> PARQUET: Compression (of first column): zstd
> ARROW: Memory pool: bytes_allocated = 0
> ARROW: Memory pool: max_memory = 0
> GDAL: GDALClose(/vsis3/grid-dev-publiclidar/stac/testgrid/rasters/stac_48069_usgs-3dep_rasters.parquet, this=0x5626a2e38990)
>
>
> And everything works fine
>
>
> In the code we get
> GDAL: On-demand registering /home/gridusr/gridpixi/.pixi/envs/default/lib/gdalplugins/ogr_Parquet.so using RegisterOGRParquet.
> GDAL: GDALOpen(/tmp/stac_48069_usgs-3dep_rasters.parquet, this=0x7f43945696b0) succeeds as Parquet.
> PARQUET: Dealing with field GEOMETRY of extension type geoarrow.polygon as list<rings: list<vertices: struct<x: double not null, y: double not null> not null> not null>
> PARQUET: Compression (of first column): zstd
> GDAL: GDALOpen(/tmp/stac_48069_usgs-3dep_rasters.parquet, this=0x7f43945d6760) succeeds as Parquet.
> OGR: GetLayerCount() = 1
>
>
> And this line seems to be the issue
> PARQUET: Dealing with field GEOMETRY of extension type geoarrow.polygon as list<rings: list<vertices: struct<x: double not null, y: double not null> not null> not null>
>
>
> In this case, the geometry is not recognized.
> Calling gdal.VectorInfo(gf) I see
>
>
> INFO: Open of `/tmp/stac_48069_usgs-3dep_rasters.parquet'
> using driver `Parquet' successful.
>
>
> Layer name: stac_48069_usgs-3dep_rasters
> Geometry: None
> Feature Count: 109314
> Layer SRS WKT:
> (unknown)
> GEOMETRY: String(JSON) (0.0)
>
>
> In the django python shell we get
> In [7]: print (res)
> INFO: Open of `/tmp/stac_48069_usgs-3dep_rasters.parquet'
> using driver `Parquet' successful.
>
>
> Layer name: stac_48069_usgs-3dep_rasters
> Geometry: Polygon
> Feature Count: 109314
> Extent: (-179.001667, -15.001667) - (180.000000, 84.001667)
> Layer SRS WKT:
> (unknown)
> Geometry Column = GEOMETRY
>
>
> Any ideas?
>
>
>
>
--
http://www.spatialys.com
My software is free, but my time generally not.