[gdal-dev] Extracting data from a parquet file

Even Rouault even.rouault at spatialys.com
Mon Jul 22 12:29:35 PDT 2024


On 22/07/2024 at 21:10, Joaquim Manuel Freire Luís wrote:
>
> Even,
>
> Thanks for the explanation. But how did you find the name of the 
> geometries (geo_point_2D and geo_shape)? Loading the 
> “world-administrative-boundaries.parquet” in a binary editor I can see 
> them there, but that’s certainly not the way to find these things.
>
$ ogrinfo world-administrative-boundaries.parquet -al -so | grep "Geometry Column"
Geometry Column 1 = geo_point_2d
Geometry Column 2 = geo_shape
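
As a minimal sketch of how the same lookup could be scripted: the helper below reads `ogrinfo -al -so` output on stdin and prints only the geometry column names, relying on the "Geometry Column N = name" lines shown above. The function names `geom_columns` and `extract_each_geometry` and the `out_<column>.gpkg` naming are hypothetical, introduced only for illustration; the file name, layer name and SQL form are the ones used in this thread.

```shell
#!/bin/sh

# Print the geometry column names found in `ogrinfo -al -so` output (stdin).
# Matches lines such as "Geometry Column 1 = geo_point_2d"; the pattern also
# tolerates a line without a number ("Geometry Column = name").
geom_columns() {
    sed -n 's/^Geometry Column[ 0-9]*= *//p'
}

# Hypothetical wrapper: write one GeoPackage per geometry column.
# Assumes a GDAL build with the Parquet driver; it is only defined here,
# nothing runs until it is called explicitly.
extract_each_geometry() {
    src=world-administrative-boundaries.parquet
    ogrinfo "$src" -al -so | geom_columns | while read -r col; do
        ogr2ogr "out_$col.gpkg" "$src" \
            -sql "select $col, * from \"world-administrative-boundaries\""
    done
}
```

Calling `extract_each_geometry` in the directory holding the file would then be expected to produce one GeoPackage per geometry column. Parsing the ogrinfo text output is a convenience rather than a stable interface, so treat it as a quick check, not a robust tool.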

> Joaquim
>
> From: Even Rouault <even.rouault at spatialys.com>
> Sent: Monday, July 22, 2024 2:29 PM
> To: Joaquim Manuel Freire Luís <jluis at ualg.pt>; gdal-dev at lists.osgeo.org
> Subject: Re: [gdal-dev] Extracting data from a parquet file
>
> Joaquim,
>
> The GeoPackage format only supports one geometry field per layer, and 
> the QGIS OGR provider currently doesn't know how to handle several 
> geometry fields per layer either.
>
> To do what you want, you need to explicitly select the desired geometry 
> field name with:
>
> ogr2ogr out.gpkg world-administrative-boundaries.parquet -sql "select geo_shape, * from \"world-administrative-boundaries\""
>
> Actually, if you output to a format that supports several geometry 
> fields per layer (let's say PostGIS), the above wouldn't work. You 
> would need to exclude the geometry fields from the wildcard * 
> selection with:
>
> ogr2ogr out.gpkg world-administrative-boundaries.parquet -sql "select geo_shape, * exclude (geo_point_2D, geo_shape) from \"world-administrative-boundaries\""
>
> Even
>
> On 19/07/2024 at 16:58, Joaquim Manuel Freire Luís via gdal-dev wrote:
>
>     Hi,
>
>     I finally managed to build a working GDAL with the arrow/parquet
>     driver and I’m now trying to convert this file
>
>     (https://public.opendatasoft.com/api/explore/v2.1/catalog/datasets/world-administrative-boundaries/exports/parquet?lang=en&timezone=Europe%2FLondon)
>
>     but I can only extract the “Point”, not the “Multi Polygon”.
>
>     ogrinfo world-administrative-boundaries.parquet
>     INFO: Open of `world-administrative-boundaries.parquet'
>           using driver `Parquet' successful.
>     1: world-administrative-boundaries (Point, Multi Polygon)
>
>     This gets only the points
>
>     ogr2ogr lixo.gpkg world-administrative-boundaries.parquet
>
>     The same happens if I open the file in QGIS. Points only, no polygons.
>
>     But if I do an ogrinfo -al, it prints all the data in the file.
>
>     ogrinfo -al world-administrative-boundaries.parquet
>
>     ….
>
>     OGRFeature(world-administrative-boundaries):255
>       iso3 (String) = GIB
>       status (String) = UK Non-Self-Governing Territory
>       color_code (String) = GBR
>       name (String) = Gibraltar
>
>     So, how can we tell ogr2ogr to extract the polygons?
>
>
> -- 
> http://www.spatialys.com
> My software is free, but my time generally not.