[gdal-dev] Extracting data from a parquet file

Joaquim Manuel Freire Luís jluis at ualg.pt
Mon Jul 22 12:10:51 PDT 2024


Even,

Thanks for the explanation. But how did you find the names of the geometry fields (geo_point_2D and geo_shape)? Loading “world-administrative-boundaries.parquet” in a binary editor, I can see them there, but that’s certainly not the way to find these things.

Joaquim

From: Even Rouault <even.rouault at spatialys.com>
Sent: Monday, July 22, 2024 2:29 PM
To: Joaquim Manuel Freire Luís <jluis at ualg.pt>; gdal-dev at lists.osgeo.org
Subject: Re: [gdal-dev] Extracting data from a parquet file


Joaquim,

The GeoPackage format only supports one geometry field per layer, and the QGIS OGR provider doesn't currently know how to handle several geometry fields per layer either.

To do what you want, you need to explicitly select the desired geometry field name with:

ogr2ogr out.gpkg world-administrative-boundaries.parquet -sql "select geo_shape, * from \"world-administrative-boundaries\""

Actually, if you were writing to a format that supports several geometry fields per layer (let's say PostGIS), the above wouldn't work. You would need to exclude the geometry fields from the * wildcard selection with:

ogr2ogr out.gpkg  world-administrative-boundaries.parquet -sql "select geo_shape, * exclude (geo_point_2D, geo_shape) from \"world-administrative-boundaries\""
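
As a rough sketch of how the same statement could be generated for each geometry field in turn, the helpers below build the -sql string and write one GeoPackage per field. The function names geom_sql and export_geoms and the out_<field>.gpkg file naming are hypothetical, chosen only for illustration. The sketch also assumes that field names contain no spaces and that the GDAL build accepts the exclude syntax used above.

```shell
# Build the -sql statement that selects one geometry field and removes all
# geometry fields from the * wildcard.
# usage: geom_sql LAYER KEEP_FIELD "GEOM_FIELD1, GEOM_FIELD2"
geom_sql() {
  printf 'select %s, * exclude (%s) from "%s"' "$2" "$3" "$1"
}

# Hypothetical wrapper: run ogr2ogr once per geometry field, writing each
# result to its own file (out_<field>.gpkg is an assumed naming scheme).
# usage: export_geoms SOURCE LAYER "GEOM_FIELD1, GEOM_FIELD2"
export_geoms() {
  src=$1
  layer=$2
  all=$3
  # Split the comma-separated list into words (no spaces in field names).
  for g in $(printf '%s' "$all" | tr ',' ' '); do
    ogr2ogr "out_${g}.gpkg" "$src" -sql "$(geom_sql "$layer" "$g" "$all")"
  done
}
```

It could then be called as: export_geoms world-administrative-boundaries.parquet world-administrative-boundaries "geo_point_2D, geo_shape"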

Even
On 19/07/2024 at 16:58, Joaquim Manuel Freire Luís via gdal-dev wrote:
Hi,

I finally managed to build a working GDAL with the arrow/parquet driver and I’m now trying to convert this file
(https://public.opendatasoft.com/api/explore/v2.1/catalog/datasets/world-administrative-boundaries/exports/parquet?lang=en&timezone=Europe%2FLondon)
but I can only extract the “Point”, not the “Multi Polygon”

ogrinfo world-administrative-boundaries.parquet
INFO: Open of `world-administrative-boundaries.parquet'
      using driver `Parquet' successful.
1: world-administrative-boundaries (Point, Multi Polygon)

This gets only the points

ogr2ogr lixo.gpkg world-administrative-boundaries.parquet

The same happens if I open the file in QGIS. Points only, no polygons.

But if I do an ogrinfo -al, it prints all the data in the file.

ogrinfo -al world-administrative-boundaries.parquet

….

OGRFeature(world-administrative-boundaries):255
  iso3 (String) = GIB
  status (String) = UK Non-Self-Governing Territory
  color_code (String) = GBR
  name (String) = Gibraltar
…

So, how can we tell ogr2ogr to extract the polygons?



--

http://www.spatialys.com

My software is free, but my time generally not.