[gdal-dev] writing arrow geometry

Joris Van den Bossche jorisvandenbossche at gmail.com
Mon Oct 7 07:07:39 PDT 2024


The section about MultiPolygons at
https://geoarrow.org/format.html#memory-layouts mentions:

> The child name of the outer list should be “polygons”; the child name of the middle list should be “rings”; the child name of the inner list should be “vertices”.

So the list field names are currently phrased as a "should" and not a
"must", which means the data generated by GDAL is valid under that
description.

It might be good to verify that other consumers can indeed handle
such data (and to add some test data with varying field names to
test against).
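
As a rough illustration of what such a check could look like, here is a minimal sketch in plain Python. It does not use any Arrow library: `Node`, `layout` and `same_layout` are hypothetical helpers made up for this example, and the three types are transcribed by hand from the schemas printed further down in this thread. The assumption is that a tolerant consumer ignores the child names of list types but still distinguishes the coordinate encodings (interleaved vs. struct).

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Node:
    """Simplified stand-in for an Arrow type: a kind plus named children."""
    kind: str
    children: Tuple[Tuple[str, "Node"], ...] = ()


def layout(node):
    """Describe a type, dropping child names of list-like types only."""
    if node.kind.startswith(("list", "fixed_size_list")):
        # List child names ("item", "rings", "vertices", "xy") are ignored.
        return (node.kind, tuple(layout(child) for _, child in node.children))
    # For anything else (e.g. struct<x, y>) the child names are kept.
    return (node.kind, tuple((name, layout(child)) for name, child in node.children))


def same_layout(a, b):
    """True if two types differ at most in their list child names."""
    return layout(a) == layout(b)


def lst(name, child):
    """Build a list type with a single named child."""
    return Node("list", ((name, child),))


DOUBLE = Node("double")
xy = Node("fixed_size_list(2)", (("xy", DOUBLE),))
xy_struct = Node("struct", (("x", DOUBLE), ("y", DOUBLE)))

# Polygon types as printed in this thread (hand-transcribed):
geoarrow_sample = lst("rings", lst("vertices", xy))    # polygon-default.ipc
gdal_interleaved = lst("item", lst("item", xy))        # GEOARROW_INTERLEAVED
gdal_separated = lst("item", lst("item", xy_struct))   # GDAL default output

print(same_layout(geoarrow_sample, gdal_interleaved))  # names differ only
print(same_layout(geoarrow_sample, gdal_separated))    # encoding differs
```

Under this comparison the interleaved GDAL output matches the sample file, while the default (struct) output differs in coordinate encoding and not merely in field names.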

Joris

On Mon, 7 Oct 2024 at 15:39, Even Rouault via gdal-dev
<gdal-dev at lists.osgeo.org> wrote:
>
> Michael,
>
> my understanding of https://geoarrow.org/format.html#memory-layouts is that what OGR writes is supposed to be fine, since they mention types like 'List<List<FixedSizeList<double>[2]>>'. Perhaps I've missed something, or nanoarrow has stricter expectations? CC'ing Dewey Dunnington
>
> Even
>
> On 07/10/2024 at 15:23, Michael Sumner via gdal-dev wrote:
>
> I realize I left out the interleaved encoding option, i.e.
>
> ogr2ogr ~/fromgdal.arrow ogr/data/arrow/from_paleolimbot_geoarrow/polygon-default.ipc -lco GEOMETRY_ENCODING=GEOARROW_INTERLEAVED
>
> but still, I get these list<item elements rather than their rings/vertices/geoarrow.point type names:
>
> <nanoarrow_array_stream struct<row_num: int32, geometry: geoarrow.polygon{list<item: list<item: fixed_size_list(2)<xy: double>>>}>>
>
>
>
> On Tue, Oct 8, 2024 at 12:19 AM Michael Sumner <mdsumner at gmail.com> wrote:
>>
>> When I investigate the schema in one of the test files
>>
>> ogr/data/arrow/from_paleolimbot_geoarrow/polygon-default.ipc
>>
>> I see the expected list<polygons and list<rings and xy etc. I'm printing this using R nanoarrow::read_arrow, or from poLayer->GetArrowStream, and I get the same output:
>>
>> <nanoarrow_array_stream struct<row_num: int32, geometry: geoarrow.polygon{list<rings: list<vertices: geoarrow.point{fixed_size_list(2)<xy: double>}>>}>>
>>
>> If I write a new .arrow with GDAL
>>
>> ogr2ogr ~/fromgdal.arrow ogr/data/arrow/from_paleolimbot_geoarrow/polygon-default.ipc
>>
>> the stream schema looks like this:
>>
>> <nanoarrow_array_stream struct<row_num: int32, geometry: geoarrow.polygon{list<item: list<item: struct<x: double, y: double>>>}>>
>>
>> and from nanoarrow I see
>>
>> nanoarrow::read_nanoarrow("~/fromgdal.arrow")
>> Error in read_nanoarrow.character("~/fromgdal.arrow") :
>>   array_stream->get_schema(): [29] Expected >= 1330795077 bytes of remaining data but found 2266 bytes in buffer
>>
>> Are we in-between moves regarding specifications, or something? I'm having good results generally and this seems like a problem in the Arrow driver for write.
>>
>> Cheers, Mike
>>
>>
>> --
>> Michael Sumner
>> Research Software Engineer
>> Australian Antarctic Division
>> Hobart, Australia
>> e-mail: mdsumner at gmail.com
>
> _______________________________________________
> gdal-dev mailing list
> gdal-dev at lists.osgeo.org
> https://lists.osgeo.org/mailman/listinfo/gdal-dev
>
> --
> http://www.spatialys.com
> My software is free, but my time generally not.

