[gdal-dev] ogr2ogr for downloading extracts from overturemaps

Even Rouault even.rouault at spatialys.com
Thu Oct 24 13:01:04 PDT 2024


Hi,

This has been much improved in the upcoming GDAL 3.10.0; see in particular
https://github.com/OSGeo/gdal/blob/15589fea354e69f606af2a856828ecd506cb87b7/NEWS.md?plain=1#L538
Now only the header and trailer of part-00000 are read.
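For context on why reading only file trailers can be enough: a Parquet file's trailer (footer) stores per-row-group column statistics, so a reader can decide which row groups might intersect a bounding box without loading them. The snippet below is a minimal pure-Python sketch of that pruning idea, not GDAL's actual implementation. The RowGroupStats class, the row_groups_to_read function and the sample numbers are all hypothetical, and it assumes each row group exposes min/max statistics for Overture-style bbox columns (xmin, ymin, xmax, ymax).

```python
from dataclasses import dataclass
from typing import List, Tuple

# Query bounding box as (xmin, ymin, xmax, ymax).
BBox = Tuple[float, float, float, float]


@dataclass
class RowGroupStats:
    """Hypothetical summary of one row group's bbox statistics.

    xmin/ymin are the minimum of the bbox lower corners over the row group,
    xmax/ymax the maximum of the bbox upper corners.
    """
    index: int
    xmin: float
    ymin: float
    xmax: float
    ymax: float


def row_groups_to_read(stats: List[RowGroupStats], query: BBox) -> List[int]:
    """Return indices of row groups whose extent intersects the query bbox."""
    qxmin, qymin, qxmax, qymax = query
    return [
        rg.index
        for rg in stats
        if rg.xmin <= qxmax and rg.xmax >= qxmin
        and rg.ymin <= qymax and rg.ymax >= qymin
    ]


# Made-up statistics for three row groups (illustrative values only).
stats = [
    RowGroupStats(0, -10.0, 40.0, 0.0, 50.0),
    RowGroupStats(1, 2.0, 48.0, 3.0, 49.0),
    RowGroupStats(2, 100.0, -40.0, 150.0, -10.0),
]
query_bbox = (2.2, 48.8, 2.5, 48.9)
selected = row_groups_to_read(stats, query_bbox)
print(selected)
```

With these made-up statistics only the second row group overlaps the query box, so the other two would never need to be fetched.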

That said, duckdb will likely still outperform the OGR GeoParquet driver
(GDAL 3.11, with https://github.com/OSGeo/gdal/pull/11003, will make it
possible to use libduckdb).

Even

On 24/10/2024 at 21:41, Varun Sharma via gdal-dev wrote:
> Hello GDAL'ers,
>
> I have made a few attempts at using ogr2ogr for getting bounding box 
> based extracts from overturemaps datasets.
>
> Unfortunately I have not been able to do so: something that takes duckdb
> or overturemaps-py <https://github.com/OvertureMaps/overturemaps-py> 30s
> or less takes forever when using ogr2ogr. overturemaps-py is
> essentially a wrapper over pyarrow, with the arrow filter constructed
> from the bbox.
>
> I suspect I am doing something wrong. Less likely, ogr2ogr is not the
> right tool for this.
>
> Attempt 1: Command at the top of the link
> ---------------------------------------------
> https://pastebin.com/bh05Kcww
>
> Attempt 2:
> ----------------------------------------------
>
> https://pastebin.com/BG3WmQ9Y
>
> From what I can tell, all row groups from each of the parquet files are
> being loaded and checked. This is clearly not correct.
>
> Below are my libs and versions on Ubuntu 20.04. All attempts are
> within a conda environment.
>
> gdal                      3.9.2
> gcc_linux-64              12.4.0
> libarrow                  17.0.0
> libarrow-dataset          17.0.0
> libparquet                17.0.0
> zstd                      1.5.6
> libgdal-core              3.9.2
> libgdal-arrow-parquet     3.9.2
> libcurl/8.9.1
> OpenSSL/3.3.2
>
> I typically use the command line tools to test GDAL/OGR's
> functionality and performance before embedding that functionality in
> my own C++ app. Thus, while there are other tools, I would love to
> understand how to do this in GDAL/OGR.
>
> Please advise!
>
> cheers,
> Varun

-- 
http://www.spatialys.com
My software is free, but my time generally not.
Butcher of all kinds of standards, open or closed formats. At the end, this is just about bytes.