[gdal-dev] Expected runtime of polygonize (GDAL 3.9.0) for few very large features.

Meyer, Jesse R. (GSFC-618.0)[SCIENCE SYSTEMS AND APPLICATIONS INC] jesse.r.meyer at nasa.gov
Mon Jul 1 09:40:18 PDT 2024


Hi,

We’ve encountered a few images with what seem like pathological performance problems in gdal.Polygonize. The details below come from another developer’s report, which I haven’t yet independently verified.

We threshold a raster image to a binary mask in an in-memory dataset, then pass that band as both the source band and the mask band so that the background is masked out:
gdal.Polygonize(nn_mem_band, nn_mem_band, ogr_mem_lyr, -1)

We have a number of 32k x 32k raster images that contain a few very large same-valued regions (some as large as 80% of the entire raster). We’re seeing ~10 hours on a modern workstation to complete the line of code above. OpenCV can apparently construct a connected components list in mere seconds on the same workstation and image, so we’re considering constructing the OGR geometries directly from those components as a temporary workaround.
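
To make the workaround idea concrete, here is a minimal pure-Python sketch of it on a toy mask. It is only an illustration under stated assumptions: it is not GDAL's algorithm, not OpenCV's, and far too slow for a 32k x 32k tile. The function names (label_components, boundary_rings, ring_area, to_wkt) are hypothetical, 4-connectivity is assumed, coordinates are pixel corners rather than georeferenced, and vertices where two pixels of the same component touch only diagonally get no special handling. The resulting WKT string is the kind of input that ogr.CreateGeometryFromWkt accepts.

```python
from collections import defaultdict, deque


def label_components(mask):
    """4-connected labelling of truthy pixels: {label: [(row, col), ...]}."""
    rows, cols = len(mask), len(mask[0])
    seen = [[False] * cols for _ in range(rows)]
    components = {}
    for r in range(rows):
        for c in range(cols):
            if not mask[r][c] or seen[r][c]:
                continue
            # Breadth-first flood fill from this unvisited foreground pixel.
            queue = deque([(r, c)])
            seen[r][c] = True
            pixels = []
            while queue:
                y, x = queue.popleft()
                pixels.append((y, x))
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < rows and 0 <= nx < cols
                            and mask[ny][nx] and not seen[ny][nx]):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            components[len(components) + 1] = pixels
    return components


def boundary_rings(pixels):
    """Closed rings of (x, y) pixel corners bounding one component."""
    edges = set()
    for y, x in pixels:
        corners = [(x, y), (x + 1, y), (x + 1, y + 1), (x, y + 1)]
        for start, end in zip(corners, corners[1:] + corners[:1]):
            if (end, start) in edges:
                edges.remove((end, start))  # shared with a neighbour: interior
            else:
                edges.add((start, end))
    outgoing = defaultdict(list)
    for start, end in edges:
        outgoing[start].append(end)
    rings = []
    for start in list(outgoing):
        while outgoing[start]:
            # Follow directed boundary edges until the ring closes.
            ring = [start]
            point = outgoing[start].pop()
            while point != start:
                ring.append(point)
                point = outgoing[point].pop()
            ring.append(start)
            rings.append(ring)
    return rings


def ring_area(ring):
    """Signed shoelace area; positive for outer rings, negative for holes."""
    return 0.5 * sum(x1 * y2 - x2 * y1
                     for (x1, y1), (x2, y2) in zip(ring, ring[1:]))


def to_wkt(rings):
    """POLYGON WKT with the largest-area ring (the exterior) listed first."""
    ordered = sorted(rings, key=ring_area, reverse=True)
    parts = ["(" + ", ".join(f"{x} {y}" for x, y in ring) + ")"
             for ring in ordered]
    return "POLYGON (" + ", ".join(parts) + ")"


# Toy mask: a ring of 8 pixels with a 1-pixel hole, plus one isolated pixel.
mask = [
    [1, 1, 1, 0],
    [1, 0, 1, 0],
    [1, 1, 1, 0],
    [0, 0, 0, 1],
]
for label, pixels in label_components(mask).items():
    print(label, len(pixels), to_wkt(boundary_rings(pixels)))
```

In a real workflow the labelling step would come from OpenCV, and the rings would still need collinear vertices removed and the geotransform applied before being written to the OGR layer.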

Is this situation a known pitfall with the current algorithm / data structures behind Polygonize?

I’m able to share the problematic tile(s) if they are of interest.
Best
Jesse