[gdal-dev] gdal_polygonize.py TIF to JSON performance

Graeme B. Bell grb at skogoglandskap.no
Mon Jan 19 02:24:22 PST 2015


>> Whenever you deal with national scale data for any country with coastline, you frequently end up with an absolutely gigantic and horrifically complex single polygon which depicts the coastline and all the rivers throughout the country as a single continuous edge. This mega-polygon, so often present and so often necessary, is very time-consuming for gdal_polygonize to produce and the result is very painful for every GIS geometry package to handle.
>> 
>> It would be great if the people behind gdal_polygonize could put some thought into this extremely common situation for anyone working with country or continent scale rasters to make sure that it is handled well. It has certainly affected us a great deal when working with data at up to 2m resolution for a country larger than the UK...
>> 
> 
> This second use case is a very bad mismatch to the design of the polygonize algorithm, as you have discovered. If this is indeed a common use case (I have no basis to judge one way or the other), a very different algorithm would be far more appropriate. In many respects, the desired algorithm would also be easier to implement if the nature of the data is basically binary: in the country/region/polygon, or out. Matters get more complicated if we allow holes or disconnected regions, or have multiple regions (even a small number) to identify.
> 
> Perhaps one of the core project members could describe how to build a case to support this need and submit an enhancement request.

Hi David, thanks for the reply.

It's probably worth noting that there are analogous situations for countries without coastline. Consider a national map raster for a landlocked country that has been masked to remove its neighbours. You could get a giant polygon surrounding the country that carries the full complexity of the border in a single geometry. It's not quite so fiendish as the example I gave above, but it can happen pretty easily.
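
To make the masked-raster case concrete, here is a minimal pure-Python sketch on a toy grid. It does not use GDAL and is not how gdal_polygonize works internally; the grid, the MASKED/COUNTRY values and both helper functions are made up for illustration. It only shows that the masked-out area forms one connected region, and that every cell edge of the border ends up on that one region's outline.

```python
from collections import deque

MASKED, COUNTRY = 0, 1  # hypothetical pixel values for this toy example

# Toy raster: a country with two small inlets, surrounded by masked-out cells.
grid = [
    [0, 0, 0, 0, 0, 0],
    [0, 1, 1, 0, 1, 0],
    [0, 1, 1, 1, 1, 0],
    [0, 1, 0, 1, 1, 0],
    [0, 0, 0, 0, 0, 0],
]


def neighbours(r, c, rows, cols):
    """Yield the 4-connected neighbours of (r, c) that lie inside the grid."""
    for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
        if 0 <= nr < rows and 0 <= nc < cols:
            yield nr, nc


def count_regions(grid, value):
    """Count 4-connected regions of cells equal to `value` (flood fill)."""
    rows, cols = len(grid), len(grid[0])
    seen = [[False] * cols for _ in range(rows)]
    regions = 0
    for r in range(rows):
        for c in range(cols):
            if grid[r][c] != value or seen[r][c]:
                continue
            regions += 1
            seen[r][c] = True
            queue = deque([(r, c)])
            while queue:
                cr, cc = queue.popleft()
                for nr, nc in neighbours(cr, cc, rows, cols):
                    if grid[nr][nc] == value and not seen[nr][nc]:
                        seen[nr][nc] = True
                        queue.append((nr, nc))
    return regions


def shared_edges(grid, a, b):
    """Count unit cell edges separating a cell of value `a` from one of value `b`."""
    rows, cols = len(grid), len(grid[0])
    return sum(
        1
        for r in range(rows)
        for c in range(cols)
        if grid[r][c] == a
        for nr, nc in neighbours(r, c, rows, cols)
        if grid[nr][nc] == b
    )


masked_regions = count_regions(grid, MASKED)
border_edges = shared_edges(grid, COUNTRY, MASKED)
print(masked_regions, border_edges)
```

On this grid the masked cells form a single region, so a polygonizer would have to emit the whole border, inlets included, as part of one polygon. At national scale and fine resolution the same thing happens with millions of edges instead of a handful.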

Also, while I appreciate the value of checking how common a use case is, there is another way of viewing the issue:

Does GDAL have the features and behaviour that are needed for national scale map work, or is it a 'town and county' type of project?

There is a hidden danger in relying on how commonly something occurs as a use case: if a piece of software currently can't support a use case, or supports it incredibly poorly, then there will inevitably be few (likely zero) people using it for that purpose.

Graeme.

