[gdal-dev] Generating data-only footprints of imagery

Matt.Wilkie at yukon.ca Matt.Wilkie at yukon.ca
Wed Aug 18 14:08:51 PDT 2021


Hi Folks,



The standard image footprint mechanism such as that produced by gdaltindex draws a box around the raster but that often includes large nodata areas which we don't care about. Generating polygons of only the data carrying area is something I've often wished for but not devoted the time and attention to solving. This summer I'm very happy to have finally arrived at repeatable solution to the challenge.



I've posted the below in answer to "Creating shapefile showing footprints of Rasters?<https://gis.stackexchange.com/questions/26893/creating-shapefile-showing-footprints-of-rasters/>" on GIS Stack Exchange. After some more refinement I'll also post some of the scripts which I built as part of a larger processing chain. (No promises on how long that might be. I have to work it around my main duties.)



Ensure images have defined nodata. If source images do not, fix with something like one of the below, where 0 or 255 is the supposed-to-be nodata value:

gdal_translate ... -a_nodata 0 ... outimage.vrt

gdal_edit -a_nodata 255 somefile.tif

Create small preview images for faster processing (useful when dealing with multi GB satellite images). Skip this step if you want to match value pixels' boundary precisely. I find using the target resolution -tr parameter or choosing an outsize percentage that approximates 2048 rows/columns works well. Three approaches shown, just use one.

gdal_translate -tr 185 185 vendor_image.tif myimage.tif

gdal_translate -outsize 5% 5% vendor_image.tif myimage.tif

gdal_translate -outsize 2048 0 vendor_image.tif myimage.tif

Force pixel values to range between 1 and 255, then rescale again so that data = 100 and all else are nodata. Use vrt<https://gis.stackexchange.com/questions/tagged/vrt> to not conserve storage and be much faster also. The reason for scaling twice is to work around the problem of not knowing what range the values might be when we start. There's no meaning to 100 other than by default it draws as a midtone grey, in contrast to the traditional 1 which is visually indistinguishable from black.

gdal_translate -scale -ot byte myimage.tif myimage_8bit.vrt

gdal_translate -scale 1 255 100 100 myimage_8bit.vrt myimage_data_mask.vrt

Create a polygon of the data-only area image mask for all connected regions of pixels in the raster sharing a common pixel value, using 8-way connectedness.:

gdal_polygonize -8 myimage_data_mask.vrt .\footprints\myimage_data.shp



Enjoy! Pink is the normal extents area as created by gdaltindex and similar approaches. Black outline is our new data-only polygon.

Image: https://i.stack.imgur.com/Qv1im.png <https://i.stack.imgur.com/Qv1im.png>

Windows CMD shell bulk processing example

cd .\previews

md .\byte

md .\data-only

for %a in (*.tif) do gdal_translate -scale -ot byte %a .\byte\%~na.vrt

for %a in (byte\*.vrt) do gdal_translate -scale 1 255 100 100 %a .\data-only\%~na.vrt

for %a in (data-only\*.vrt) do call gdal_polygonize -8 %a %~na.shp

Assemble the footprints you've created by the hundred into a single overview index. {AUTO_NAME} populates a field with the name of the source file for that polygon. See ogrmerge doc page<https://gdal.org/programs/ogrmerge.html> for other options.

ogrmerge -single ^

  -src_layer_field_name Image_Name ^

  -src_layer_field_content {AUTO_NAME} ^

  -o ..\index_footprint_data.shp ^

  .\data-only\*.shp

Image:<https://i.stack.imgur.com/DbaQU.png> https://i.stack.imgur.com/DbaQU.png

________________________________

Thank you https://gis.stackexchange.com/a/139045/108 for rescaling data-only to single pixel value tip. There are other ways of reaching the same end using gdal_calc but I find this scale approach is easier to wrap my head around.

I think I learned about rescaling without knowing the input values ahead of time from Robert Simmon but have lost the reference. You can't go wrong by consulting his General Introduction to GDAL<https://medium.com/planet-stories/a-gentle-introduction-to-gdal-part-4-working-with-satellite-data-d3835b5e2971> series however.





Hopefully this will be useful for others also,

cheers.


Matt Wilkie
Geomatics Developer & Administrator
Environment | Technology, Innovation and Mapping
T 867-667-8133 | Yukon.ca<http://yukon.ca/>
Hours: 08:30-16:30, Mon-Wed: Office, Thu: Remote, Fri: Away.

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.osgeo.org/pipermail/gdal-dev/attachments/20210818/a32b68f3/attachment.html>


More information about the gdal-dev mailing list