[gdal-dev] Generating data-only footprints of imagery
Matt.Wilkie at yukon.ca
Matt.Wilkie at yukon.ca
Wed Aug 18 14:08:51 PDT 2021
Hi Folks,
The standard image footprint mechanism such as that produced by gdaltindex draws a box around the raster but that often includes large nodata areas which we don't care about. Generating polygons of only the data carrying area is something I've often wished for but not devoted the time and attention to solving. This summer I'm very happy to have finally arrived at repeatable solution to the challenge.
I've posted the below in answer to "Creating shapefile showing footprints of Rasters?<https://gis.stackexchange.com/questions/26893/creating-shapefile-showing-footprints-of-rasters/>" on GIS Stack Exchange. After some more refinement I'll also post some of the scripts which I built as part of a larger processing chain. (No promises on how long that might be. I have to work it around my main duties.)
Ensure images have defined nodata. If source images do not, fix with something like one of the below, where 0 or 255 is the supposed-to-be nodata value:
gdal_translate ... -a_nodata 0 ... outimage.vrt
gdal_edit -a_nodata 255 somefile.tif
Create small preview images for faster processing (useful when dealing with multi GB satellite images). Skip this step if you want to match value pixels' boundary precisely. I find using the target resolution -tr parameter or choosing an outsize percentage that approximates 2048 rows/columns works well. Three approaches shown, just use one.
gdal_translate -tr 185 185 vendor_image.tif myimage.tif
gdal_translate -outsize 5% 5% vendor_image.tif myimage.tif
gdal_translate -outsize 2048 0 vendor_image.tif myimage.tif
Force pixel values to range between 1 and 255, then rescale again so that data = 100 and all else are nodata. Use vrt<https://gis.stackexchange.com/questions/tagged/vrt> to not conserve storage and be much faster also. The reason for scaling twice is to work around the problem of not knowing what range the values might be when we start. There's no meaning to 100 other than by default it draws as a midtone grey, in contrast to the traditional 1 which is visually indistinguishable from black.
gdal_translate -scale -ot byte myimage.tif myimage_8bit.vrt
gdal_translate -scale 1 255 100 100 myimage_8bit.vrt myimage_data_mask.vrt
Create a polygon of the data-only area image mask for all connected regions of pixels in the raster sharing a common pixel value, using 8-way connectedness.:
gdal_polygonize -8 myimage_data_mask.vrt .\footprints\myimage_data.shp
Enjoy! Pink is the normal extents area as created by gdaltindex and similar approaches. Black outline is our new data-only polygon.
Image: https://i.stack.imgur.com/Qv1im.png <https://i.stack.imgur.com/Qv1im.png>
Windows CMD shell bulk processing example
cd .\previews
md .\byte
md .\data-only
for %a in (*.tif) do gdal_translate -scale -ot byte %a .\byte\%~na.vrt
for %a in (byte\*.vrt) do gdal_translate -scale 1 255 100 100 %a .\data-only\%~na.vrt
for %a in (data-only\*.vrt) do call gdal_polygonize -8 %a %~na.shp
Assemble the footprints you've created by the hundred into a single overview index. {AUTO_NAME} populates a field with the name of the source file for that polygon. See ogrmerge doc page<https://gdal.org/programs/ogrmerge.html> for other options.
ogrmerge -single ^
-src_layer_field_name Image_Name ^
-src_layer_field_content {AUTO_NAME} ^
-o ..\index_footprint_data.shp ^
.\data-only\*.shp
Image:<https://i.stack.imgur.com/DbaQU.png> https://i.stack.imgur.com/DbaQU.png
________________________________
Thank you https://gis.stackexchange.com/a/139045/108 for rescaling data-only to single pixel value tip. There are other ways of reaching the same end using gdal_calc but I find this scale approach is easier to wrap my head around.
I think I learned about rescaling without knowing the input values ahead of time from Robert Simmon but have lost the reference. You can't go wrong by consulting his General Introduction to GDAL<https://medium.com/planet-stories/a-gentle-introduction-to-gdal-part-4-working-with-satellite-data-d3835b5e2971> series however.
Hopefully this will be useful for others also,
cheers.
Matt Wilkie
Geomatics Developer & Administrator
Environment | Technology, Innovation and Mapping
T 867-667-8133 | Yukon.ca<http://yukon.ca/>
Hours: 08:30-16:30, Mon-Wed: Office, Thu: Remote, Fri: Away.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.osgeo.org/pipermail/gdal-dev/attachments/20210818/a32b68f3/attachment.html>
More information about the gdal-dev
mailing list