[gdal-dev] Fwd: gdal_calc.py not saving correct statistics (min/max) to new rasters in some cases?

Patrick Sunter patdevelop at gmail.com
Wed Jun 18 21:45:18 PDT 2014


Hi there,

I've been using gdal_calc.py as part of a workflow where I need to compute
the average of several raster files and save as a new output. (Using a
binary build of GDAL, v 1.10.0, on Mac Os X).

It is a fairly standard calc operation (e.g. "(A+B+C)/3") but because the
original data are GeoTiff's storing data in a Byte format, I first create
VRT files referencing these with data type set to be Float32, then perform
the calc operation with these VRT files as input. This is to prevent
'overflow' problems in the midst of the averaging operation given the size
limit of the Byte format (see
http://gis.stackexchange.com/questions/33152/why-do-i-get-different-results-with-gdal-calc-within-the-osgeo4w-shell-and-qgis
)

The output GeoTiff created by gdal_calc.py seems to be have correct values
at every point - but the saved statistics metadata are wrong -
specifically, the minimum value.

An example is doing a gdalinfo -stats on one of these files, I get the
following:
  Min=21.000 Max=128.000
  Minimum=21.000, Maximum=128.000, Mean=81.692, StdDev=29.397

but I know the minimum value should be 0, and this is confirmed by
inspection in QGIS. I also wrote a script to open up the raster in Python
and run the following queries:
print "GetStatistics(): " + str(band_in.GetStatistics(True, True))
print "ComputeRasterMinMax(): " + str(band_in.ComputeRasterMinMax())
print "Hand calc. min: " + str(min(map(min, raster_values)))

I get:
GetStatistics(): [21.0, 128.0, 81.692344139650999, 29.397484687917]
ComputeRasterMinMax(): (0.0, 120.0)
Hand calc. min: 0

So you can see that ComputeRasterMinMax() is returning the right values, as
well as my manual calc using Python's min function.

As a user I presume its reasonable for the output of a   gdal_calc.py to
have latest stats stored correctly on any output it produces? So what is
going wrong?

Maybe its something to do with doing a calc on a set of VRT files using a
different type to the original. This is also a case where the output file
already exists.

Looking at the gdal_calc.py I guess a simple fix is to force an update of
statistic values using ComputeRasterMinMax() before closing the newly
created file, but maybe there is a better solution.

cheers, Patrick.

BTW example gdal_calc.py command:
gdal_calc.py -A
./Comparison-PTV-BZE-V1-201406-ForPres-30minWindows-3eachSide/MelbTrainTramBus-BZE-autostops-withcongestion-WithMotorways-updated-20140514/SURREY_HILLS-21_45_00-BZEV1.vrt
-B
./Comparison-PTV-BZE-V1-201406-ForPres-30minWindows-3eachSide/MelbTrainTramBus-BZE-autostops-withcongestion-WithMotorways-updated-20140514/SURREY_HILLS-21_50_00-BZEV1.vrt
-C
./Comparison-PTV-BZE-V1-201406-ForPres-30minWindows-3eachSide/MelbTrainTramBus-BZE-autostops-withcongestion-WithMotorways-updated-20140514/SURREY_HILLS-21_55_00-BZEV1.vrt
-D
./Comparison-PTV-BZE-V1-201406-ForPres-30minWindows-3eachSide/MelbTrainTramBus-BZE-autostops-withcongestion-WithMotorways-updated-20140514/SURREY_HILLS-22_00_00-BZEV1.vrt
-E
./Comparison-PTV-BZE-V1-201406-ForPres-30minWindows-3eachSide/MelbTrainTramBus-BZE-autostops-withcongestion-WithMotorways-updated-20140514/SURREY_HILLS-22_05_00-BZEV1.vrt
-F
./Comparison-PTV-BZE-V1-201406-ForPres-30minWindows-3eachSide/MelbTrainTramBus-BZE-autostops-withcongestion-WithMotorways-updated-20140514/SURREY_HILLS-22_10_00-BZEV1.vrt
-G
./Comparison-PTV-BZE-V1-201406-ForPres-30minWindows-3eachSide/MelbTrainTramBus-BZE-autostops-withcongestion-WithMotorways-updated-20140514/SURREY_HILLS-22_15_00-BZEV1.vrt
--outfile=./Comparison-PTV-BZE-V1-201406-ForPres-30minWindows-3eachSide/MelbTrainTramBus-BZE-autostops-withcongestion-WithMotorways-updated-20140514/SURREY_HILLS-22_00_00-BZEV1-avg7.tiff
--calc="(A+B+C+D+E+F+G)/7" --type=Byte --overwrite

And full testing script on output:
#!/usr/bin/env python
from osgeo import ogr
from osgeo import gdal

in_file="./Comparison-PTV-BZE-V1-201406-ForPres-30minWindows-3eachSide/MelbTrainTramBus-BZE-autostops-withcongestion-WithMotorways-updated-20140514/CHADSTONE-22_00_00-BZEV1-avg7.tiff"

ds_in = gdal.Open(in_file)
band_in = ds_in.GetRasterBand(1)
xsize_in = band_in.XSize
ysize_in = band_in.YSize
raster_values = band_in.ReadAsArray(0, 0, xsize_in, ysize_in)

print "GetStatistics(): " + str(band_in.GetStatistics(True, True))
print "ComputeRasterMinMax(): " + str(band_in.ComputeRasterMinMax())
print "Hand calc. min: " + str(min(map(min, raster_values)))

I can provide some sample input if needed for testing too.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.osgeo.org/pipermail/gdal-dev/attachments/20140619/668e776d/attachment-0001.html>


More information about the gdal-dev mailing list