[gdal-dev] gdal_calc.py not saving correct statistics (min/max) to new rasters in some cases?

Patrick Sunter patdevelop at gmail.com
Thu Jun 26 21:25:48 PDT 2014


No comments on the below anyone?

It does seem like a genuine bug that gdal_calc.py can produce output with
statistics not up to date?

Should I create an issue in the issue tracker?

-- Patrick.


On Thu, Jun 19, 2014 at 2:45 PM, Patrick Sunter <patdevelop at gmail.com>
wrote:

> Hi there,
>
> I've been using gdal_calc.py as part of a workflow where I need to compute
> the average of several raster files and save as a new output. (Using a
> binary build of GDAL, v 1.10.0, on Mac Os X).
>
> It is a fairly standard calc operation (e.g. "(A+B+C)/3") but because the
> original data are GeoTiff's storing data in a Byte format, I first create
> VRT files referencing these with data type set to be Float32, then perform
> the calc operation with these VRT files as input. This is to prevent
> 'overflow' problems in the midst of the averaging operation given the size
> limit of the Byte format (see
> http://gis.stackexchange.com/questions/33152/why-do-i-get-different-results-with-gdal-calc-within-the-osgeo4w-shell-and-qgis
> )
>
> The output GeoTiff created by gdal_calc.py seems to be have correct values
> at every point - but the saved statistics metadata are wrong -
> specifically, the minimum value.
>
> An example is doing a gdalinfo -stats on one of these files, I get the
> following:
>   Min=21.000 Max=128.000
>   Minimum=21.000, Maximum=128.000, Mean=81.692, StdDev=29.397
>
> but I know the minimum value should be 0, and this is confirmed by
> inspection in QGIS. I also wrote a script to open up the raster in Python
> and run the following queries:
>
> print "GetStatistics(): " + str(band_in.GetStatistics(True, True))
> print "ComputeRasterMinMax(): " + str(band_in.ComputeRasterMinMax())
> print "Hand calc. min: " + str(min(map(min, raster_values)))
>
> I get:
> GetStatistics(): [21.0, 128.0, 81.692344139650999, 29.397484687917]
> ComputeRasterMinMax(): (0.0, 120.0)
> Hand calc. min: 0
>
> So you can see that ComputeRasterMinMax() is returning the right values,
> as well as my manual calc using Python's min function.
>
> As a user I presume its reasonable for the output of a   gdal_calc.py to
> have latest stats stored correctly on any output it produces? So what is
> going wrong?
>
> Maybe its something to do with doing a calc on a set of VRT files using a
> different type to the original. This is also a case where the output file
> already exists.
>
> Looking at the gdal_calc.py I guess a simple fix is to force an update of
> statistic values using ComputeRasterMinMax() before closing the newly
> created file, but maybe there is a better solution.
>
> cheers, Patrick.
>
> BTW example gdal_calc.py command:
> gdal_calc.py -A
> ./Comparison-PTV-BZE-V1-201406-ForPres-30minWindows-3eachSide/MelbTrainTramBus-BZE-autostops-withcongestion-WithMotorways-updated-20140514/SURREY_HILLS-21_45_00-BZEV1.vrt
> -B
> ./Comparison-PTV-BZE-V1-201406-ForPres-30minWindows-3eachSide/MelbTrainTramBus-BZE-autostops-withcongestion-WithMotorways-updated-20140514/SURREY_HILLS-21_50_00-BZEV1.vrt
> -C
> ./Comparison-PTV-BZE-V1-201406-ForPres-30minWindows-3eachSide/MelbTrainTramBus-BZE-autostops-withcongestion-WithMotorways-updated-20140514/SURREY_HILLS-21_55_00-BZEV1.vrt
> -D
> ./Comparison-PTV-BZE-V1-201406-ForPres-30minWindows-3eachSide/MelbTrainTramBus-BZE-autostops-withcongestion-WithMotorways-updated-20140514/SURREY_HILLS-22_00_00-BZEV1.vrt
> -E
> ./Comparison-PTV-BZE-V1-201406-ForPres-30minWindows-3eachSide/MelbTrainTramBus-BZE-autostops-withcongestion-WithMotorways-updated-20140514/SURREY_HILLS-22_05_00-BZEV1.vrt
> -F
> ./Comparison-PTV-BZE-V1-201406-ForPres-30minWindows-3eachSide/MelbTrainTramBus-BZE-autostops-withcongestion-WithMotorways-updated-20140514/SURREY_HILLS-22_10_00-BZEV1.vrt
> -G
> ./Comparison-PTV-BZE-V1-201406-ForPres-30minWindows-3eachSide/MelbTrainTramBus-BZE-autostops-withcongestion-WithMotorways-updated-20140514/SURREY_HILLS-22_15_00-BZEV1.vrt
> --outfile=./Comparison-PTV-BZE-V1-201406-ForPres-30minWindows-3eachSide/MelbTrainTramBus-BZE-autostops-withcongestion-WithMotorways-updated-20140514/SURREY_HILLS-22_00_00-BZEV1-avg7.tiff
> --calc="(A+B+C+D+E+F+G)/7" --type=Byte --overwrite
>
> And full testing script on output:
> #!/usr/bin/env python
> from osgeo import ogr
> from osgeo import gdal
>
>
> in_file="./Comparison-PTV-BZE-V1-201406-ForPres-30minWindows-3eachSide/MelbTrainTramBus-BZE-autostops-withcongestion-WithMotorways-updated-20140514/CHADSTONE-22_00_00-BZEV1-avg7.tiff"
>
> ds_in = gdal.Open(in_file)
> band_in = ds_in.GetRasterBand(1)
> xsize_in = band_in.XSize
> ysize_in = band_in.YSize
> raster_values = band_in.ReadAsArray(0, 0, xsize_in, ysize_in)
>
> print "GetStatistics(): " + str(band_in.GetStatistics(True, True))
> print "ComputeRasterMinMax(): " + str(band_in.ComputeRasterMinMax())
> print "Hand calc. min: " + str(min(map(min, raster_values)))
>
> I can provide some sample input if needed for testing too.
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.osgeo.org/pipermail/gdal-dev/attachments/20140627/509e725e/attachment-0001.html>


More information about the gdal-dev mailing list