[gdal-dev] gdal_calc.py not saving correct statistics (min/max) to new rasters in some cases?

Chaitanya kumar CH chaitanya.ch at gmail.com
Fri Jun 27 01:12:20 PDT 2014


Patrick,
Can you send me the sample data?

--
Best regards,
Chaitanya Kumar CH
On 27-Jun-2014 9:56 am, "Patrick Sunter" <patdevelop at gmail.com> wrote:

> No comments on the below anyone?
>
> It does seem like a genuine bug that gdal_calc.py can produce output with
> statistics not up to date?
>
> Should I create an issue in the issue tracker?
>
> -- Patrick.
>
>
> On Thu, Jun 19, 2014 at 2:45 PM, Patrick Sunter <patdevelop at gmail.com>
> wrote:
>
>> Hi there,
>>
>> I've been using gdal_calc.py as part of a workflow where I need to
>> compute the average of several raster files and save as a new output.
>> (Using a binary build of GDAL, v 1.10.0, on Mac Os X).
>>
>> It is a fairly standard calc operation (e.g. "(A+B+C)/3") but because the
>> original data are GeoTiff's storing data in a Byte format, I first create
>> VRT files referencing these with data type set to be Float32, then perform
>> the calc operation with these VRT files as input. This is to prevent
>> 'overflow' problems in the midst of the averaging operation given the size
>> limit of the Byte format (see
>> http://gis.stackexchange.com/questions/33152/why-do-i-get-different-results-with-gdal-calc-within-the-osgeo4w-shell-and-qgis
>> )
>>
>> The output GeoTiff created by gdal_calc.py seems to be have correct
>> values at every point - but the saved statistics metadata are wrong -
>> specifically, the minimum value.
>>
>> An example is doing a gdalinfo -stats on one of these files, I get the
>> following:
>>   Min=21.000 Max=128.000
>>   Minimum=21.000, Maximum=128.000, Mean=81.692, StdDev=29.397
>>
>> but I know the minimum value should be 0, and this is confirmed by
>> inspection in QGIS. I also wrote a script to open up the raster in Python
>> and run the following queries:
>>
>> print "GetStatistics(): " + str(band_in.GetStatistics(True, True))
>> print "ComputeRasterMinMax(): " + str(band_in.ComputeRasterMinMax())
>> print "Hand calc. min: " + str(min(map(min, raster_values)))
>>
>> I get:
>> GetStatistics(): [21.0, 128.0, 81.692344139650999, 29.397484687917]
>> ComputeRasterMinMax(): (0.0, 120.0)
>> Hand calc. min: 0
>>
>> So you can see that ComputeRasterMinMax() is returning the right values,
>> as well as my manual calc using Python's min function.
>>
>> As a user I presume its reasonable for the output of a   gdal_calc.py to
>> have latest stats stored correctly on any output it produces? So what is
>> going wrong?
>>
>> Maybe its something to do with doing a calc on a set of VRT files using a
>> different type to the original. This is also a case where the output file
>> already exists.
>>
>> Looking at the gdal_calc.py I guess a simple fix is to force an update of
>> statistic values using ComputeRasterMinMax() before closing the newly
>> created file, but maybe there is a better solution.
>>
>> cheers, Patrick.
>>
>> BTW example gdal_calc.py command:
>> gdal_calc.py -A
>> ./Comparison-PTV-BZE-V1-201406-ForPres-30minWindows-3eachSide/MelbTrainTramBus-BZE-autostops-withcongestion-WithMotorways-updated-20140514/SURREY_HILLS-21_45_00-BZEV1.vrt
>> -B
>> ./Comparison-PTV-BZE-V1-201406-ForPres-30minWindows-3eachSide/MelbTrainTramBus-BZE-autostops-withcongestion-WithMotorways-updated-20140514/SURREY_HILLS-21_50_00-BZEV1.vrt
>> -C
>> ./Comparison-PTV-BZE-V1-201406-ForPres-30minWindows-3eachSide/MelbTrainTramBus-BZE-autostops-withcongestion-WithMotorways-updated-20140514/SURREY_HILLS-21_55_00-BZEV1.vrt
>> -D
>> ./Comparison-PTV-BZE-V1-201406-ForPres-30minWindows-3eachSide/MelbTrainTramBus-BZE-autostops-withcongestion-WithMotorways-updated-20140514/SURREY_HILLS-22_00_00-BZEV1.vrt
>> -E
>> ./Comparison-PTV-BZE-V1-201406-ForPres-30minWindows-3eachSide/MelbTrainTramBus-BZE-autostops-withcongestion-WithMotorways-updated-20140514/SURREY_HILLS-22_05_00-BZEV1.vrt
>> -F
>> ./Comparison-PTV-BZE-V1-201406-ForPres-30minWindows-3eachSide/MelbTrainTramBus-BZE-autostops-withcongestion-WithMotorways-updated-20140514/SURREY_HILLS-22_10_00-BZEV1.vrt
>> -G
>> ./Comparison-PTV-BZE-V1-201406-ForPres-30minWindows-3eachSide/MelbTrainTramBus-BZE-autostops-withcongestion-WithMotorways-updated-20140514/SURREY_HILLS-22_15_00-BZEV1.vrt
>> --outfile=./Comparison-PTV-BZE-V1-201406-ForPres-30minWindows-3eachSide/MelbTrainTramBus-BZE-autostops-withcongestion-WithMotorways-updated-20140514/SURREY_HILLS-22_00_00-BZEV1-avg7.tiff
>> --calc="(A+B+C+D+E+F+G)/7" --type=Byte --overwrite
>>
>> And full testing script on output:
>> #!/usr/bin/env python
>> from osgeo import ogr
>> from osgeo import gdal
>>
>>
>> in_file="./Comparison-PTV-BZE-V1-201406-ForPres-30minWindows-3eachSide/MelbTrainTramBus-BZE-autostops-withcongestion-WithMotorways-updated-20140514/CHADSTONE-22_00_00-BZEV1-avg7.tiff"
>>
>> ds_in = gdal.Open(in_file)
>> band_in = ds_in.GetRasterBand(1)
>> xsize_in = band_in.XSize
>> ysize_in = band_in.YSize
>> raster_values = band_in.ReadAsArray(0, 0, xsize_in, ysize_in)
>>
>> print "GetStatistics(): " + str(band_in.GetStatistics(True, True))
>> print "ComputeRasterMinMax(): " + str(band_in.ComputeRasterMinMax())
>> print "Hand calc. min: " + str(min(map(min, raster_values)))
>>
>> I can provide some sample input if needed for testing too.
>>
>>
>
> _______________________________________________
> gdal-dev mailing list
> gdal-dev at lists.osgeo.org
> http://lists.osgeo.org/mailman/listinfo/gdal-dev
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.osgeo.org/pipermail/gdal-dev/attachments/20140627/ab70854f/attachment-0001.html>


More information about the gdal-dev mailing list