<div dir="ltr"><div>No comments on the below anyone?<br><br>It does seem like a genuine bug that gdal_calc.py can produce output with statistics not up to date? <br><br>Should I create an issue in the issue tracker?<br><br>
</div>-- Patrick.<br></div><div class="gmail_extra"><br><br><div class="gmail_quote">On Thu, Jun 19, 2014 at 2:45 PM, Patrick Sunter <span dir="ltr"><<a href="mailto:patdevelop@gmail.com" target="_blank">patdevelop@gmail.com</a>></span> wrote:<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div dir="ltr">Hi there,<br><div class="gmail_quote"><div dir="ltr"><div><div class=""><div><div><br></div>I've been using gdal_calc.py as part of a workflow where I need to compute the average of several raster files and save as a new output. (Using a binary build of GDAL, v 1.10.0, on Mac Os X).<br>
<br></div></div><div>It is a fairly standard calc operation (e.g. "(A+B+C)/3") but because the original data are GeoTiff's storing data in a Byte format, I first create VRT files referencing these with data type set to be Float32, then perform the calc operation with these VRT files as input. This is to prevent 'overflow' problems in the midst of the averaging operation given the size limit of the Byte format (see <a href="http://gis.stackexchange.com/questions/33152/why-do-i-get-different-results-with-gdal-calc-within-the-osgeo4w-shell-and-qgis" target="_blank">http://gis.stackexchange.com/questions/33152/why-do-i-get-different-results-with-gdal-calc-within-the-osgeo4w-shell-and-qgis</a>)<br>
</div><div><br></div>The output GeoTiff created by gdal_calc.py seems to be have correct values at every point - but the saved statistics metadata are wrong - specifically, the minimum value.<br><br></div><div class="">An example is doing a gdalinfo -stats on one of these files, I get the following:<br>
Min=21.000 Max=128.000 <br>
Minimum=21.000, Maximum=128.000, Mean=81.692, StdDev=29.397<br><div><br></div></div><div>but I know the minimum value should be 0, and this is confirmed by inspection in QGIS. I also wrote a script to open up the raster in Python and run the following queries:<div class="">
<br>
print "GetStatistics(): " + str(band_in.GetStatistics(True, True))<br>print "ComputeRasterMinMax(): " + str(band_in.ComputeRasterMinMax())<br>print "Hand calc. min: " + str(min(map(min, raster_values)))<br>
<br></div></div><div class=""><div>I get:<br>GetStatistics(): [21.0, 128.0, 81.692344139650999, 29.397484687917]<br>ComputeRasterMinMax(): (0.0, 120.0)<br>Hand calc. min: 0<br><br></div></div><div>So you can see that ComputeRasterMinMax() is returning the right values, as well as my manual calc using Python's min function.<br>
</div><div><br></div><div>As a user I presume its reasonable for the output of a gdal_calc.py to have latest stats stored correctly on any output it produces? So what is going wrong?<br>
<br></div><div class=""><div>Maybe its something to do with doing a calc on a set of VRT files using a different type to the original. This is also a case where the output file already exists.<br><br></div></div><div>Looking at the gdal_calc.py I guess a simple fix is to force an update of statistic values using ComputeRasterMinMax() before closing the newly created file, but maybe there is a better solution.<br>
</div><div class=""><div><br></div><div>cheers, Patrick.<br>
<br></div><div>BTW example gdal_calc.py command:<br>gdal_calc.py -A ./Comparison-PTV-BZE-V1-201406-ForPres-30minWindows-3eachSide/MelbTrainTramBus-BZE-autostops-withcongestion-WithMotorways-updated-20140514/SURREY_HILLS-21_45_00-BZEV1.vrt -B ./Comparison-PTV-BZE-V1-201406-ForPres-30minWindows-3eachSide/MelbTrainTramBus-BZE-autostops-withcongestion-WithMotorways-updated-20140514/SURREY_HILLS-21_50_00-BZEV1.vrt -C ./Comparison-PTV-BZE-V1-201406-ForPres-30minWindows-3eachSide/MelbTrainTramBus-BZE-autostops-withcongestion-WithMotorways-updated-20140514/SURREY_HILLS-21_55_00-BZEV1.vrt -D ./Comparison-PTV-BZE-V1-201406-ForPres-30minWindows-3eachSide/MelbTrainTramBus-BZE-autostops-withcongestion-WithMotorways-updated-20140514/SURREY_HILLS-22_00_00-BZEV1.vrt -E ./Comparison-PTV-BZE-V1-201406-ForPres-30minWindows-3eachSide/MelbTrainTramBus-BZE-autostops-withcongestion-WithMotorways-updated-20140514/SURREY_HILLS-22_05_00-BZEV1.vrt -F ./Comparison-PTV-BZE-V1-201406-ForPres-30minWindows-3eachSide/MelbTrainTramBus-BZE-autostops-withcongestion-WithMotorways-updated-20140514/SURREY_HILLS-22_10_00-BZEV1.vrt -G ./Comparison-PTV-BZE-V1-201406-ForPres-30minWindows-3eachSide/MelbTrainTramBus-BZE-autostops-withcongestion-WithMotorways-updated-20140514/SURREY_HILLS-22_15_00-BZEV1.vrt --outfile=./Comparison-PTV-BZE-V1-201406-ForPres-30minWindows-3eachSide/MelbTrainTramBus-BZE-autostops-withcongestion-WithMotorways-updated-20140514/SURREY_HILLS-22_00_00-BZEV1-avg7.tiff --calc="(A+B+C+D+E+F+G)/7" --type=Byte --overwrite<br>
<br></div><div>And full testing script on output:<br>#!/usr/bin/env python<br>from osgeo import ogr<br>from osgeo import gdal<br><br>in_file="./Comparison-PTV-BZE-V1-201406-ForPres-30minWindows-3eachSide/MelbTrainTramBus-BZE-autostops-withcongestion-WithMotorways-updated-20140514/CHADSTONE-22_00_00-BZEV1-avg7.tiff"<br>
<br>ds_in = gdal.Open(in_file)<br>band_in = ds_in.GetRasterBand(1)<br>xsize_in = band_in.XSize<br>ysize_in = band_in.YSize<br>raster_values = band_in.ReadAsArray(0, 0, xsize_in, ysize_in)<br><br>print "GetStatistics(): " + str(band_in.GetStatistics(True, True))<br>
print "ComputeRasterMinMax(): " + str(band_in.ComputeRasterMinMax())<br>print "Hand calc. min: " + str(min(map(min, raster_values)))<br><br></div><div>I can provide some sample input if needed for testing too.<br>
</div></div></div>
</div><br></div>
</blockquote></div><br></div>