[QGIS-Developer] Probable serious problem with rasters statistics calculated by QGIS with estimated option

Pedro Venâncio pedrongvenancio at gmail.com
Mon Jul 30 07:38:23 PDT 2018


Hi all,

I think the problem described in these tickets can be more serious than
just a visualization question:

https://issues.qgis.org/issues/11974
https://issues.qgis.org/issues/14853
https://issues.qgis.org/issues/14835


Please try the following workflow.

1) Download this sample raster:
https://cld.pt/dl/download/de81b6ce-38a8-4539-be16-24a3e3ef72d1/perigosidade_int.tif

2) Running gdalinfo, the output is:

gdalinfo perigosidade_int.tif

Driver: GTiff/GeoTIFF
Files: perigosidade_int.tif
Size is 1039, 1701
Coordinate System is:
PROJCS["ETRS89 / Portugal TM06",
(...)
Origin = (73626.458811498901923,145193.972505581419682)
Pixel Size = (25.002694985400002,-25.002694985400002)
Metadata:
  AREA_OR_POINT=Area
Image Structure Metadata:
  INTERLEAVE=BAND
Corner Coordinates:
Upper Left  (   73626.459,  145193.973) (  7d15'30.17"W, 40d58'21.05"N)
Lower Left  (   73626.459,  102664.388) (  7d15'48.20"W, 40d35'22.48"N)
Upper Right (   99604.259,  145193.973) (  6d56'59.28"W, 40d58'11.13"N)
Lower Right (   99604.259,  102664.388) (  6d57'23.68"W, 40d35'12.70"N)
Center      (   86615.359,  123929.180) (  7d 6'25.36"W, 40d46'47.22"N)
Band 1 Block=1039x3 Type=Int16, ColorInterp=Gray
  NoData Value=32767


3) Running gdalinfo with stats option, the output is:

gdalinfo -stats perigosidade_int.tif

Driver: GTiff/GeoTIFF
Files: perigosidade_int.tif
Size is 1039, 1701
Coordinate System is:
PROJCS["ETRS89 / Portugal TM06",
(...)
Origin = (73626.458811498901923,145193.972505581419682)
Pixel Size = (25.002694985400002,-25.002694985400002)
Metadata:
  AREA_OR_POINT=Area
Image Structure Metadata:
  INTERLEAVE=BAND
Corner Coordinates:
Upper Left  (   73626.459,  145193.973) (  7d15'30.17"W, 40d58'21.05"N)
Lower Left  (   73626.459,  102664.388) (  7d15'48.20"W, 40d35'22.48"N)
Upper Right (   99604.259,  145193.973) (  6d56'59.28"W, 40d58'11.13"N)
Lower Right (   99604.259,  102664.388) (  6d57'23.68"W, 40d35'12.70"N)
Center      (   86615.359,  123929.180) (  7d 6'25.36"W, 40d46'47.22"N)
Band 1 Block=1039x3 Type=Int16, ColorInterp=Gray
  Minimum=2.000, Maximum=835.000, Mean=37.025, StdDev=71.526
  NoData Value=32767
  Metadata:
    STATISTICS_MAXIMUM=835
    STATISTICS_MEAN=37.024768992184
    STATISTICS_MINIMUM=2
    STATISTICS_STDDEV=71.52569727799


4) gdalinfo creates a XML file with statistics info in the file folder:

<PAMDataset>
  <PAMRasterBand band="1">
    <Metadata>
      <MDI key="STATISTICS_MAXIMUM">835</MDI>
      <MDI key="STATISTICS_MEAN">37.024768992184</MDI>
      <MDI key="STATISTICS_MINIMUM">2</MDI>
      <MDI key="STATISTICS_STDDEV">71.52569727799</MDI>
    </Metadata>
  </PAMRasterBand>
</PAMDataset>


5) Then, delete that XML file.

6) Load the perigosidade_int.tif file in QGIS.

7) Don't do anything, just remove the raster from QGIS TOC.

8) If you go to the folder where the raster is stored, QGIS created a new
XML file with:

<PAMDataset>
  <PAMRasterBand band="1">
    <Histograms>
      <HistItem>
        <HistMin>1.500685871056241</HistMin>
        <HistMax>730.4993141289438</HistMax>
        <BucketCount>729</BucketCount>
        <IncludeOutOfRange>0</IncludeOutOfRange>
        <Approximate>1</Approximate>
        <HistCounts>(...)</HistCounts>
      </HistItem>
    </Histograms>
    <Metadata>
      <MDI key="STATISTICS_MAXIMUM">730</MDI>
      <MDI key="STATISTICS_MEAN">34.375557670573</MDI>
      <MDI key="STATISTICS_MINIMUM">2</MDI>
      <MDI key="STATISTICS_STDDEV">67.853826109685</MDI>
    </Metadata>
  </PAMRasterBand>
</PAMDataset>


9) So, QGIS marks the STATISTICS_MAXIMUM as 730.

10) Even defining in Settings -> Options -> Rendering -> Rasters -> Limits
(minimum/maximum): Minimum/Maximum

QGIS always uses 730, because the Accuracy is, by default, "Estimate
(faster)", and this option is not configurable at Options, as said in these
tickets.


But what is really serious here, is that as soon as the XML file is
created, Processing tools uses these statistics for the calculations.

11) For instance, if run r.quantile with "generate recode rules" flag, the
output is:

2.000000:6.000000:1
6.000000:8.000000:2
8.000000:12.000000:3
12.000000:20.000000:4
20.000000:730.000000:5

12) Classifying the raster based on these rules with r.recode leaves pixels
with value 835 as NULL.

And every subsequent process gives errors.


The question is.. how I never had realized this problem? The fact is that
QGIS only generates the XML file when the raster layer is removed from QGIS
project. And so, if Processing modules are run using the layers loaded in
TOC, there are no problems.

If these modules are runned from the python console (processing.runalg) or
using raster layers that are not loaded in QGIS, the problem show up.

I'm with QGIS 2.18.22 (OSGeo4W64 bits).

Thank you very much!

Best regards,
Pedro Venâncio
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.osgeo.org/pipermail/qgis-developer/attachments/20180730/e67c8c77/attachment-0001.html>


More information about the QGIS-Developer mailing list