[gdal-dev] gdalinfo -mm also report n (number of grid cells that are not nodata)

Markus Metz markus.metz.giswork at gmail.com
Sat Jun 16 11:52:39 PDT 2018


On Fri, Jun 15, 2018 at 10:43 PM, Even Rouault <even.rouault at spatialys.com>
wrote:
>
> > Thinking about it, I do not want to support approximate statistics,
> > therefore something like STATISTICS_VALID_RATIO does not work for me,
> > only something like STATISTICS_N_VALID which requires exact statistics.
>
> STATISTICS_VALID_RATIO makes more sense to me than absolute number of
> pixels.

OK, considering that approximate statistics need to be supported, something
like STATISTICS_VALID_RATIO is the only option.

Setting such a metadata item would be relatively easy to implement.
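
To illustrate how a valid ratio relates to the count n, here is a minimal
pure-Python sketch. It does not use GDAL; the function name valid_ratio and
the sample values are made up for illustration, and STATISTICS_VALID_RATIO
is only the item name proposed in this thread.

```python
def valid_ratio(values, nodata):
    """Return (n_valid, ratio) for a flat sequence of cell values.

    Assumes nodata is an ordinary number; a NaN nodata value would need
    a separate math.isnan() test because NaN never compares equal.
    """
    total = len(values)
    n_valid = sum(1 for v in values if v != nodata)
    ratio = n_valid / total if total else 0.0
    return n_valid, ratio


# Hypothetical band with 8 cells, 2 of them nodata.
cells = [1.0, 2.0, -9999.0, 4.0, -9999.0, 6.0, 7.0, 8.0]
n, ratio = valid_ratio(cells, nodata=-9999.0)

# How the proposed item could be stored as a string metadata value.
metadata_item = {"STATISTICS_VALID_RATIO": f"{ratio:.6f}"}

# With exact statistics, n can be recovered from the ratio and the cell count.
n_recovered = round(ratio * len(cells))
```

With approximate statistics the ratio computed on a subsample is still an
estimate for the whole band, whereas a count taken on the subsample is not.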

>
> > Approximate statistics are confusing for users, unless it is made clear
> > that these statistics are approximations.
>
> It is known, since STATISTICS_APPROXIMATE=YES is now set if you compute
> approximate statistics.
>
> > Looking at random samples, the normal assumption must be
> > STATISTICS_APPROXIMATE=YES if STATISTICS_APPROXIMATE is not set. IMHO,
> > GDAL should set STATISTICS_APPROXIMATE=YES unless GDAL itself has
> > computed exact statistics.
>
> That's what GDAL 2.3.0 now does. Check the output of gdalinfo -stats vs
> gdalinfo -approx_stats.

I checked: the results of gdalinfo -stats are wrong, because existing
STATISTICS_* metadata are reported even when approximate statistics are not
allowed. The problem is that STATISTICS_APPROXIMATE is not set. Other
software using GDAL to create raster datasets may call
GDALRasterBand::SetStatistics(), which does not indicate whether the stats
are approximations, i.e. the stats can be approximations even though there
is no STATISTICS_APPROXIMATE=YES.

GDAL assumes that STATISTICS_* metadata represent stats on all pixels; this
is IMHO wrong. You can only hope that STATISTICS_* metadata represent stats
on all pixels if a corresponding metadata item has been set to boolean true,
something like STATISTICS_ALL_PIXELS=YES. Even in this case, an option to
force recomputing raster band stats (to verify the metadata) would be very
nice to have.
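
The conservative reading argued for here can be sketched as a small decision
function over a band's metadata dictionary (with the Python bindings such a
dict would typically come from band.GetMetadata()). This is only a sketch:
classify_band_stats and its return labels are invented for illustration, and
STATISTICS_ALL_PIXELS is the hypothetical item suggested above, not an
existing GDAL metadata key.

```python
CORE_KEYS = (
    "STATISTICS_MINIMUM",
    "STATISTICS_MAXIMUM",
    "STATISTICS_MEAN",
    "STATISTICS_STDDEV",
)


def classify_band_stats(metadata):
    """Classify stored band statistics without trusting them by default."""
    if not all(key in metadata for key in CORE_KEYS):
        return "missing"
    if metadata.get("STATISTICS_APPROXIMATE", "").upper() == "YES":
        return "approximate"
    # Hypothetical item proposed in this thread; not an existing GDAL key.
    if metadata.get("STATISTICS_ALL_PIXELS", "").upper() == "YES":
        return "all_pixels"
    # No explicit claim either way: do not assume the stats cover all pixels.
    return "unverified"


# Stats written by other software via SetStatistics(): no flag at all.
from_other_software = {
    "STATISTICS_MINIMUM": "0",
    "STATISTICS_MAXIMUM": "255",
    "STATISTICS_MEAN": "101.3",
    "STATISTICS_STDDEV": "42.7",
}
status = classify_band_stats(from_other_software)
```

A caller that needs exact statistics would recompute them for anything other
than "all_pixels", and possibly even then if verification is wanted.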

STATISTICS_EXACT is not an option because there are different ways to
calculate mean and stddev using a fixed set of values. The different
methods are all correct (exact) in their own way, but results may be
different.
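
As one possible reading of "different ways", the sketch below computes the
standard deviation of the same values three ways: population (divide by n),
sample (divide by n - 1), and the one-pass shortcut sqrt(E[x^2] - E[x]^2).
The function names and the sample values are made up for illustration; this
is not a description of how GDAL computes its statistics.

```python
import math


def two_pass_stddev(values, ddof=0):
    """Stddev from squared deviations about the mean (ddof=1 for sample)."""
    n = len(values)
    mean = sum(values) / n
    sum_sq_dev = sum((v - mean) ** 2 for v in values)
    return math.sqrt(sum_sq_dev / (n - ddof))


def one_pass_stddev(values):
    """Population stddev via E[x^2] - E[x]^2; prone to cancellation error."""
    n = len(values)
    mean = sum(values) / n
    mean_of_squares = sum(v * v for v in values) / n
    # Clamp at zero: rounding can make the difference slightly negative.
    return math.sqrt(max(mean_of_squares - mean * mean, 0.0))


values = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
population = two_pass_stddev(values)       # divides by n
sample = two_pass_stddev(values, ddof=1)   # divides by n - 1

# Same spread, large offset: the one-pass shortcut loses precision here,
# while the two-pass result is unaffected.
shifted = [v + 1e9 for v in values]
shifted_two_pass = two_pass_stddev(shifted)
shifted_one_pass = one_pass_stddev(shifted)
```

Population and sample stddev differ by definition, and the one-pass and
two-pass results can differ through floating-point rounding, so a single
"exact" flag would not say which of these a stored value corresponds to.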

Markus