[gdal-dev] gdalinfo -mm also report n (number of grid cells that are not nodata)

Even Rouault even.rouault at spatialys.com
Fri Jun 15 13:43:47 PDT 2018


> Thinking about it, I do not want to support approximate statistics,
> therefore something like STATISTICS_VALID_RATIO does not work for me, only
> something like STATISTICS_N_VALID which requires exact statistics.

STATISTICS_VALID_RATIO makes more sense to me than an absolute number of 
pixels. I assume you want to know whether you have only 10% or 99.5% valid 
pixels to decide if you want to process the image, rather than knowing 
whether it is 10 or 1 million (similarly to a cloudiness value, which is 
usually given as a percentage). For exact statistics, the relative and 
absolute numbers are strictly equivalent.
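
To illustrate the equivalence for exact statistics, here is a minimal Python 
sketch. The function names are hypothetical, and it assumes the ratio is a 
fraction between 0 and 1 computed over every pixel of the band:

```python
def valid_count_from_ratio(valid_ratio, width, height):
    """Recover the absolute number of valid pixels from an exact ratio.

    Assumes valid_ratio is a fraction in [0, 1], not a percentage.
    """
    # round() absorbs floating-point error in the multiplication
    return round(valid_ratio * width * height)


def valid_ratio_from_count(n_valid, width, height):
    """Convert an absolute count of valid pixels into a ratio."""
    return n_valid / (width * height)
```

If the ratio were stored as text with limited precision, the round trip 
could lose accuracy on very large rasters, so this only holds as long as 
enough digits are kept.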

The advantage of using the ratio is that it still makes sense for approximate 
statistics.

For your use case, you can check whether STATISTICS_APPROXIMATE=YES is 
present to decide if you can trust STATISTICS_VALID_RATIO.
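
As a rough sketch of that check in Python: the function name is hypothetical, 
STATISTICS_VALID_RATIO is only the item name proposed in this thread, and the 
input is assumed to be a plain dict of band metadata, such as the one 
band.GetMetadata() returns in the GDAL Python bindings:

```python
def trusted_valid_ratio(metadata):
    """Return the valid ratio as a float if it is exact, else None.

    metadata: dict of band metadata items (string keys and values).
    STATISTICS_VALID_RATIO is the name proposed in this thread and is
    an assumption here, not a confirmed GDAL metadata item.
    """
    # Approximate statistics are flagged explicitly with YES
    if metadata.get("STATISTICS_APPROXIMATE", "").upper() == "YES":
        return None
    raw = metadata.get("STATISTICS_VALID_RATIO")
    if raw is None:
        # No ratio was stored at all
        return None
    return float(raw)
```

A caller that only accepts exact statistics would skip the image, or 
recompute the statistics, whenever this returns None.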

> Approximate statistics are confusing for users, unless it is made clear
> that these statistics are approximations.

It is known, since STATISTICS_APPROXIMATE=YES is now set if you compute 
approximate statistics.

> Looking at random samples, the normal assumption must be
> STATISTICS_APPROXIMATE=YES if STATISTICS_APPROXIMATE is not set. IMHO, GDAL
> should set STATISTICS_APPROXIMATE=YES unless GDAL itself has computed exact
> statistics.

That's what GDAL 2.3.0 now does. Check the output of gdalinfo -stats vs 
gdalinfo -approx_stats.

Even


-- 
Spatialys - Geospatial professional services
http://www.spatialys.com

