[GRASS-dev] r.statistics limitation to CELL

Sat Jun 2 08:05:56 EDT 2007

On Sat, Jun 02, 2007 at 01:00:12PM +0100, Glynn Clements wrote:
> 
> Markus Neteler wrote:
> 
> > would it be much work to fix this:
> > 
> > GRASS 6.3.cvs (nc_spm_05):~ > r.statistics base=landuse96_28m \
> >                cover=elevation out=elevstats_avg method=average
> > ERROR: This module currently only works for integer (CELL) maps
> > 
> > Rounding elevation to CELL first isn't a great option.
> 
> 1. r.statistics works by reclassing the base map, so the base map
> can't be FP.

Here I meant the cover map "elevation", which is the one being rejected:
landuse96_28m is a CELL map, elevation is FCELL.

> 2. r.statistics uses r.stats to calculate the statistics, and r.stats
> reads its inputs as CELL.

In that case, couldn't r.statistics also accept an FCELL map
without complaining? Currently I need extra steps to round
elevation to a CELL map before running r.statistics.

> r.stats is inherently based upon discrete categories. Even if it reads
> FP maps as FP, you would need to quantise the values, otherwise you
> would end up with every cell as its own category with count == 1. This
> would require memory proportional to the size of the input map
> multiplied by a significant factor (48-64 bytes per cell, or even
> more).
> 
> To handle FP data, you really need a completely new approach which
> computes aggregates incrementally, using an accumulator. This would
> limit it to aggregates which can be computed that way, e.g. count,
> sum, mean, variance and standard deviation.

I vaguely remember that Soeren already has something in the works.
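
Just to illustrate what such an accumulator approach could look like,
here is a minimal Python sketch. It is not GRASS code: the names
RunningStats and zonal_stats are made up for this example, the two maps
are assumed to arrive as flat sequences of cell values, and None stands
in for NULL cells. For the variance it uses Welford's incremental
update, which is my own choice of a one-pass method that avoids the
subtraction of two large sums; it is not something taken from
r.statistics or r.univar.

```python
from collections import defaultdict
import math


class RunningStats:
    """Incremental count, sum, mean and variance for one base category."""

    def __init__(self):
        self.n = 0
        self.total = 0.0
        self.mean = 0.0
        self.m2 = 0.0  # running sum of squared deviations from the mean

    def add(self, x):
        # Welford's update: adjust mean and m2 one value at a time.
        self.n += 1
        self.total += x
        delta = x - self.mean
        self.mean += delta / self.n
        self.m2 += delta * (x - self.mean)

    def variance(self):
        # Population variance; NaN if no cells were seen.
        return self.m2 / self.n if self.n else float("nan")

    def stddev(self):
        return math.sqrt(self.variance())


def zonal_stats(base_cells, cover_cells):
    """Accumulate FP cover values per integer base category in one pass."""
    stats = defaultdict(RunningStats)
    for cat, value in zip(base_cells, cover_cells):
        if cat is None or value is None:  # skip NULL cells (assumption)
            continue
        stats[cat].add(value)
    return dict(stats)
```

Memory use is then proportional to the number of base categories
rather than to the number of distinct cover values, which is the
point of the accumulator idea.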

> [The last two would need to either use the one-pass algorithm (which
> can result in negative variance for near-constant data due to rounding
> error), or use two passes (computing the mean in the first pass so
> that the actual deviations can be used in the second pass). See also:
> the history of r.univar.]
> 
> As I've mentioned several times before, computing quantiles (e.g. 
> median) of large amounts of floating-point data is an open-ended
> problem; any given approach has both pros and cons.
> 
> -- 
> Glynn Clements <glynn at gclements.plus.com>
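
For reference, the two variants mentioned in the bracketed remark
above could be sketched in Python roughly as follows. The function
names are hypothetical and this is not the r.univar code; both compute
the population variance (dividing by n).

```python
def one_pass_variance(values):
    """Sum and sum of squares in a single pass.

    The final subtraction of two nearly equal large numbers can lose
    most of its precision for near-constant data, so the result may
    come out slightly negative.
    """
    n = 0
    s = 0.0
    sq = 0.0
    for x in values:
        n += 1
        s += x
        sq += x * x
    mean = s / n
    return sq / n - mean * mean


def two_pass_variance(values):
    """Compute the mean first, then sum the squared deviations.

    The result is a sum of squares divided by n, so it can never be
    negative, at the cost of reading the data twice.
    """
    values = list(values)
    n = len(values)
    mean = sum(values) / n
    return sum((x - mean) ** 2 for x in values) / n
```

For a raster module the second variant would mean reading the cover
map twice, which is the trade-off against the rounding problem.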

Markus