[GRASS-dev] r.statistics limitation to CELL

Glynn Clements glynn at gclements.plus.com
Sat Jun 2 08:00:12 EDT 2007


Markus Neteler wrote:

> would it be much work to fix this:
> 
> GRASS 6.3.cvs (nc_spm_05):~ > r.statistics base=landuse96_28m \
>                cover=elevation out=elevstats_avg method=average
> ERROR: This module currently only works for integer (CELL) maps
> 
> Rounding elevation to CELL first isn't a great option.

1. r.statistics works by reclassing the base map, so the base map
can't be FP.

2. r.statistics uses r.stats to calculate the statistics, and r.stats
reads its inputs as CELL.

r.stats is inherently based upon discrete categories. Even if it read
FP maps as FP, you would need to quantise the values; otherwise you
would end up with every cell as its own category with count == 1. This
would require memory proportional to the size of the input map
multiplied by a significant factor (48-64 bytes per cell, or even
more).
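
To illustrate the category problem, here is a minimal sketch in Python
(not r.stats code; the function name, the sample values and the bin
width are all made up for illustration). It counts cells per category,
first on the raw FP values and then after quantising into fixed-width
bins.

```python
from collections import Counter

def category_counts(values, step=None):
    """Count cells per category (hypothetical helper, not part of GRASS).

    With step=None each distinct value is its own category.
    Otherwise values are quantised into bins of width `step`.
    """
    if step is None:
        keys = values
    else:
        keys = [int(v // step) for v in values]
    return Counter(keys)

# Made-up FP elevations; real data would have millions of cells.
elev = [101.25, 101.5, 102.75, 103.125, 104.0, 104.5]

raw = category_counts(elev)               # one category per cell
binned = category_counts(elev, step=2.0)  # three bins of two cells each
```

Without quantisation the table of categories grows with the number of
cells; with it, the table is bounded by the number of bins, at the
cost of losing precision inside each bin.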

To handle FP data, you really need a completely new approach which
computes aggregates incrementally, using an accumulator. This would
limit it to aggregates which can be computed that way, e.g. count,
sum, mean, variance and standard deviation.
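
The accumulator idea could be sketched roughly as follows, in Python
rather than C and with hypothetical names. The assumption here is that
the base map stays integer and only the cover map is FP, and that null
cells are represented as None.

```python
from collections import defaultdict

def zonal_mean(base, cover):
    """Mean of cover values per base category, computed in one pass.

    Memory use depends on the number of base categories, not on the
    number of cells. Hypothetical sketch, not the r.statistics code.
    """
    count = defaultdict(int)
    total = defaultdict(float)
    for cat, value in zip(base, cover):
        if value is None:  # assumed null marker; skip null cells
            continue
        count[cat] += 1
        total[cat] += value
    return {cat: total[cat] / count[cat] for cat in count}

# Made-up data: integer base categories, FP cover values.
base = [1, 1, 2, 2, 2]
cover = [10.5, 11.5, 20.0, None, 22.0]
means = zonal_mean(base, cover)
```

Only a count and a running sum are kept per category, which is why
this style of computation does not extend directly to aggregates such
as the median.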

[The last two would need to either use the one-pass algorithm (which
can result in negative variance for near-constant data due to rounding
error), or use two passes (computing the mean in the first pass so
that the actual deviations can be used in the second pass). See also:
the history of r.univar.]
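
For comparison, the two options might look like this in Python. This
is a sketch with hypothetical function names, computing the population
variance; it is not taken from r.univar.

```python
def variance_one_pass(values):
    """Population variance from running sums: E[x^2] - (E[x])^2.

    Needs only one pass, but subtracts two large, nearly equal
    numbers when the data are near-constant, so rounding error can
    dominate the result.
    """
    n = 0
    total = 0.0
    total_sq = 0.0
    for v in values:
        n += 1
        total += v
        total_sq += v * v
    mean = total / n
    return total_sq / n - mean * mean

def variance_two_pass(values):
    """Population variance from actual deviations about the mean.

    The first pass computes the mean, the second sums the squared
    deviations. The result is a sum of squares, so it cannot be
    negative.
    """
    n = len(values)
    mean = sum(values) / n
    return sum((v - mean) ** 2 for v in values) / n
```

On well-scaled data the two agree. On values with a large offset and
little spread, the one-pass form can lose most of its significant
digits, while the two-pass form needs the data to be read twice.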

As I've mentioned several times before, computing quantiles (e.g. 
median) of large amounts of floating-point data is an open-ended
problem; any given approach has both pros and cons.

-- 
Glynn Clements <glynn at gclements.plus.com>



