[GRASS5] [bug #2380] (grass) unexpected '(' in r.univar

Hamish hamish_nospam at yahoo.com
Wed Apr 14 23:19:26 EDT 2004


> > Yesterday I put together a small module written in C for calculating
> > the stats on the non-null cells of a raster map, filling pretty much
> > the same role as r.univar (which I hadn't met before). I had been
> > doing something like 'r.to.sites | s.univar', which doesn't work for
> > 5.7 of course (which is why I wrote it).
> > 
> > Having r.univar makes this redundant of course, but I'll test them
> > to see how much of a speed difference there is (I take it the
> > r.univar script will spend a bit of time doing disk I/O). [result: C
> > is at least 40% faster than the shell script, and doesn't write tens
> > of MB of files to /tmp]
> 
> I've been hoping for years to see r.univar implemented in C...

Ok, it's cleaned up now & seems to be running fine.


> > Two comments arise:  (without good answers)
> > 
> > 
> > a) Population vs. sample variance (& standard deviation)
> > 
> > r.series and r.univar use sum((xi-mean(x))^2)/n
> >    (i.e. population variance aka "sigma^2")
> > 
> > while 
> > 
> > s.univar and s.cellstats use sum((xi-mean(x))^2)/(n-1)
> >    (i.e. sample or bias-corrected variance aka "s^2")
> > 
> > 
> > For consistency we should pick one way & document it.
> 
> Yes: and move into gmath. Please use doxygen-style comments.

As stated in (b) below, this doesn't really work for r.univar(2), but
I'd agree that r.series's c_*.c should be moved to gmath for other
uses (sorry, not by me, not today).
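
To make the difference in (a) concrete, here is a minimal sketch in C.
The function name and the ddof parameter are made up for illustration
and are not part of GRASS or gmath; it only shows that both definitions
can come from the same running sums, with the divisor as the single
point of difference.

```c
#include <assert.h>

/* Hypothetical helper, not a GRASS/gmath function.
 * Variance from running sums of x and x^2 over n values.
 *   ddof = 0: population variance, divides by n      ("sigma^2")
 *   ddof = 1: sample variance,     divides by n - 1  ("s^2")
 * Uses the identity sum((xi - mean)^2) = sum(xi^2) - n * mean^2. */
static double variance_from_sums(double sum, double sumsq, long n, int ddof)
{
    double mean;

    assert(n > ddof);   /* otherwise the divisor would be <= 0 */

    mean = sum / n;
    return (sumsq - n * mean * mean) / (n - ddof);
}
```

For the eight values 2, 4, 4, 4, 5, 5, 7, 9 (sum 40, sum of squares 232)
this gives 4 with ddof = 0 and 32/7 (about 4.571) with ddof = 1. The gap
shrinks as n grows, so for a large raster the choice matters far less
than documenting which one is used.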


> > b) gmath library: I looked at using the c_var.c & co. functions from
> > r.series, but these require passing all input values (i.e. the whole
> > map in memory) at once. That is fine for a general library function,
> > or for the n<1000 cells-of-the-same-coordinate that r.series or
> > r.mapcalc might use, but it doesn't cut it for a 10000x10000 DCELL
> > map. I guess I could use c_sum.c to do one line at a time, but it
> > doesn't seem worth it, and it doesn't get rid of any implementation
> > inconsistencies (e.g. the n vs. n-1 problem above), and getting rid
> > of those is the great benefit of using a gmath library. So I just
> > reimplemented in an inconsistent manner as described above.
...
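
For scale, a 10000x10000 DCELL map is 10^8 doubles, i.e. about 800 MB if
held in memory at once. The row-at-a-time alternative mentioned in (b)
can be sketched roughly as below. This is an illustration only, not the
code of the new module: the struct and function names are hypothetical,
and NaN stands in for a null cell so that the example does not depend on
the GRASS library.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical accumulator; only a few scalars are kept in memory,
 * however large the map is. */
struct univar_acc {
    long n;         /* number of non-null cells seen so far */
    double sum;     /* running sum of cell values */
    double sumsq;   /* running sum of squared cell values */
    double min;     /* smallest value seen (valid once n > 0) */
    double max;     /* largest value seen (valid once n > 0) */
};

static void acc_init(struct univar_acc *a)
{
    a->n = 0;
    a->sum = 0.0;
    a->sumsq = 0.0;
    a->min = 0.0;
    a->max = 0.0;
}

/* Fold one raster row into the accumulator, skipping null cells.
 * Assumption for this sketch: a null cell is represented by NaN. */
static void acc_add_row(struct univar_acc *a, const double *row, size_t ncols)
{
    size_t col;

    for (col = 0; col < ncols; col++) {
        double v = row[col];

        if (v != v)     /* true only for NaN, i.e. a "null" cell here */
            continue;

        if (a->n == 0 || v < a->min)
            a->min = v;
        if (a->n == 0 || v > a->max)
            a->max = v;

        a->sum += v;
        a->sumsq += v * v;
        a->n++;
    }
}

static double acc_mean(const struct univar_acc *a)
{
    assert(a->n > 0);   /* mean is undefined for an all-null map */
    return a->sum / a->n;
}
```

The caller would read one row into a buffer, pass it to acc_add_row(),
and reuse the buffer for the next row. One caveat: deriving the variance
from sum and sumsq can lose precision when the mean is large relative to
the spread, so a more careful version might use a different update
scheme. Quartiles and the median cannot be accumulated this way, which
is why the extended stats need a sort or something similar.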

> > If people are interested in a replacement to r.univar, I can clean
> > it up and add the missing extended functionality (quartiles, etc.)
> > which r.univar provides. I'm not looking forward to the sort.. so
> > maybe I'll leave that to a real programmer to do.
> > 
> > And the ageless question of what to call it?
> >   ideas: r.mapstats, r.univar2
> 
> IMHO we should aim at *replacing* code, not adding similar
> code with different names. The same applies to various
> other modules such as r.grow[2] etc. Confusion grows...

Ok, how about following r.mapcalc's lead: name its directory in the
source tree r.univar2, but have it build an r.univar executable?
Once people are happy that the new version is better than the old, it
can be added to the build list and the script removed at the same time.
This minimizes confusion for both future developers and users.

Everything in the r.univar script is implemented except for the extended
stats (but some framework is in place for someone to continue that) and
the base= map option (just use a MASK). Only minor modification of
dependent scripts is needed (but definitely some).

If that's alright, I'll add it to CVS but not 5.3's build list (for
now). I'd add it to 5.7 though, replacing r.univar[.sh] immediately.
Yes/no?



> See also:
> http://intevation.de/rt/webrt?serial_num=1848&display=History
> "merge of r.average, r.median and r.mode"

I can't do this now. Someone else?



Hamish



