univariate stat module
Darrell McCauley
mccauley at ecn.purdue.edu
Tue Apr 13 13:41:41 EDT 1993
Dan Riker (riker at hydro1.geo.duke.edu) recently expressed
the need for a general univariate statistics module
for GRASS (Re: Hydrologic Modeling and GRASS).
What should the design criteria be?
o How should such a module work? Should it just accept
column(s) of data (e.g. output of r.stats, output of
s.out.ascii) or should it access data directly?
o What statistics should be calculated? (mean, std dev,
variance, skewness, everything that the SAS "PROC UNIVARIATE"
does?)
o What output format would work best?
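As a rough illustration of the first option (a filter that just accepts a column of data), here is a minimal sketch. The function name univar_column is hypothetical and not part of GRASS. It assumes one numeric value per line on standard input, uses the population (divide-by-N) forms of variance and skewness, and prints plain "name value" lines, which is only one possible answer to the output format question.

```shell
#!/bin/sh
# Hypothetical helper, not part of GRASS: summarize one column of
# numbers read from standard input, one value per line.
univar_column() {
    awk '
    NF == 0 { next }                       # skip blank lines
    {
        x = $1 + 0                         # force numeric interpretation
        if (n == 0 || x < min) min = x
        if (n == 0 || x > max) max = x
        n++
        s1 += x; s2 += x * x; s3 += x * x * x
    }
    END {
        if (n == 0) { print "n 0"; exit 1 }
        mean = s1 / n
        var = s2 / n - mean * mean         # population variance
        if (var < 0) var = 0               # guard against rounding error
        sd = sqrt(var)
        # third central moment: E[x^3] - 3*mean*E[x^2] + 2*mean^3
        m3 = s3 / n - 3 * mean * s2 / n + 2 * mean * mean * mean
        print "n", n
        print "min", min
        print "max", max
        print "mean", mean
        print "variance", var
        print "deviation", sd
        if (sd > 0) print "skewness", m3 / (sd * sd * sd)
    }'
}
```

Any command that emits one number per line could be piped into it, for example: printf '1\n2\n3\n4\n' | univar_column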
I think that if grassu people define what is needed, it would
take only a small effort for a grassp person to do this.
After all, there are several sources of free code to do these
calculations, some already packaged up as UNIX commands.
Working from one of these, all that is needed is a GRASS
wrapper (parser).
BTW, I vaguely remember a shell script to do univariate
statistics. I'm not sure who wrote it, but it is appended
for your enjoyment.
--Darrell
(I had this saved as r.univar)
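#!/bin/sh
# r.univar: univariate statistics for a cell file, from "r.stats -c" counts

# Collect the optional -z and -v flags, which are passed through to r.stats.
while test $# != 0
do
    case "$1" in
    -z) z=z; shift;;
    -v) v=v; shift;;
    -zv|-vz) z=z; v=v; shift;;
    -|-*) oops=yes; break;;
    *) break;;
    esac
done

# Exactly one cell file must remain after the flags.
if test $# -ne 1 || test "$oops" = yes
then
    echo "Usage: `basename $0` [-vz] cellfile" >&2
    exit 1
fi

# Each r.stats -c line is "value count"; every sum is weighted by the count.
r.stats -c$z$v "$1" | awk '
BEGIN { sum = 0.0; sum2 = 0.0; N = 0; modecount = 0 }
NR == 1 { min = $1; max = $1 }
{ sum += $1 * $2; sum2 += $1 * $1 * $2; N += $2 }
{ if ($1 > max) max = $1; if ($1 < min) min = $1 }
{ if ($2 > modecount) { mode = $1; modecount = $2 } }
END {
    if (N == 0) exit 1    # no cells counted; avoid dividing by zero
    # Population (divide-by-N) variance, not the N-1 sample estimate.
    var = (sum2 - sum * sum / N) / N
    print "min      ", min
    print "max      ", max
    print "mean     ", sum / N
    print "mode     ", mode
    print "variance ", var
    print "deviation", sqrt(var)
}'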