[GRASSLIST:9001] Re: r.statistics vs v.rast.stats vs ??

David Finlayson david.p.finlayson at gmail.com
Thu Nov 10 22:15:00 EST 2005


In my case (very large DEM small vector areas) the creation of the
MASK during each pass through the loop was the rate-limiting step. I
added the following lines to reduce the size of the region prior to
creating the MASK. It helped dramatically:

# Reduce size of region prior to calculating mask
g.remove vect=${VECTOR}_PTEMP
v.extract input=${VECTOR} output=${VECTOR}_PTEMP type=area new=1 list=$i
g.region vect=${VECTOR}_PTEMP

I need to do more testing to see if it is getting the same answer as before.

Calculating the statistics multiple times is redundant, but for my
dataset it was not slow (less than 1 second for all of them). I wonder
if my areas were larger would it take longer?

Now the only slow part is updating sqlite. Seems to take a second or
so for each update call.

David

On 11/10/05, Hamish <hamish_nospam at yahoo.com> wrote:
> > I am trying to calculate statistics such as mean elevation, slope,
> > maximum drainage area etc for a very large number of small plots. The
> > plots are in a vector file and I have been trying to use v.rast.stats
> > on the various rasters that represent elevation, slope, drainage area,
> > etc.
> >
> > The problem is that the rasters are very high resolution, the plots
> > are quite small, and I have more than 600 to process. If you look at
> > the code for v.rast.stats, it converts each plot into a raster to use
> > as a mask and then runs r.univar on the entire map (now mostly null
> > values), once for each statistic. In my situation it can take several
> > minutes for each plot. In other words hours for each of the above
> > rasters that I am interested in and more than 2 days for the whole
> > set.
> >
> > Is there a more efficient way of calculating these statistics? I have
> > tried to use r.statistics and can't make sense of the inputs and
> > outputs. Alternatively, if I could control the region size in the
> > critical part of the v.rast.stats script I might be able to improve
> > the speed of the script.
>
>
> for i in $CATSLIST ; do
>  echo "Processing category $i ($CURRNUM/$NUMBER)"
>  g.remove MASK > /dev/null 2> /dev/null
>  r.mapcalc "MASK=if(${VECTOR}_${TMPNAME} == $i, 1, null())" 2> /dev/null
>  #n, min, max, range, mean, stddev, variance, coeff_var, sum
>  n=`r.univar -g $RASTER | cut -d'=' -f2 | tr '\n' ' ' | cut -d' ' -f1`
>  min=`r.univar -g $RASTER | cut -d'=' -f2 | tr '\n' ' ' | cut -d' ' -f2`
>  max=`r.univar -g $RASTER | cut -d'=' -f2 | tr '\n' ' ' | cut -d' ' -f3`
>  range=`r.univar -g $RASTER | cut -d'=' -f2 | tr '\n' ' ' | cut -d' ' -f4`
>  mean=`r.univar -g $RASTER | cut -d'=' -f2 | tr '\n' ' ' | cut -d' ' -f5`
>  stddev=`r.univar -g $RASTER | cut -d'=' -f2 | tr '\n' ' ' | cut -d' ' -f6`
>  variance=`r.univar -g $RASTER | cut -d'=' -f2 | tr '\n' ' ' | cut -d' ' -f7`
>  cf_var=`r.univar -g $RASTER | cut -d'=' -f2 | tr '\n' ' ' | cut -d' ' -f8`
>  sum=`r.univar -g $RASTER | cut -d'=' -f2 | tr '\n' ' ' | cut -d' ' -f9`
>
>
> a couple of things,
>
> 1) r.univar is running 9 times when it only needs to be run once:
> eval `r.univar -g $RASTER`
>
> Then $cf_var gets renamed $coeff_var, needs to be carried through.
>
>
> 2) if g.region zoom= respects MASK, we can do:
> g.region save=${TMPNAME}_rastres
>
> for i in $CATSLIST ; do
>  echo "Processing category $i ($CURRNUM/$NUMBER)"
>  g.remove MASK > /dev/null 2> /dev/null
>  r.mapcalc "MASK=if(${VECTOR}_${TMPNAME} == $i, 1, null())" 2> /dev/null
>
> +g.region zoom=$RASTER
>
>  #n, min, max, range, mean, stddev, variance, coeff_var, sum
>  eval `r.univar -g $RASTER`
>
> +g.region ${TMPNAME}_rastres
>
>  ...
> done
>
> g.remove region=${TMPNAME}_rastres
>
>
>
> but g.region zoom= can take a _long_ time so maybe no advantage/worse?
>
> Would need to check/reset NSRES and EWRES to raster's values or
> is that redundant? (what about g.region -a?) Some checking needs
> to be done that we're not screwing up the cells.. at minimum
> a before/after check to see that the same answer comes out both
> ways.
>
>
> 3) rm twice??
> cleanup()
> {
>  #restore settings:
>  g.region region=${TMPNAME}
>  g.remove region=${TMPNAME} > /dev/null
>
> twice??
>
> 4) isn't  -p > /dev/null  circular?
> g.region nsres=$NSRES ewres=$EWRES -ap > /dev/null
>
>
> 5) cleanup should happen before   echo "Done."
>
>
> Hamish
>


--
David Finlayson
Marine Geology & Geophysics
School of Oceanography
Box 357940
University of Washington
Seattle, WA  98195-7940
USA

Office: Marine Sciences Building, Room 112
Phone: (206) 616-9407
Web: http://students.washington.edu/dfinlays




More information about the grass-user mailing list