[STATSGRASS] raster cross-stats

Markus Neteler neteler at itc.it
Thu Dec 13 09:10:03 EST 2001


Dear Roger,

thanks for the updated GRASS/R interface, I will install later
today. I have played a bit with R BATCH mode to write
pseudo-GRASS script, it's real fun!
Sorry for a FAQ, but (due to my current traveling with temporal
internet access) how can I specify a parameter such as
$1 in shell scripts?
I want to add parameters to the scripts which are visible within
R BATCH. If you could point me to a online document
I would be glad.

Sorry for the basic question,

 Markus

On Thu, Dec 13, 2001 at 01:31:30PM +0100, Roger Bivand wrote:
> Hi Agus!
> 
> On 13 Dec 2001, Agustin Lobo wrote:
> 
> > One suggestion:
> > 
> > r.statistics should have the option of having a table as output
> > (similar to r.surface), where the categories would be the rows
> > and the different statistics (men, ver, median, etc) would be
> > the columns. This would be an alternative, more compact
> > format than the current rasters. Also, this table would be
> > easy to be read from R for ulterior analysis.
> 
> Great to see you "here" too! I'm enclosing below bits of correspondence
> between Markus, Helena and me of about two weeks ago, together with a
> draft shell script to use R functions for GRASS raster layers. My current
> feeling is that with R being easy to install and maintain, shell scripts
> using R (here for the leics dataset):
> 
> > library(GRASS)
> > G <- gmeta()
> > work <- rast.get(G, c("landcov", "topo"), c(T,F))
> > res <- by(work$topo, work$landcov, summary)
> INDICES: Background
> NULL
> ------------------------------------------------------------ 
> INDICES: Industry
>    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
>   40.00   42.00   59.00   83.61  120.00  220.00 
> ------------------------------------------------------------ 
> INDICES: Residential
>    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
>    40.0    53.0    60.0    76.3    80.0   218.0 
> ------------------------------------------------------------ 
> INDICES: Quarry
>    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
>    40.0    59.0   161.0   135.3   177.0   198.0 
> ------------------------------------------------------------ 
> INDICES: Woodland
>    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
>    40.0   118.0   156.0   142.9   174.0   248.0 
> ------------------------------------------------------------ 
> INDICES: Arable
>    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
>      40      63      81     104     153     270 
> ------------------------------------------------------------ 
> INDICES: Pasture
>    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
>    40.0    75.0   145.0   133.5   180.0   278.0 
> ------------------------------------------------------------ 
> INDICES: Scrub
>    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
>    40.0   100.0   161.0   148.2   196.0   270.0 
> ------------------------------------------------------------ 
> INDICES: Water
>    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
>   47.00   65.00   99.00   99.27  131.00  221.00 
> > matrix(unlist(res), ncol=6)
>        [,1]  [,2]  [,3] [,4]  [,5]   [,6]
> [1,]  40.00  60.0 177.0   40 145.0 196.00
> [2,]  42.00  76.3 198.0   63 133.5 270.00
> [3,]  59.00  80.0  40.0   81 180.0  47.00
> [4,]  83.61 218.0 118.0  104 278.0  65.00
> [5,] 120.00  40.0 156.0  153  40.0  99.00
> [6,] 220.00  59.0 142.9  270 100.0  99.27
> [7,]  40.00 161.0 174.0   40 161.0 131.00
> [8,]  53.00 135.3 248.0   75 148.2 221.00
> 
> where summary could be a function that is written on the fly before
> submission to R.
> 
> I'll be very pleased to hear your thinking - your input to other things -
> like the R pixmap package was helpful!
> 
> Roger
> 
> -- 
> Roger Bivand
> Economic Geography Section, Department of Economics, Norwegian School of
> Economics and Business Administration, Breiviksveien 40, N-5045 Bergen,
> Norway. voice: +47 55 95 93 55; fax +47 55 95 93 93
> e-mail: Roger.Bivand at nhh.no
> and: Department of Geography and Regional Development, University of
> Gdansk, al. Mar. J. Pilsudskiego 46, PL-81 378 Gdynia, Poland.
> 
> ---------- Forwarded message ----------
> Date: Thu, 29 Nov 2001 12:20:01 +0100 (CET)
> From: Roger Bivand <Roger.Bivand at nhh.no>
> To: neteler at itc.it
> Cc: hmitaso at unity.ncsu.edu
> Subject: Re: s.sv, s.normal, s.probplt and R
> 
> Markus, Helena:
> 
> 
> 
> 
> Further to my message of Monday (below), here is what I sent Frank
> Warnerdam in September when he had asked "Also, how do I get information
> on the min, max, and mean of a raster?" - this is what I was thinking of
> by suggesting that even using R through shell scripts writing R code on
> the fly may be fast enough for some things that don't need R to be seen.
> The s.kcv code is an R function, but it wouldn't be difficult to put it in
> a script in a similar way. The same could be done for plenty of others.
> 
> > G <- gmeta()
> > Zinc <- sites.get(G, "ex.Zn.in")
> >  names(Zinc)
> [1] "east"  "north" "var1"  "var2" 
> > system("s.univar ex.Zn.in")
> Reading sites list ...                   100%
> Calculating statistics (decfield: 1) ... 100%
>         number of points 98
>                     mean 481.031
>       standard deviation 398.808
> coefficient of variation 82.9069
>                 skewness 1.44865
>                 kurtosis 1.67661
>          mean of squares 388815
>  mean of absolute values 481.031
>                  minimum 113
>           first quartile 186
>                   median 307.5
>           third quartile 685
>                  maximum 1839
> > skewness(Zinc$var2)
> [1] 1.426529
> > kurtosis(Zinc$var2)
> [1] 1.581654
> > summary(Zinc$var2)
>    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
>   113.0   186.3   307.5   481.0   698.5  1839.0 
> > mean(Zinc$var2)
> [1] 481.0306
> > sd(Zinc$var2)
> [1] 398.8077
> 
> While mean and standard deviation agree between R and s.univar, as do min,
> max, median, skewness and kurtosis don't (quartiles depend on assumptions
> for ties, so aren't so difficult)
> 
> > shapiro.test(Zinc$var2)
> 
>         Shapiro-Wilk normality test
> 
> data:  Zinc$var2 
> W = 0.8186, p-value = 1.259e-09 
> 
> is a bit easier to read than s.normal! qqnorm(Zinc$var2) or
> ecdf(Zinc$var2) do nice plots too!
> 
> As far as s.sv goes, several of the geostats packages in R do the same
> too.
> 
> > system("s.sv sites=ex.Zn.in lag=50")
> Reading sites list ...                   100%
> 98 sites found
> Computing sample semivariogram ...   100%
> 50 67827.5 22
> 100 67968.9 94
> 150 69124.6 200
> 200 97473.3 196
> 250 78232.2 220
> 300 124723 256
> 350 102276 248
> 400 166055 242
> 450 140789 284
> 500 137712 290
> 
> > library(fields)
> > try2 <- vgram(cbind(Zinc$east, Zinc$north), Zinc$var2, dmax=1000,
> breaks=seq(50,1000,50))
> > try2$breaks[1:10]
>  [1]  50 100 150 200 250 300 350 400 450 500
> > try2$stats[1:3,1:10]
>                 1         2        3         4        5         6        7
> N           31.00     78.00     98.0    104.00    117.0    119.00    130.0
> mean     52392.50  61888.99 101409.0  71277.74 127455.7  85657.06 149756.9
> Std.Dev. 91029.62 129359.38 168443.0 153194.52 204767.0 165459.43 280845.6
>                 8        9       10
> N           137.0    132.0    149.0
> mean     122474.6 165412.9 163237.1
> Std.Dev. 225055.5 301939.9 279704.7
> 
> > plot(try2$d, try2$vgram)
> > lines(try2$centers, try2$stats["mean",], col=4)
> 
> just from Doug Nychka's fields package, which includes robust variograms
> too, and variograms from rasters.
> 
> Maybe this is not really necessary at all, but working up these kinds of
> analytical things is just so much easier in R - and more people use them
> so that the chance of bugs not being found is much less - that at least
> for me it is what I choose.
> 
> Yours,
> 
> Roger
> 
> -- 
> Roger Bivand
> Economic Geography Section, Department of Economics, Norwegian School of
> Economics and Business Administration, Breiviksveien 40, N-5045 Bergen,
> Norway. voice: +47 55 95 93 55; fax +47 55 95 93 93
> e-mail: Roger.Bivand at nhh.no
> and: Department of Geography and Regional Development, University of
> Gdansk, al. Mar. J. Pilsudskiego 46, PL-81 378 Gdynia, Poland.
> 
> Message of Monday 26/11:
> 
> I'm in Gdansk now, so can't try things until later in the week. My feeling 
> is that the key question here is how many GRASS users who use these kinds 
> of functionalities would benefit in the longer run from moving the
> analysis 
> bits to the R engine, over the interface at the moment. I've already 
> written a couple of scripts to do some of this from the GRASS prompt. My 
> preference would be to use R functionality where appropriate, because it
> is 
> used by more people and more likely for that reason to be cleaner. I 
> remember an exchange some months ago when Frank Warnerdam wanted a better 
> r.average, and that wasn't a problem to do as a script writing a short R 
> script to collect the raster or whatever, compute the result, and if 
> necessary push it back across the interface. But then R has to be 
> installed, as well as the interface library.
> 
> An alternative is to use the interface to help prototype things, then code 
> them in C on the R side. Certainly what I saw of one s.* program (to
> choose 
> a training and a test set from sites) made me distinctly unhappy - it was 
> doing things that not only seemed unnecessary, and doing them rather 
> clumsily.
> 
> 
> in reply to:
> 
> as you know we are working on the GRASS book. Helena is
> revising the sites chapter and asked about a few
> modules. As you may have tested them already, I would
> like to know if you have hints for us:
> 
> On Sun, Nov 25, 2001 at 08:41:55AM -0600, Helena wrote:
> > Markus,
> > 
> > I am working on chapter7 (sites) and I would like to ask whether you
> have
> > any experience with s.sv, s.normal, s.probplt, s.medp. At least s.sv and
> > s.normal appear to be working but one needs g.gnuplot to get the
> graphics.
> > Also I do not know whether these tools are all designed properly. Did
> you
> > ever try to compare the results with those from R-stats?
> > Has Roger Bivand tried it - does he think that they are useful?
> 
> Today I have exported the zinc data from Maas and put into
> s.sv. Well, it's looking a bit different.
> But I am not sure if both implementations are comparable. Did
> you ever try above mentioned GRASS modules?
> 
> BTW: All only require gnuplot, not g.gnuplot (we had implemented the
> environment variable GRASS_GNUPLOT which is per default "gnuplot").
> If this doesn't work, you GRASS is too old... The change was done several
> weeks ago.
> 
> The question for us is if to mention the modules or not (and refer to
> R, I just added QQ plots to the related section).
> ---------- Forwarded message ----------
> Date: Thu, 6 Sep 2001 10:50:17 +0200 (CEST)
> From: Roger Bivand <Roger.Bivand at nhh.no>
> To: Frank Warmerdam <warmerdam at pobox.com>
> Subject: Re: [GRASS5] r.in.gdal and GCPs
> 
> Frank:
> 
> Here is a rough draft of a shell script to do what you want (I think):
> (there is a time downside to loading R first, but the time taken to read
> the raster layer should be comparable, since both the interface and say
> r.stats are using the same functions in libgis.a).
> 
> -------------------
> #!/bin/sh
> # r.statistics.R: shell script to call simple summary functions from R
> # (needs installed R in path, and installed R GRASS package; for
> # skewness and kurtosis also e1071 package)
> #
> # usage: r.statistics.R layer method
> # where layer is a raster map layer and method is one of:
> # summary fivenum mean max min median length sum var sd kurtosis skewness
> #
> 
> layer=$1
> 
> r.info "$layer" 2>&1 > /dev/null
> if [ "$?" == 1 ] ; then
> echo "$layer: no such raster map layer"
> exit 1
> fi
> 
> method=$2
> found=`grep "$method" << EOF
> summary fivenum mean max min median length sum var sd kurtosis skewness
> EOF`
> 
> if [ "$?" == 1 ] ; then
> echo "$method: no such method"
> exit 1
> fi
> 
> e1071="FALSE"
> 
> if [ "$method" == "kurtosis" ] ; then e1071="TRUE" ; fi
> if [ "$method" == "skewness" ] ; then e1071="TRUE" ; fi
> 
> if [ "$e1071" == "TRUE" ] ; then
> 
> R --vanilla --quiet --slave << EOF
> library(e1071)
> library(GRASS)
> print($method(rast.get(gmeta(), "$layer")[[1]]))
> EOF
> 
> else
> 
> R --vanilla --quiet --slave << EOF
> library(GRASS)
> print($method(rast.get(gmeta(), "$layer")[[1]]))
> EOF
> 
> fi
> 
> ------------
> 
> Roger
> 
> -- 
> Roger Bivand
> Economic Geography Section, Department of Economics, Norwegian School of
> Economics and Business Administration, Breiviksveien 40, N-5045 Bergen,
> Norway. voice: +47 55 95 93 55; fax +47 55 95 93 93
> e-mail: Roger.Bivand at nhh.no
> and: Department of Geography and Regional Development, University of
> Gdansk, al. Mar. J. Pilsudskiego 46, PL-81 378 Gdynia, Poland.
> 
> 
> 
> 
> 
> 
> 
> 
> 
> 
> _______________________________________________
> statsgrass mailing list
> statsgrass at grass.itc.it
> http://grass.itc.it/mailman/listinfo/statsgrass

-- 
Markus Neteler

ITC-irst, Istituto per la Ricerca Scientifica e Tecnologica
     Project on Predictive Models for the Environment    
Via Sommarive, 18        -      38050 Povo (Trento), Italia
tel +39 0461 314 -520 (fax -591)          http://mpa.itc.it



More information about the grass-stats mailing list