[STATSGRASS] raster cross-stats
Markus Neteler
neteler at itc.it
Thu Dec 13 09:10:03 EST 2001
Dear Roger,
thanks for the updated GRASS/R interface, I will install later
today. I have played a bit with R BATCH mode to write
pseudo-GRASS script, it's real fun!
Sorry for a FAQ, but (due to my current traveling with temporal
internet access) how can I specify a parameter such as
$1 in shell scripts?
I want to add parameters to the scripts which are visible within
R BATCH. If you could point me to a online document
I would be glad.
Sorry for the basic question,
Markus
On Thu, Dec 13, 2001 at 01:31:30PM +0100, Roger Bivand wrote:
> Hi Agus!
>
> On 13 Dec 2001, Agustin Lobo wrote:
>
> > One suggestion:
> >
> > r.statistics should have the option of having a table as output
> > (similar to r.surface), where the categories would be the rows
> > and the different statistics (men, ver, median, etc) would be
> > the columns. This would be an alternative, more compact
> > format than the current rasters. Also, this table would be
> > easy to be read from R for ulterior analysis.
>
> Great to see you "here" too! I'm enclosing below bits of correspondence
> between Markus, Helena and me of about two weeks ago, together with a
> draft shell script to use R functions for GRASS raster layers. My current
> feeling is that with R being easy to install and maintain, shell scripts
> using R (here for the leics dataset):
>
> > library(GRASS)
> > G <- gmeta()
> > work <- rast.get(G, c("landcov", "topo"), c(T,F))
> > res <- by(work$topo, work$landcov, summary)
> INDICES: Background
> NULL
> ------------------------------------------------------------
> INDICES: Industry
> Min. 1st Qu. Median Mean 3rd Qu. Max.
> 40.00 42.00 59.00 83.61 120.00 220.00
> ------------------------------------------------------------
> INDICES: Residential
> Min. 1st Qu. Median Mean 3rd Qu. Max.
> 40.0 53.0 60.0 76.3 80.0 218.0
> ------------------------------------------------------------
> INDICES: Quarry
> Min. 1st Qu. Median Mean 3rd Qu. Max.
> 40.0 59.0 161.0 135.3 177.0 198.0
> ------------------------------------------------------------
> INDICES: Woodland
> Min. 1st Qu. Median Mean 3rd Qu. Max.
> 40.0 118.0 156.0 142.9 174.0 248.0
> ------------------------------------------------------------
> INDICES: Arable
> Min. 1st Qu. Median Mean 3rd Qu. Max.
> 40 63 81 104 153 270
> ------------------------------------------------------------
> INDICES: Pasture
> Min. 1st Qu. Median Mean 3rd Qu. Max.
> 40.0 75.0 145.0 133.5 180.0 278.0
> ------------------------------------------------------------
> INDICES: Scrub
> Min. 1st Qu. Median Mean 3rd Qu. Max.
> 40.0 100.0 161.0 148.2 196.0 270.0
> ------------------------------------------------------------
> INDICES: Water
> Min. 1st Qu. Median Mean 3rd Qu. Max.
> 47.00 65.00 99.00 99.27 131.00 221.00
> > matrix(unlist(res), ncol=6)
> [,1] [,2] [,3] [,4] [,5] [,6]
> [1,] 40.00 60.0 177.0 40 145.0 196.00
> [2,] 42.00 76.3 198.0 63 133.5 270.00
> [3,] 59.00 80.0 40.0 81 180.0 47.00
> [4,] 83.61 218.0 118.0 104 278.0 65.00
> [5,] 120.00 40.0 156.0 153 40.0 99.00
> [6,] 220.00 59.0 142.9 270 100.0 99.27
> [7,] 40.00 161.0 174.0 40 161.0 131.00
> [8,] 53.00 135.3 248.0 75 148.2 221.00
>
> where summary could be a function that is written on the fly before
> submission to R.
>
> I'll be very pleased to hear your thinking - your input to other things -
> like the R pixmap package was helpful!
>
> Roger
>
> --
> Roger Bivand
> Economic Geography Section, Department of Economics, Norwegian School of
> Economics and Business Administration, Breiviksveien 40, N-5045 Bergen,
> Norway. voice: +47 55 95 93 55; fax +47 55 95 93 93
> e-mail: Roger.Bivand at nhh.no
> and: Department of Geography and Regional Development, University of
> Gdansk, al. Mar. J. Pilsudskiego 46, PL-81 378 Gdynia, Poland.
>
> ---------- Forwarded message ----------
> Date: Thu, 29 Nov 2001 12:20:01 +0100 (CET)
> From: Roger Bivand <Roger.Bivand at nhh.no>
> To: neteler at itc.it
> Cc: hmitaso at unity.ncsu.edu
> Subject: Re: s.sv, s.normal, s.probplt and R
>
> Markus, Helena:
>
>
>
>
> Further to my message of Monday (below), here is what I sent Frank
> Warnerdam in September when he had asked "Also, how do I get information
> on the min, max, and mean of a raster?" - this is what I was thinking of
> by suggesting that even using R through shell scripts writing R code on
> the fly may be fast enough for some things that don't need R to be seen.
> The s.kcv code is an R function, but it wouldn't be difficult to put it in
> a script in a similar way. The same could be done for plenty of others.
>
> > G <- gmeta()
> > Zinc <- sites.get(G, "ex.Zn.in")
> > names(Zinc)
> [1] "east" "north" "var1" "var2"
> > system("s.univar ex.Zn.in")
> Reading sites list ... 100%
> Calculating statistics (decfield: 1) ... 100%
> number of points 98
> mean 481.031
> standard deviation 398.808
> coefficient of variation 82.9069
> skewness 1.44865
> kurtosis 1.67661
> mean of squares 388815
> mean of absolute values 481.031
> minimum 113
> first quartile 186
> median 307.5
> third quartile 685
> maximum 1839
> > skewness(Zinc$var2)
> [1] 1.426529
> > kurtosis(Zinc$var2)
> [1] 1.581654
> > summary(Zinc$var2)
> Min. 1st Qu. Median Mean 3rd Qu. Max.
> 113.0 186.3 307.5 481.0 698.5 1839.0
> > mean(Zinc$var2)
> [1] 481.0306
> > sd(Zinc$var2)
> [1] 398.8077
>
> While mean and standard deviation agree between R and s.univar, as do min,
> max, median, skewness and kurtosis don't (quartiles depend on assumptions
> for ties, so aren't so difficult)
>
> > shapiro.test(Zinc$var2)
>
> Shapiro-Wilk normality test
>
> data: Zinc$var2
> W = 0.8186, p-value = 1.259e-09
>
> is a bit easier to read than s.normal! qqnorm(Zinc$var2) or
> ecdf(Zinc$var2) do nice plots too!
>
> As far as s.sv goes, several of the geostats packages in R do the same
> too.
>
> > system("s.sv sites=ex.Zn.in lag=50")
> Reading sites list ... 100%
> 98 sites found
> Computing sample semivariogram ... 100%
> 50 67827.5 22
> 100 67968.9 94
> 150 69124.6 200
> 200 97473.3 196
> 250 78232.2 220
> 300 124723 256
> 350 102276 248
> 400 166055 242
> 450 140789 284
> 500 137712 290
>
> > library(fields)
> > try2 <- vgram(cbind(Zinc$east, Zinc$north), Zinc$var2, dmax=1000,
> breaks=seq(50,1000,50))
> > try2$breaks[1:10]
> [1] 50 100 150 200 250 300 350 400 450 500
> > try2$stats[1:3,1:10]
> 1 2 3 4 5 6 7
> N 31.00 78.00 98.0 104.00 117.0 119.00 130.0
> mean 52392.50 61888.99 101409.0 71277.74 127455.7 85657.06 149756.9
> Std.Dev. 91029.62 129359.38 168443.0 153194.52 204767.0 165459.43 280845.6
> 8 9 10
> N 137.0 132.0 149.0
> mean 122474.6 165412.9 163237.1
> Std.Dev. 225055.5 301939.9 279704.7
>
> > plot(try2$d, try2$vgram)
> > lines(try2$centers, try2$stats["mean",], col=4)
>
> just from Doug Nychka's fields package, which includes robust variograms
> too, and variograms from rasters.
>
> Maybe this is not really necessary at all, but working up these kinds of
> analytical things is just so much easier in R - and more people use them
> so that the chance of bugs not being found is much less - that at least
> for me it is what I choose.
>
> Yours,
>
> Roger
>
> --
> Roger Bivand
> Economic Geography Section, Department of Economics, Norwegian School of
> Economics and Business Administration, Breiviksveien 40, N-5045 Bergen,
> Norway. voice: +47 55 95 93 55; fax +47 55 95 93 93
> e-mail: Roger.Bivand at nhh.no
> and: Department of Geography and Regional Development, University of
> Gdansk, al. Mar. J. Pilsudskiego 46, PL-81 378 Gdynia, Poland.
>
> Message of Monday 26/11:
>
> I'm in Gdansk now, so can't try things until later in the week. My feeling
> is that the key question here is how many GRASS users who use these kinds
> of functionalities would benefit in the longer run from moving the
> analysis
> bits to the R engine, over the interface at the moment. I've already
> written a couple of scripts to do some of this from the GRASS prompt. My
> preference would be to use R functionality where appropriate, because it
> is
> used by more people and more likely for that reason to be cleaner. I
> remember an exchange some months ago when Frank Warnerdam wanted a better
> r.average, and that wasn't a problem to do as a script writing a short R
> script to collect the raster or whatever, compute the result, and if
> necessary push it back across the interface. But then R has to be
> installed, as well as the interface library.
>
> An alternative is to use the interface to help prototype things, then code
> them in C on the R side. Certainly what I saw of one s.* program (to
> choose
> a training and a test set from sites) made me distinctly unhappy - it was
> doing things that not only seemed unnecessary, and doing them rather
> clumsily.
>
>
> in reply to:
>
> as you know we are working on the GRASS book. Helena is
> revising the sites chapter and asked about a few
> modules. As you may have tested them already, I would
> like to know if you have hints for us:
>
> On Sun, Nov 25, 2001 at 08:41:55AM -0600, Helena wrote:
> > Markus,
> >
> > I am working on chapter7 (sites) and I would like to ask whether you
> have
> > any experience with s.sv, s.normal, s.probplt, s.medp. At least s.sv and
> > s.normal appear to be working but one needs g.gnuplot to get the
> graphics.
> > Also I do not know whether these tools are all designed properly. Did
> you
> > ever try to compare the results with those from R-stats?
> > Has Roger Bivand tried it - does he think that they are useful?
>
> Today I have exported the zinc data from Maas and put into
> s.sv. Well, it's looking a bit different.
> But I am not sure if both implementations are comparable. Did
> you ever try above mentioned GRASS modules?
>
> BTW: All only require gnuplot, not g.gnuplot (we had implemented the
> environment variable GRASS_GNUPLOT which is per default "gnuplot").
> If this doesn't work, you GRASS is too old... The change was done several
> weeks ago.
>
> The question for us is if to mention the modules or not (and refer to
> R, I just added QQ plots to the related section).
> ---------- Forwarded message ----------
> Date: Thu, 6 Sep 2001 10:50:17 +0200 (CEST)
> From: Roger Bivand <Roger.Bivand at nhh.no>
> To: Frank Warmerdam <warmerdam at pobox.com>
> Subject: Re: [GRASS5] r.in.gdal and GCPs
>
> Frank:
>
> Here is a rough draft of a shell script to do what you want (I think):
> (there is a time downside to loading R first, but the time taken to read
> the raster layer should be comparable, since both the interface and say
> r.stats are using the same functions in libgis.a).
>
> -------------------
> #!/bin/sh
> # r.statistics.R: shell script to call simple summary functions from R
> # (needs installed R in path, and installed R GRASS package; for
> # skewness and kurtosis also e1071 package)
> #
> # usage: r.statistics.R layer method
> # where layer is a raster map layer and method is one of:
> # summary fivenum mean max min median length sum var sd kurtosis skewness
> #
>
> layer=$1
>
> r.info "$layer" 2>&1 > /dev/null
> if [ "$?" == 1 ] ; then
> echo "$layer: no such raster map layer"
> exit 1
> fi
>
> method=$2
> found=`grep "$method" << EOF
> summary fivenum mean max min median length sum var sd kurtosis skewness
> EOF`
>
> if [ "$?" == 1 ] ; then
> echo "$method: no such method"
> exit 1
> fi
>
> e1071="FALSE"
>
> if [ "$method" == "kurtosis" ] ; then e1071="TRUE" ; fi
> if [ "$method" == "skewness" ] ; then e1071="TRUE" ; fi
>
> if [ "$e1071" == "TRUE" ] ; then
>
> R --vanilla --quiet --slave << EOF
> library(e1071)
> library(GRASS)
> print($method(rast.get(gmeta(), "$layer")[[1]]))
> EOF
>
> else
>
> R --vanilla --quiet --slave << EOF
> library(GRASS)
> print($method(rast.get(gmeta(), "$layer")[[1]]))
> EOF
>
> fi
>
> ------------
>
> Roger
>
> --
> Roger Bivand
> Economic Geography Section, Department of Economics, Norwegian School of
> Economics and Business Administration, Breiviksveien 40, N-5045 Bergen,
> Norway. voice: +47 55 95 93 55; fax +47 55 95 93 93
> e-mail: Roger.Bivand at nhh.no
> and: Department of Geography and Regional Development, University of
> Gdansk, al. Mar. J. Pilsudskiego 46, PL-81 378 Gdynia, Poland.
>
>
>
>
>
>
>
>
>
>
> _______________________________________________
> statsgrass mailing list
> statsgrass at grass.itc.it
> http://grass.itc.it/mailman/listinfo/statsgrass
--
Markus Neteler
ITC-irst, Istituto per la Ricerca Scientifica e Tecnologica
Project on Predictive Models for the Environment
Via Sommarive, 18 - 38050 Povo (Trento), Italia
tel +39 0461 314 -520 (fax -591) http://mpa.itc.it
More information about the grass-stats
mailing list