[GRASS-stats] universal kriging with spgrass6

Roger Bivand Roger.Bivand at nhh.no
Mon Jun 29 09:58:32 EDT 2009


On Mon, 29 Jun 2009, mathieu grelier wrote:

> ok
> First I think everyone here has its own domain of competence, so please
> don't answer as if I was at school, I'm not sure all these classes are so
> obvious for all GRASS users. Please clearly display the R requirements to
> post in the list if they exist
> I just try to implement UK in a GRASS process for a larger mapping system
> architecture. I think in some way it could be feasible : it should not be
> for me to understand how.

It is always the responsibility of the implementer to back-track to where 
his or her understanding is wrong. None of the open source software that 
you are using comes with any warranty whatsoever.

You can do UK, but not trend surface, in a predictable way, because both 
the data= and newdata= objects have to contain the same variables.

Doing UK with trend surfaces is different, because the "proper" gstat 
argument is degree=, not including the coordinates as variables. The 
reason for this is numeric, products and powers of large numbers, like 
meters from the Equator, overflow double precision very quickly. When 
degree= is given, the code internally rescales the coordinates before 
taking powers and products to avoid these numerical problems - in gstat 
source src/data.c, the calc_polynomial() function rescales - for example:

x' = (x-min(x))/(max(x)-min(x))

If what you want to do is UK with a trend surface, you will have to work 
though this from the bottom up. In fact, I'm surprised that autoKrige() 
uses krige() in gstat, rather than first making a gstat() object then 
predicting from it, making it harder to see what is going on. I suspect 
that some of the difficulties here are actually caused by:

library(gstat)
data(meuse)
coordinates(meuse) <- c("x", "y")
m1 <- gstat(id="m1", formula=zinc ~ 1, data=meuse, degree=1)
v1 <- variogram(m1)
#Error in variogram.gstat(m1) :
#degree != 0: residual variograms wrt coord trend using degree not supported

which prevents autofitVariogram() from doing what it needs to in the 
rescaled trend surface case. This is I think supported in free-standing 
gstat, but not in the R package, I believe.

data(meuse.grid)
coordinates(meuse.grid) <- c("x", "y")
m1p <- predict(m1, meuse.grid)

does work happily, but getting to UK with a trend surface is messy, as 
you've found. It may be that I'm wrong, and that the gstat() formula 
interface does recognise "x" and "y" as coordinates - it seems to, and 
that this carries forward to prediction, but I can't see where this is in 
the gstat package R or C code. The comment in src/s.c that degree > 0 will 
only work "for a single variable" and "when no other predictors are given" 
suggests that doing UK with a trend surface using a variogram to fit 
a variogram model isn't so easy after all.

>
> Also always consider improving your doc and/or your design : it shouldn't be
> normal for a non-specialist to spend so much time to understand how to
> manipulate the objects to achieve UK with GRASS.
>

Welcome to reality, sorry, but this really is how things are, even in 
proprietary software, you'll need to know, or reverse-engineer, how things 
do work, in order to have adequate confidence in your results.

> Obviously I don't have time to study the R internals and I've just tried to
> quickly reproduce the working R code with meuse dataset, which use a
> dataframe as input data.
> By the way, the rest of the code is yours, you personally gave me at the
> time.

That doesn't mean it works, nobody is infallible. There are consequences 
of design choices often made for different purposes long ago. The degree= 
problem is one, another is column names, neither have obvious solutions. 
Having it work in automap probably means adding checks in autoKrige(), and 
then altering variogram() and other functions in gstat() to do intial 
checks and pass the degree= value through, so changes to two underlying 
packages. If you can volunteer patches to them, this may get resolved 
faster.

Alternatively, you could do the rescaling yourself first for both 
the data= and the newdata= arguments to autoKrige(), having first added X 
and Y coordinate variables to the objects. Check first whether you can get 
the same trend surface predictions from my code above, from rescaled 
variables in the same format, and finally through autoKrige() - though 
here I can't see how to just do a trend surface without the variogram 
fitting (model=NULL?).

Roger

> I understood that this is a colunm name problem and now I'm no longer sure
> what I want to do is possible without modification in gstat.
> But I thought maybe there was a workaround?
>
> Now I see it makes no sense but the initial R example seems to work with a
> data.frame as input data and it induced me in error.
> Best regards.
>
> Mathieu
>
>
> 2009/6/29 Roger Bivand <Roger.Bivand at nhh.no>
>
>> On Mon, 29 Jun 2009, mathieu grelier wrote:
>>
>>  ok,
>>> From your answers, I've tried to play with the SpatialPointsDataFrame
>>> object
>>> imported from GRASS to fit the gstat krige function requirements. Basic
>>> idea
>>> seemed to work with the "data" attribute of the imported object, adding
>>> the
>>> missing coordinates columns :
>>>
>>>  sitesR <- readVECT6("grass_sites")
>>>> class(sitesR)
>>>> [1] "SpatialPointsDataFrame"
>>>> d = attr(sitesR,"data")
>>>> d
>>>>
>>>
>>>          site value cat
>>> 1           Ail      81.80   1
>>> 2     All      38.50   2
>>> 3     Arg     103.50   3
>>> 4           Avb      52.50   4
>>> ...
>>>
>>>  sitesR$x <- coordinates(sitesR)[,1]
>>>> sitesR$y <- coordinates(sitesR)[,2]
>>>> d = attr(sitesR,"data")
>>>>
>>>
>> **NO**, there are no attr() in an S4 class. Just do the obvious thing:
>>
>> d <- as(sitesR, "data.frame")
>>
>> **BUT** you shouldn't be doing this at all, the data= argument in gstat can
>> take a SpatialPointsDataFrame.
>>
>>  class(d)
>>>>
>>> [1] "data.frame"
>>>
>>>> d
>>>>
>>>          site value cat        x       y
>>> 1           Ail      81.80   1 825094.9 6796083
>>> 2     All      38.50   2 758780.4 6851508
>>> 3     Arg     103.50   3 818973.3 6796125
>>> 4           Avb      52.50   4 775136.0 6877271
>>> ...
>>>
>>>  coordinates(d) = ~x+y
>>>>
>>>
>> (Check, it has lost its x and y columns again, hasn't it?)
>> str(d)
>>
>>  attr(d, "proj4string") <-CRS(G$proj4)
>>>>
>>>
>> **NO**!!! These are S4 classes, no attr()!!!
>>
>>
>>>  G <- gmeta6()
>>>> grd <- gmeta2grd()
>>>>
>>>
>> **PLEASE** don't do this! You cannot set one slot without resetting the
>> others if you want to cover the GRASS region. If you want something
>> different, use g.region in GRASS and just gmeta2grd().
>>
>>  slot(grd, "cellsize") <- c(1000, 1000)
>>>> data <- data.frame(list(k=rep(1,(floor(G$cols)*floor(G$rows)))))
>>>> mask_SG <- SpatialGridDataFrame(grd, data=data, CRS(G$proj4))
>>>>
>>>
>>>
>> What is this supposed to be? What we've told you already is that mask_SG
>> needs to be a Spatial*DataFrame with "x" and "y" columns, why are you not
>> doing it?
>>
>>
>>>  predictors = "x+y"
>>>> kriging_result = autoKrige(as.formula(paste(column,"~",predictors)), d,
>>>>
>>> mask_SG)
>>>
>>> Result was :
>>>
>>> Error in gstat.formula.predict(d$formula, newdata, na.action = na.action,
>>>  :
>>>
>>>  NROW(locs) != NROW(X): this should not occur
>>> Warning messages:
>>> 1: 'newdata' had 7900 rows but variable(s) found have 73 rows
>>> 2: 'newdata' had 7900 rows but variable(s) found have 73 rows
>>>
>>>
>> No traceback()? Why? Isn't it obvious that it has found the x and y values
>> in d (73 long)?
>>
>>  1)do you think this approach could be successful?
>>> 2)If so, can you help me with this error?
>>>
>>
>> Try doing everything in very small steps manually. Use gstat not automap,
>> to avoid the extra layer of uncertainty. Simply try to set up a gstat()
>> object for trend surface and predict from it:
>>
>> g_tr1a <- gstat("tr1a", <whatever column was> ~ 1, d, degree=1)
>> pr_tr1a <- predict(g_tr1a, mask_SG)
>>
>> Once that (untried) works, go further, not until. I still suspect that this
>> is to do with the colnames of the coordinates, (x, y) in one case and
>> (coords1, coords2) or some such in the other.
>>
>> Please do study the gstat help pages and the examples, they will help, but
>> you are jumping to far too many conclusions.
>>
>> Roger
>>
>>
>>
>>> Mathieu
>>>
>>> 2009/6/26 Edzer Pebesma <edzer.pebesma at uni-muenster.de>
>>>
>>>
>>>>
>>>> Roger Bivand wrote:
>>>>
>>>>> On Thu, 25 Jun 2009, Edzer Pebesma wrote:
>>>>>
>>>>>  Mathieu, that could very well be the case. Note that x+y do not
>>>>>> "generally" refer to x- and y-coordinates, but are the actual names of
>>>>>> the coordinates (or something else). From your example I cannot find
>>>>>> out
>>>>>> whether the coordinates are named x and y. In addition, they should
>>>>>> have
>>>>>> identical names in sitesR and mask_SG.
>>>>>>
>>>>>
>>>>> Yes, it is beginning to look like the names in mask_SG, isn't it? Do
>>>>> you have a view on whether degree= should be mandatory, or does the
>>>>> infrastructure recognise coordinate RHS variables and rescale them
>>>>> automatically (I doubt the latter)? Why do users use trend variables
>>>>> in this way - the range of values in the (X'X) matrix becomes enormous
>>>>> with the usual metre metric for coordinates?
>>>>>
>>>> Another question we could ask (ourselves) is why in R we allow users to
>>>> give their coordinates the names they like. I remember that for that
>>>> sole reason (need to rename) I added the function
>>>>
>>>> coordnames(x) = c("u", "v")
>>>>
>>>> to sp. In case they're called "lon" and "lat", they also don't change
>>>> name after projection. But maybe I talk too much with people worrying
>>>> about semantics, these days. ;-)
>>>> --
>>>> Edzer
>>>>
>>>>>
>>>>> Roger
>>>>>
>>>>>
>>>>>> Best regards,
>>>>>> --
>>>>>> Edzer
>>>>>>
>>>>>> mathieu grelier wrote:
>>>>>>
>>>>>> Dear all,
>>>>>> I am trying to achieve universal kriging with GRASS and R through
>>>>>> spgrass6
>>>>>> package.
>>>>>> Maybe you could help me to find the right R syntax.
>>>>>> As in the gstat doc, the polynomial I try to use for now is "x+y"
>>>>>>
>>>>>>  Dear all,
>>>>>>> I am trying to achieve universal kriging with GRASS and R through
>>>>>>> spgrass6
>>>>>>> package.
>>>>>>> Maybe you could help me to find the right R syntax.
>>>>>>> As in the gstat doc, the polynomial I try to use for now is "x+y"
>>>>>>>
>>>>>>> In R, with automap package, this works :
>>>>>>>
>>>>>>>  data(meuse)
>>>>>>>> coordinates(meuse) = ~x+y
>>>>>>>> data(meuse.grid)
>>>>>>>> gridded(meuse.grid) = ~x+y
>>>>>>>> column = "zinc"
>>>>>>>>
>>>>>>>>  ##Ordinary kriging :
>>>>>>>
>>>>>>>  predictors = "1"
>>>>>>>> autoKrige(as.formula(paste(column,"~", predictors)), meuse,
>>>>>>>> meuse.grid)
>>>>>>>>
>>>>>>>>  ##Universal kriging :
>>>>>>>
>>>>>>>  predictors = "x+y"
>>>>>>>> autoKrige(as.formula(paste(column,"~", predictors)), meuse,
>>>>>>>> meuse.grid)
>>>>>>>>
>>>>>>>>
>>>>>>> But using spgrass6, it becomes (some steps have been skipped):
>>>>>>>
>>>>>>>  sitesR <- readVECT6("grass_sites")
>>>>>>>>
>>>>>>>>  ...
>>>>>>>
>>>>>>>  mask_SG <- SpatialGridDataFrame(grd, data=data, CRS(G$proj4))
>>>>>>>>
>>>>>>>>  ...
>>>>>>>
>>>>>>> ##Ordinary kriging => WORKS:
>>>>>>>
>>>>>>>  predictors = "1"
>>>>>>>> kriging_result = autoKrige(as.formula(paste(column,"~",predictors)),
>>>>>>>>
>>>>>>>>  sitesR, mask_SG)
>>>>>>>
>>>>>>> ##Universal kriging => ERROR:
>>>>>>>
>>>>>>>  predictors = "x+y"
>>>>>>>> kriging_result = autoKrige(as.formula(paste(column,"~",predictors)),
>>>>>>>>
>>>>>>>>  sitesR, mask_SG)
>>>>>>>
>>>>>>>  Error in eval(expr, envir, enclos) : "x" object not found
>>>>>>>>
>>>>>>>>
>>>>>>> Why can't we use the x an y columns here?
>>>>>>> Maybe it is because the meuse and sites R data.frames don't have the
>>>>>>> same
>>>>>>> structure ?
>>>>>>>
>>>>>>>  class(meuse)
>>>>>>>>
>>>>>>>>  [1] "data.frame"
>>>>>>>
>>>>>>>  attributes(meuse)
>>>>>>>>
>>>>>>>>  $names
>>>>>>>  [1] "x"       "y"       "cadmium" "copper"  "lead"    "zinc"
>>>>>>>  "elev"
>>>>>>>
>>>>>>>  class(sitesR)
>>>>>>>>
>>>>>>>>  [1] "SpatialPointsDataFrame"
>>>>>>>
>>>>>>>  attributes(sitesR)
>>>>>>>>
>>>>>>>>  $bbox
>>>>>>> ...
>>>>>>> $proj4string
>>>>>>> ...
>>>>>>> $coords
>>>>>>> ...
>>>>>>> $data
>>>>>>>           site value cat        x       y
>>>>>>> 1Ai      81.80   1 825094.9 6796083
>>>>>>> 2     Al      38.50   2 758780.4 6851508
>>>>>>> 3     Ar     103.50   3 818973.3 6796125
>>>>>>> 4           Av      52.50   4 775136.0 6877271
>>>>>>> ...
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> ------------------------------------------------------------------------
>>>>
>>>>>
>>>>>>>
>>>>>>> _______________________________________________
>>>>>>> grass-stats mailing list
>>>>>>> grass-stats at lists.osgeo.org
>>>>>>> http://lists.osgeo.org/mailman/listinfo/grass-stats
>>>>>>>
>>>>>>>
>>>>>>
>>>>>>
>>>>>
>>>> --
>>>> Edzer Pebesma
>>>> Institute for Geoinformatics (ifgi), University of Münster
>>>> Weseler Straße 253, 48151 Münster, Germany. Phone: +49 251
>>>> 8333081, Fax: +49 251 8339763 http://ifgi.uni-muenster.de/
>>>> http://www.springer.com/978-0-387-78170-9 e.pebesma at wwu.de
>>>>
>>>>
>>>>
>>>
>> --
>> Roger Bivand
>> Economic Geography Section, Department of Economics, Norwegian School of
>> Economics and Business Administration, Helleveien 30, N-5045 Bergen,
>> Norway. voice: +47 55 95 93 55; fax +47 55 95 95 43
>> e-mail: Roger.Bivand at nhh.no
>>
>

-- 
Roger Bivand
Economic Geography Section, Department of Economics, Norwegian School of
Economics and Business Administration, Helleveien 30, N-5045 Bergen,
Norway. voice: +47 55 95 93 55; fax +47 55 95 95 43
e-mail: Roger.Bivand at nhh.no


More information about the grass-stats mailing list