[GRASS-stats] universal kriging with spgrass6
Roger Bivand
Roger.Bivand at nhh.no
Mon Jun 29 11:33:32 EDT 2009
On Mon, 29 Jun 2009, mathieu grelier wrote:
> Dear all,
> Please read this thread again and understand I'm definitely not aggressive.
> I've just proposed a wrong spgrass syntax, based on a working R code.
> I completely agree with your perspective and respect your great work. Of
> course, no problem with that, I know what open source is, and precisely, I
> am quite tired of this kind of discussions.
>
> I just thought this list was accessible to simple GRASS users. From this
> point of view, even some mistakes may appear too big for specialists, It
> could be a constructive dialog between two communities. I also meant
> that from the engineering point of view, design is adapted to users, not
> the contrary, and you can always learn from users mistakes how to
> improve your design and doc. But this is another debate : again I really
> appreciate your software.
These are areas where there is no firm knowledge or implementations, so
there is no "design" apart from the formulae, from there implementation,
and the difficulties of implementing on hardware.
>
> I'm just requesting help for syntax and never R libs consultancy.
These are the same thing, when, as is the case here, your requests go
beyond what syntax permits. If you read to the end of my reply, you would
have seen a possible remedy by re-scaling the coordinates:
library(automap)
data(meuse)
coordinates(meuse) <- c("x", "y")
data(meuse.grid)
coordinates(meuse.grid) <- c("x", "y")
gridded(meuse.grid) <- TRUE
meuse$east <- coordinates(meuse)[,1]
re_EAST <- (max(meuse$east) - min(meuse$east))
meuse$re_east <- meuse$east - min(meuse$east)/re_EAST
meuse$north <- coordinates(meuse)[,2]
re_NORTH <- (max(meuse$north) - min(meuse$north))
meuse$re_north <- meuse$north - min(meuse$north)/re_NORTH
meuse.grid$east <- coordinates(meuse.grid)[,1]
meuse.grid$north <- coordinates(meuse.grid)[,2]
meuse.grid$re_east <- meuse.grid$east - min(meuse$east)/re_EAST
meuse.grid$re_north <- meuse.grid$north - min(meuse$north)/re_NORTH
o0 <- autoKrige(formula=zinc ~ 1, input_data=meuse, new_data=meuse.grid)
plot(o0)
o1 <- autoKrige(formula=zinc ~ re_east + re_north, input_data=meuse,
new_data=meuse.grid)
plot(o1)
o2 <- autoKrige(formula=zinc ~ re_east + re_north + I(re_east*re_north) +
I(re_east^2) + I(re_north^2), input_data=meuse, new_data=meuse.grid)
plot(o2)
> If the solution is too complex, please just indicate it, without any
> condescension, and people won't insist.
That isn't how science works, nothing is "too complex", but with given
implementations, it does take time (like everything).
>
> I now clearly understood that is a quite hard problem so I have my answer.
> When I say "*it should not be for me to understand how*", I mean UK is
> certainly an interesting feature for GRASS, not just for me. So I am
> surprised to see It is like I am the first to ask.
>
You aren't, others have got similar answers - the specific stumbling block
is that variogram() and degree= in gstat don't play well together, for
good reasons.
Roger
> Best regards.
> Mathieu
>
> 2009/6/29 Roger Bivand <Roger.Bivand at nhh.no>
>
>> On Mon, 29 Jun 2009, mathieu grelier wrote:
>>
>> ok
>>> First I think everyone here has its own domain of competence, so please
>>> don't answer as if I was at school, I'm not sure all these classes are so
>>> obvious for all GRASS users. Please clearly display the R requirements to
>>> post in the list if they exist
>>> I just try to implement UK in a GRASS process for a larger mapping system
>>> architecture. I think in some way it could be feasible : it should not be
>>> for me to understand how.
>>>
>>
>> It is always the responsibility of the implementer to back-track to where
>> his or her understanding is wrong. None of the open source software that you
>> are using comes with any warranty whatsoever.
>>
>> You can do UK, but not trend surface, in a predictable way, because both
>> the data= and newdata= objects have to contain the same variables.
>>
>> Doing UK with trend surfaces is different, because the "proper" gstat
>> argument is degree=, not including the coordinates as variables. The reason
>> for this is numeric, products and powers of large numbers, like meters from
>> the Equator, overflow double precision very quickly. When degree= is given,
>> the code internally rescales the coordinates before taking powers and
>> products to avoid these numerical problems - in gstat source src/data.c, the
>> calc_polynomial() function rescales - for example:
>>
>> x' = (x-min(x))/(max(x)-min(x))
>>
>> If what you want to do is UK with a trend surface, you will have to work
>> though this from the bottom up. In fact, I'm surprised that autoKrige() uses
>> krige() in gstat, rather than first making a gstat() object then predicting
>> from it, making it harder to see what is going on. I suspect that some of
>> the difficulties here are actually caused by:
>>
>> library(gstat)
>> data(meuse)
>> coordinates(meuse) <- c("x", "y")
>> m1 <- gstat(id="m1", formula=zinc ~ 1, data=meuse, degree=1)
>> v1 <- variogram(m1)
>> #Error in variogram.gstat(m1) :
>> #degree != 0: residual variograms wrt coord trend using degree not
>> supported
>>
>> which prevents autofitVariogram() from doing what it needs to in the
>> rescaled trend surface case. This is I think supported in free-standing
>> gstat, but not in the R package, I believe.
>>
>> data(meuse.grid)
>> coordinates(meuse.grid) <- c("x", "y")
>> m1p <- predict(m1, meuse.grid)
>>
>> does work happily, but getting to UK with a trend surface is messy, as
>> you've found. It may be that I'm wrong, and that the gstat() formula
>> interface does recognise "x" and "y" as coordinates - it seems to, and that
>> this carries forward to prediction, but I can't see where this is in the
>> gstat package R or C code. The comment in src/s.c that degree > 0 will only
>> work "for a single variable" and "when no other predictors are given"
>> suggests that doing UK with a trend surface using a variogram to fit a
>> variogram model isn't so easy after all.
>>
>>
>>> Also always consider improving your doc and/or your design : it shouldn't
>>> be
>>> normal for a non-specialist to spend so much time to understand how to
>>> manipulate the objects to achieve UK with GRASS.
>>>
>>>
>> Welcome to reality, sorry, but this really is how things are, even in
>> proprietary software, you'll need to know, or reverse-engineer, how things
>> do work, in order to have adequate confidence in your results.
>>
>> Obviously I don't have time to study the R internals and I've just tried
>>> to
>>> quickly reproduce the working R code with meuse dataset, which use a
>>> dataframe as input data.
>>> By the way, the rest of the code is yours, you personally gave me at the
>>> time.
>>>
>>
>> That doesn't mean it works, nobody is infallible. There are consequences of
>> design choices often made for different purposes long ago. The degree=
>> problem is one, another is column names, neither have obvious solutions.
>> Having it work in automap probably means adding checks in autoKrige(), and
>> then altering variogram() and other functions in gstat() to do intial checks
>> and pass the degree= value through, so changes to two underlying packages.
>> If you can volunteer patches to them, this may get resolved faster.
>>
>> Alternatively, you could do the rescaling yourself first for both the data=
>> and the newdata= arguments to autoKrige(), having first added X and Y
>> coordinate variables to the objects. Check first whether you can get the
>> same trend surface predictions from my code above, from rescaled variables
>> in the same format, and finally through autoKrige() - though here I can't
>> see how to just do a trend surface without the variogram fitting
>> (model=NULL?).
>>
>> Roger
>>
>>
>> I understood that this is a colunm name problem and now I'm no longer sure
>>> what I want to do is possible without modification in gstat.
>>> But I thought maybe there was a workaround?
>>>
>>> Now I see it makes no sense but the initial R example seems to work with a
>>> data.frame as input data and it induced me in error.
>>> Best regards.
>>>
>>> Mathieu
>>>
>>>
>>> 2009/6/29 Roger Bivand <Roger.Bivand at nhh.no>
>>>
>>> On Mon, 29 Jun 2009, mathieu grelier wrote:
>>>>
>>>> ok,
>>>>
>>>>> From your answers, I've tried to play with the SpatialPointsDataFrame
>>>>> object
>>>>> imported from GRASS to fit the gstat krige function requirements. Basic
>>>>> idea
>>>>> seemed to work with the "data" attribute of the imported object, adding
>>>>> the
>>>>> missing coordinates columns :
>>>>>
>>>>> sitesR <- readVECT6("grass_sites")
>>>>>
>>>>>> class(sitesR)
>>>>>> [1] "SpatialPointsDataFrame"
>>>>>> d = attr(sitesR,"data")
>>>>>> d
>>>>>>
>>>>>>
>>>>> site value cat
>>>>> 1 Ail 81.80 1
>>>>> 2 All 38.50 2
>>>>> 3 Arg 103.50 3
>>>>> 4 Avb 52.50 4
>>>>> ...
>>>>>
>>>>> sitesR$x <- coordinates(sitesR)[,1]
>>>>>
>>>>>> sitesR$y <- coordinates(sitesR)[,2]
>>>>>> d = attr(sitesR,"data")
>>>>>>
>>>>>>
>>>>> **NO**, there are no attr() in an S4 class. Just do the obvious thing:
>>>>
>>>> d <- as(sitesR, "data.frame")
>>>>
>>>> **BUT** you shouldn't be doing this at all, the data= argument in gstat
>>>> can
>>>> take a SpatialPointsDataFrame.
>>>>
>>>> class(d)
>>>>
>>>>>
>>>>>> [1] "data.frame"
>>>>>
>>>>> d
>>>>>>
>>>>>> site value cat x y
>>>>> 1 Ail 81.80 1 825094.9 6796083
>>>>> 2 All 38.50 2 758780.4 6851508
>>>>> 3 Arg 103.50 3 818973.3 6796125
>>>>> 4 Avb 52.50 4 775136.0 6877271
>>>>> ...
>>>>>
>>>>> coordinates(d) = ~x+y
>>>>>
>>>>>>
>>>>>>
>>>>> (Check, it has lost its x and y columns again, hasn't it?)
>>>> str(d)
>>>>
>>>> attr(d, "proj4string") <-CRS(G$proj4)
>>>>
>>>>>
>>>>>>
>>>>> **NO**!!! These are S4 classes, no attr()!!!
>>>>
>>>>
>>>> G <- gmeta6()
>>>>>
>>>>>> grd <- gmeta2grd()
>>>>>>
>>>>>>
>>>>> **PLEASE** don't do this! You cannot set one slot without resetting the
>>>> others if you want to cover the GRASS region. If you want something
>>>> different, use g.region in GRASS and just gmeta2grd().
>>>>
>>>> slot(grd, "cellsize") <- c(1000, 1000)
>>>>
>>>>> data <- data.frame(list(k=rep(1,(floor(G$cols)*floor(G$rows)))))
>>>>>> mask_SG <- SpatialGridDataFrame(grd, data=data, CRS(G$proj4))
>>>>>>
>>>>>>
>>>>>
>>>>> What is this supposed to be? What we've told you already is that
>>>> mask_SG
>>>> needs to be a Spatial*DataFrame with "x" and "y" columns, why are you not
>>>> doing it?
>>>>
>>>>
>>>> predictors = "x+y"
>>>>>
>>>>>> kriging_result = autoKrige(as.formula(paste(column,"~",predictors)), d,
>>>>>>
>>>>>> mask_SG)
>>>>>
>>>>> Result was :
>>>>>
>>>>> Error in gstat.formula.predict(d$formula, newdata, na.action =
>>>>> na.action,
>>>>> :
>>>>>
>>>>> NROW(locs) != NROW(X): this should not occur
>>>>> Warning messages:
>>>>> 1: 'newdata' had 7900 rows but variable(s) found have 73 rows
>>>>> 2: 'newdata' had 7900 rows but variable(s) found have 73 rows
>>>>>
>>>>>
>>>>> No traceback()? Why? Isn't it obvious that it has found the x and y
>>>> values
>>>> in d (73 long)?
>>>>
>>>> 1)do you think this approach could be successful?
>>>>
>>>>> 2)If so, can you help me with this error?
>>>>>
>>>>>
>>>> Try doing everything in very small steps manually. Use gstat not automap,
>>>> to avoid the extra layer of uncertainty. Simply try to set up a gstat()
>>>> object for trend surface and predict from it:
>>>>
>>>> g_tr1a <- gstat("tr1a", <whatever column was> ~ 1, d, degree=1)
>>>> pr_tr1a <- predict(g_tr1a, mask_SG)
>>>>
>>>> Once that (untried) works, go further, not until. I still suspect that
>>>> this
>>>> is to do with the colnames of the coordinates, (x, y) in one case and
>>>> (coords1, coords2) or some such in the other.
>>>>
>>>> Please do study the gstat help pages and the examples, they will help,
>>>> but
>>>> you are jumping to far too many conclusions.
>>>>
>>>> Roger
>>>>
>>>>
>>>>
>>>> Mathieu
>>>>>
>>>>> 2009/6/26 Edzer Pebesma <edzer.pebesma at uni-muenster.de>
>>>>>
>>>>>
>>>>>
>>>>>> Roger Bivand wrote:
>>>>>>
>>>>>> On Thu, 25 Jun 2009, Edzer Pebesma wrote:
>>>>>>>
>>>>>>> Mathieu, that could very well be the case. Note that x+y do not
>>>>>>>
>>>>>>>> "generally" refer to x- and y-coordinates, but are the actual names
>>>>>>>> of
>>>>>>>> the coordinates (or something else). From your example I cannot find
>>>>>>>> out
>>>>>>>> whether the coordinates are named x and y. In addition, they should
>>>>>>>> have
>>>>>>>> identical names in sitesR and mask_SG.
>>>>>>>>
>>>>>>>>
>>>>>>> Yes, it is beginning to look like the names in mask_SG, isn't it? Do
>>>>>>> you have a view on whether degree= should be mandatory, or does the
>>>>>>> infrastructure recognise coordinate RHS variables and rescale them
>>>>>>> automatically (I doubt the latter)? Why do users use trend variables
>>>>>>> in this way - the range of values in the (X'X) matrix becomes enormous
>>>>>>> with the usual metre metric for coordinates?
>>>>>>>
>>>>>>> Another question we could ask (ourselves) is why in R we allow users
>>>>>> to
>>>>>> give their coordinates the names they like. I remember that for that
>>>>>> sole reason (need to rename) I added the function
>>>>>>
>>>>>> coordnames(x) = c("u", "v")
>>>>>>
>>>>>> to sp. In case they're called "lon" and "lat", they also don't change
>>>>>> name after projection. But maybe I talk too much with people worrying
>>>>>> about semantics, these days. ;-)
>>>>>> --
>>>>>> Edzer
>>>>>>
>>>>>>
>>>>>>> Roger
>>>>>>>
>>>>>>>
>>>>>>> Best regards,
>>>>>>>> --
>>>>>>>> Edzer
>>>>>>>>
>>>>>>>> mathieu grelier wrote:
>>>>>>>>
>>>>>>>> Dear all,
>>>>>>>> I am trying to achieve universal kriging with GRASS and R through
>>>>>>>> spgrass6
>>>>>>>> package.
>>>>>>>> Maybe you could help me to find the right R syntax.
>>>>>>>> As in the gstat doc, the polynomial I try to use for now is "x+y"
>>>>>>>>
>>>>>>>> Dear all,
>>>>>>>>
>>>>>>>>> I am trying to achieve universal kriging with GRASS and R through
>>>>>>>>> spgrass6
>>>>>>>>> package.
>>>>>>>>> Maybe you could help me to find the right R syntax.
>>>>>>>>> As in the gstat doc, the polynomial I try to use for now is "x+y"
>>>>>>>>>
>>>>>>>>> In R, with automap package, this works :
>>>>>>>>>
>>>>>>>>> data(meuse)
>>>>>>>>>
>>>>>>>>>> coordinates(meuse) = ~x+y
>>>>>>>>>> data(meuse.grid)
>>>>>>>>>> gridded(meuse.grid) = ~x+y
>>>>>>>>>> column = "zinc"
>>>>>>>>>>
>>>>>>>>>> ##Ordinary kriging :
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>> predictors = "1"
>>>>>>>>>
>>>>>>>>>> autoKrige(as.formula(paste(column,"~", predictors)), meuse,
>>>>>>>>>> meuse.grid)
>>>>>>>>>>
>>>>>>>>>> ##Universal kriging :
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>> predictors = "x+y"
>>>>>>>>>
>>>>>>>>>> autoKrige(as.formula(paste(column,"~", predictors)), meuse,
>>>>>>>>>> meuse.grid)
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> But using spgrass6, it becomes (some steps have been skipped):
>>>>>>>>>
>>>>>>>>> sitesR <- readVECT6("grass_sites")
>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> ...
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>> mask_SG <- SpatialGridDataFrame(grd, data=data, CRS(G$proj4))
>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> ...
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>> ##Ordinary kriging => WORKS:
>>>>>>>>>
>>>>>>>>> predictors = "1"
>>>>>>>>>
>>>>>>>>>> kriging_result =
>>>>>>>>>> autoKrige(as.formula(paste(column,"~",predictors)),
>>>>>>>>>>
>>>>>>>>>> sitesR, mask_SG)
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>> ##Universal kriging => ERROR:
>>>>>>>>>
>>>>>>>>> predictors = "x+y"
>>>>>>>>>
>>>>>>>>>> kriging_result =
>>>>>>>>>> autoKrige(as.formula(paste(column,"~",predictors)),
>>>>>>>>>>
>>>>>>>>>> sitesR, mask_SG)
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Error in eval(expr, envir, enclos) : "x" object not found
>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Why can't we use the x an y columns here?
>>>>>>>>> Maybe it is because the meuse and sites R data.frames don't have the
>>>>>>>>> same
>>>>>>>>> structure ?
>>>>>>>>>
>>>>>>>>> class(meuse)
>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> [1] "data.frame"
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>> attributes(meuse)
>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> $names
>>>>>>>>>>
>>>>>>>>> [1] "x" "y" "cadmium" "copper" "lead" "zinc"
>>>>>>>>> "elev"
>>>>>>>>>
>>>>>>>>> class(sitesR)
>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> [1] "SpatialPointsDataFrame"
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>> attributes(sitesR)
>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> $bbox
>>>>>>>>>>
>>>>>>>>> ...
>>>>>>>>> $proj4string
>>>>>>>>> ...
>>>>>>>>> $coords
>>>>>>>>> ...
>>>>>>>>> $data
>>>>>>>>> site value cat x y
>>>>>>>>> 1Ai 81.80 1 825094.9 6796083
>>>>>>>>> 2 Al 38.50 2 758780.4 6851508
>>>>>>>>> 3 Ar 103.50 3 818973.3 6796125
>>>>>>>>> 4 Av 52.50 4 775136.0 6877271
>>>>>>>>> ...
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> ------------------------------------------------------------------------
>>>>>>>>>
>>>>>>>>
>>>>>>
>>>>>>>
>>>>>>>>> _______________________________________________
>>>>>>>>> grass-stats mailing list
>>>>>>>>> grass-stats at lists.osgeo.org
>>>>>>>>> http://lists.osgeo.org/mailman/listinfo/grass-stats
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>> --
>>>>>> Edzer Pebesma
>>>>>> Institute for Geoinformatics (ifgi), University of Münster
>>>>>> Weseler Straße 253, 48151 Münster, Germany. Phone: +49 251
>>>>>> 8333081, Fax: +49 251 8339763 http://ifgi.uni-muenster.de/
>>>>>> http://www.springer.com/978-0-387-78170-9 e.pebesma at wwu.de
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>> --
>>>> Roger Bivand
>>>> Economic Geography Section, Department of Economics, Norwegian School of
>>>> Economics and Business Administration, Helleveien 30, N-5045 Bergen,
>>>> Norway. voice: +47 55 95 93 55; fax +47 55 95 95 43
>>>> e-mail: Roger.Bivand at nhh.no
>>>>
>>>>
>>>
>> --
>> Roger Bivand
>> Economic Geography Section, Department of Economics, Norwegian School of
>> Economics and Business Administration, Helleveien 30, N-5045 Bergen,
>> Norway. voice: +47 55 95 93 55; fax +47 55 95 95 43
>> e-mail: Roger.Bivand at nhh.no
>>
>
--
Roger Bivand
Economic Geography Section, Department of Economics, Norwegian School of
Economics and Business Administration, Helleveien 30, N-5045 Bergen,
Norway. voice: +47 55 95 93 55; fax +47 55 95 95 43
e-mail: Roger.Bivand at nhh.no
More information about the grass-stats
mailing list