[GRASSLIST:5252] Re: LSD program

Roger Bivand Roger.Bivand at nhh.no
Tue Jan 4 15:21:39 EST 2005


On Tue, 4 Jan 2005, Michael Barton wrote:

> Thanks Craig.
> 
> I realize that R has the potential to be a powerful statistical companion to
> GRASS. The reasons I didn't use R for this LDA routine are:
> 
> 1. I don't know R command syntax. This seems to be a fairly steep learning
> curve, though I am sure the results are worth the effort. While I keep
> thinking it would be useful to learn it, I have so much on my plate that I
> simply haven't had time. I should learn C too. It would encourage me to
> tackle R if it had a menu-driven interface to augment the command line, as I
> am familiar with a variety of other stat packages (SPSS, Systat, Statistica,
> and JMP--the latter being my package of choice in a large part because of
> the UI). So far, I haven't found one that works on my Mac (a menu that
> primarily lets me cut and paste commands, and open help files is nice but
> not sufficient).

This is fair - although there are some GUIs being done (also using tcltk), 
this isn't a priority for developers (or users) - see the R GUI link on 
the side bar of the R home page http://www.r-project.org. 

> 
> 2. Most GRASS users don't know R command syntax. I suspect that many of them
> will echo my comments in #1. The LDA routine I did is pretty simple to use.
> Hopefully, this will make it more accessible to GRASS users.
> 

Maybe - the upside of writing a small R batch program is that the R 
internals are heavily used, and so most likely less exposed to coding 
errors than less-used code. The downside is writing the code, though, and 
for regular work done very often, an optimized GRASS module may execute 
faster.

> 3. I'm not clear about this, but it sounds from recent comments to the GRASS
> developer's list like R will not currently read GRASS 5.7/6 vector data. The
> commands you list below, suggest that you are working with GRASS 5.4 (or
> earlier) vector data and an associated PostgreSQL database. I hope I am
> wrong about this.

You are neither right nor wrong as of now. Support for 5.7/6 is certainly
going to be there, but so far there have been only two specific requests for
5.7 vector support (in fact for points), and these can be handled through
reading and writing files.  There is a lot of ongoing work on designing
robust spatial data classes for R, and they will make exchange through OGR
at least, possibly also with vector topology, much easier to code. Classes
first (we hope this month), then the interface. 5.0.* and 5.4 raster and
sites continue to be supported, though the new compression style for integer
rasters is not yet supported.

> 
> Finally, perhaps you could offer some advice given your knowledge of R. LDA
> is not a clustering technique, but a way to quantitatively measure to
> what extent geospatial data are clumped or dispersed (along the lines of
> nearest neighbor analysis but without some of the drawbacks of NN). Is there
> such a routine in R? And are there modules that measure in other ways the
> degree to which geospatial data are clumped or dispersed (evenly or
> randomly)? 

I haven't been able to get hold of the references you posted for LDA. 
There are several packages including functions for testing whether a point 
pattern is clustered, random, or dispersed. These cover the NN methods 
(Ghat, Fhat), but also 
Khat, which seems like your method. Khat compares the relative intensity 
of points in distance bands around each point (boundary adjusted) - I 
would guess Johnson's method is based on the same use of the data. Brian 
Ripley's "spatial" package permits rectangular window boundary adjustment, 
splancs permits boundary adjustment in an arbitrary polygon, while 
spatstat allows more options. One nice feature is that they all allow the 
testing of results by simulation from completely spatially random and 
other spatial processes.
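
To make the idea behind Khat concrete, here is a minimal sketch in plain
Python (not R, and not the code used by "spatial", splancs, or spatstat). It
assumes a rectangular study area, counts ordered pairs of points within each
distance s, and scales by area / (n * (n - 1)). That scaling and the names
k_hat and csr_envelope are assumptions made for illustration. No boundary
adjustment is applied, so the values are biased downwards near the edges,
which is exactly what the packages above correct for.

```python
import math
import random


def k_hat(points, area, distances):
    """Naive K estimate at each distance, with no boundary adjustment."""
    n = len(points)
    counts = [0] * len(distances)
    for i, (xi, yi) in enumerate(points):
        for j, (xj, yj) in enumerate(points):
            if i == j:
                continue  # a point is not its own neighbour
            d = math.hypot(xi - xj, yi - yj)
            for k, s in enumerate(distances):
                if d <= s:
                    counts[k] += 1
    # Assumed scaling: area times the fraction of ordered pairs within s.
    return [area * c / (n * (n - 1)) for c in counts]


def csr_envelope(n, width, height, distances, nsim=99, seed=0):
    """Min/max of k_hat over nsim uniform random patterns in a rectangle."""
    rng = random.Random(seed)
    sims = []
    for _ in range(nsim):
        pts = [(rng.uniform(0, width), rng.uniform(0, height))
               for _ in range(n)]
        sims.append(k_hat(pts, width * height, distances))
    lower = [min(col) for col in zip(*sims)]
    upper = [max(col) for col in zip(*sims)]
    return lower, upper


# Four points on the corners of a unit square, as a tiny hand-checkable case.
square = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
print(k_hat(square, area=1.0, distances=[0.5, 1.0, 1.5]))
```

Observed values above the upper envelope would suggest clustering at that
distance, and values below the lower envelope would suggest dispersion. For
real work the boundary-adjusted estimates in the packages are preferable.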

The web references to LDA I have seen make it look as though the Khat 
components (for each point) may be of interest - these are accessible in 
the khat() function in splancs. I've seen mention of comparison of i and j 
artefacts too - this looks like a marked point process, aka k12hat. If you 
can feed me a specification and a test data set, I'll see if I can figure 
out how this could be done, if you like.
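
As a rough illustration of the two ideas above (per-point components and
the comparison of two types of points), here is a minimal sketch in plain
Python. It is not the splancs khat() or k12hat() code: the names
per_point_counts and k12_hat, the area / (n1 * n2) scaling, and the lack of
any boundary adjustment are all assumptions made to keep the example short.

```python
import math


def per_point_counts(points, s):
    """For each point, count the other points lying within distance s."""
    counts = []
    for i, (xi, yi) in enumerate(points):
        c = 0
        for j, (xj, yj) in enumerate(points):
            if i != j and math.hypot(xi - xj, yi - yj) <= s:
                c += 1
        counts.append(c)
    return counts


def k12_hat(points1, points2, area, distances):
    """Naive bivariate K: type-2 points within s of each type-1 point."""
    n1, n2 = len(points1), len(points2)
    result = []
    for s in distances:
        count = sum(
            1
            for (x1, y1) in points1
            for (x2, y2) in points2
            if math.hypot(x1 - x2, y1 - y2) <= s
        )
        # Assumed scaling, with no boundary adjustment.
        result.append(area * count / (n1 * n2))
    return result


# Hypothetical toy data: one type-1 point and two type-2 points on a line.
type1 = [(0.0, 0.0)]
type2 = [(1.0, 0.0), (3.0, 0.0)]
print(k12_hat(type1, type2, area=10.0, distances=[0.5, 1.0, 3.0]))
print(per_point_counts([(0.0, 0.0), (1.0, 0.0), (5.0, 0.0)], 1.5))
```

The per-point counts show which individual points sit in dense
neighbourhoods, while the bivariate version asks whether points of one type
tend to lie near points of the other type.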

Best wishes,

Roger

> 
> Thanks for the suggestion.
> 
> Michael Barton
> 
> On 1/4/05 10:12 AM, "Craig Aumann" <caumann at ualberta.ca> wrote:
> 
> > In glancing at your script, I wondered why you didn't just import the
> > stuff into a statistical package like R and then use all the clustering
> > routines it has. See www.r-project.org
> > 
> > 
> > 
> > 
> > ## Key commands in R are:
> > 
> > ## Load the packages to access the database, in particular the PgSQL
> > ## database
> > require(Rdbi)
> > require(RdbiPgSQL)
> > 
> > ## Load some clustering routines
> > 
> > require(cluster)
> > require(mclust)
> > 
> > 
> > ## Connect to the database "jogis" and read in the points
> > ## dataset "wells_att".
> > 
> > 
> > conn <- dbConnect(PgSQL(), dbname="jogis", user="caumann")
> > wells <- dbReadTable(conn, "wells_att")
> > dbDisconnect(conn)
> > 
> > ## Now you can apply any clustering or kernel smoothing technique you
> > ## want to the dataset.
> > 
> > If this is of any help, let me know and I can provide more details.
> > Cheers!
> > Craig
> > 
> 
> ______________________________
> Michael Barton, Professor of Anthropology
> School of Human Evolution and Social Change
> Arizona State University
> Tempe, AZ  85287-2402
> USA
> 
> voice: 480-965-6262; fax: 480-965-7671
> www: http://www.public.asu.edu/~cmbarton
> 
> 

-- 
Roger Bivand
Economic Geography Section, Department of Economics, Norwegian School of
Economics and Business Administration, Breiviksveien 40, N-5045 Bergen,
Norway. voice: +47 55 95 93 55; fax +47 55 95 93 93
e-mail: Roger.Bivand at nhh.no
