[GRASS-user] Re: Natural breaks classification

Roger Bivand Roger.Bivand at nhh.no
Fri May 18 05:30:24 EDT 2012


Since all Spatial*DataFrame objects behave link data.frame objects, you can
say:

DTM<- readRAST6("DTM at mapset", ignore.stderr=TRUE)
DTM_class <- classIntervals(DTM.df$varOfInterest, n=5, style="fisher")

However, if nrow(DTM) is large, you may find that the "natural breaks"
classifications take a lot of time, because the number of possible
alternative classifications to compare is very large. If you can reasonably
represent the distribution of your variable of interest by a sample, do
that, but add back the maximum and minimum values. Of course, in this case
you may get different class intervals for each sample.

In spearfish, I see:

> library(spgrass6)
> spear <- readRAST6(c("geology", "elevation.dem"), cat=c(TRUE, FALSE),
+          useGDAL=TRUE)
> library(classInt)
> system.time(CI0 <- classIntervals(spear$elevation.dem, n=5,
> style="fisher"))
    user   system  elapsed 
1151.136    0.053 1153.775 
> nrow(spear)
[1] 294978
> CI0
style: fisher
  [1066,1221.5) [1221.5,1338.5) [1338.5,1472.5) [1472.5,1608.5)  
[1608.5,1840] 
          93972           60419           59562           45228          
33136 
> set.seed(1)

> system.time(CI1 <- classIntervals(c(sample(na.omit(spear$elevation.dem),
> 500), 
+ range(na.omit(spear$elevation.dem))), n=5, style="fisher"))
   user  system elapsed 
  0.017   0.002   0.019 
> CI1
style: fisher
  [1066,1215.5) [1215.5,1322.5)   [1322.5,1450)     [1450,1591)    
[1591,1840] 
            159             109              86              78             
70 
> table(findInterval(na.omit(spear$elevation.dem), CI0$brks,
> all.inside=TRUE))

    1     2     3     4     5 
88747 59755 57077 47726 39012 

Or for more precision:

> set.seed(1)
> var_na <- na.omit(spear$elevation.dem)
> mm_var <- range(var_na)
> out <- vector(mode="list", length=100)
> system.time(for (i in seq(along=out)) out[[i]] <-
> classIntervals(c(sample(var_na, 500), mm_var), n=5, style="fisher"))
   user  system elapsed 
  0.312   0.000   0.316 
> mbrks <- sapply(out, "[[", "brks")
> str(mbrks)
 num [1:6, 1:100] 1066 1216 1322 1450 1591 ...
> apply(mbrks, 1, mean)
[1] 1066.000 1218.920 1337.745 1473.920 1612.390 1840.000

which gives a result very like the one using the complete data set but
taking over three orders of magnitude longer, using 100 samples of 500
raster cells each.

Hope this helps,

Roger



Salvatore Mellino wrote
> 
> Hi,
> 
> thanks a lot. Now the process started, but the operation DTM_class <-
> classIntervals(DTM.df$varOfInterest, n=5, style="fisher" takes too long;
> it was started by hours but does not end. Is it normal?
> 
> Salvatore
> 
> 
> Il giorno 16/mag/2012, alle ore 22:04, Luigi Ponti ha scritto:
> 
>> On 16/05/2012 21:10, Luigi Ponti wrote:
>>> On 16/05/2012 20:55, Salvatore Mellino wrote:
>>>> I'm trying but I receive an error.
>>>> 
>>>> 1. In R I started the libraries spgarass6 and classInt
>>>> 2. I imported the DTM in R using           DTM<-
>>>> readRAST6("DTM at mapset", ignore.stderr=TRUE)
>>>> 3. I used the command                DTM_class<- classIntervals(DTM,
>>>> n=5, style="fisher")             and I received the error
>>>> 
>>>> Error in classIntervals(DTM, n = 5, style = "fisher") :
>>>>   var is not numeric
>>> 
>>> I do not recall exactly, but it may be that classIntervals does not have
>>> a method for spatial data frames (i.e. the output of readRAST6). One
>>> thing I would try to get the breaks anyway, is a regular R data.frame,
>>> for example:
>>> 
>>> DTM.df <- as.data.frame(DTM)
>>> DTM_class <- classIntervals(DTM.df, n=5, style="fisher")
>>> 
>> 
>> I am replying to myself as I may have omitted part of the answer -- my
>> apologies. Start the same way:
>> 
>>    DTM.df <- as.data.frame(DTM)
>> 
>> then get variable names:
>> 
>>    names(deltaYieldVector.df)
>> 
>> then use the variable of interest (altitude, I guess):
>> 
>>    DTM_class <- classIntervals(DTM.df$varOfInterest, n=5, style="fisher")
>> 
>> This should work; we were feeding a data.frame as input to
>> classIntervals, whereas a variable is needed (e.g. one of the variables
>> in the data.frame).
>> 
>> Let me know how it goes.
>> 
>> Kind regards,
>> 
>> Luigi
> 
> 
> _______________________________________________
> grass-user mailing list
> grass-user at .osgeo
> http://lists.osgeo.org/mailman/listinfo/grass-user
> 


--
View this message in context: http://osgeo-org.1560.n6.nabble.com/Natural-breaks-classification-tp4975148p4975552.html
Sent from the Grass - Users mailing list archive at Nabble.com.


More information about the grass-user mailing list