[GRASS-stats] clustering

Roger Bivand Roger.Bivand at nhh.no
Thu Mar 6 02:47:41 EST 2008


On Wed, 5 Mar 2008, Jarek Jasiewicz wrote:

> Hi
>
> A question about clustering: I am trying to use R to cluster a large raster
> data set (at least 40,000 px, commonly 2,500,000), but from the cluster
> package only `clara` works (it works fine and is fast, about 2 seconds on the
> largest data set); no other method (daisy first of all) works. The message I
> receive is that the `vector is too long`, which probably means too large...
>
> The question is simple: is there any alternative to the cluster package for R
> (I am thinking of fuzzy classification)? R has quite a few clustering
> packages, but before I start testing them, should I rather look for another
> tool for data sets of this size?
>

Have you looked at the Cluster Task View?

http://cran.r-project.org/web/views/Cluster.html

Admittedly, many of the methods in these packages are not written to scale 
up, but to illustrate the implementation of methods in principle. If you 
try to start a list of methods that don't fail for 40K, and contact the 
maintainer of the task view (Bettina Gruen), I expect that she and 
Friedrich Leisch would consider including a paragraph on the suitability 
of the methods they list for larger data sets. I would also look to see 
whether the Bioconductor statistics task view:

http://www.bioconductor.org/packages/release/Statistics.html

which has a clustering subview, is relevant - gene array data are also 
typically very large. In general, I think that "cluster" as a word is 
often used with "machine learning" and "pattern recognition", so the 
references and links may be rather disorganised.
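
As a rough illustration of why the size matters: methods that start from a 
full dissimilarity object (as daisy produces) need one value for every pair 
of observations, while clara works on subsamples and so avoids this. The 
sketch below is only back-of-envelope arithmetic, written in Python because 
it is language-neutral; the helper names and the assumption of 8-byte 
doubles stored for the lower triangle only are mine, not taken from any 
package.

```python
# Back-of-envelope sketch: storage for all pairwise dissimilarities.
# Assumptions (illustrative only): 8-byte doubles, lower triangle only.

BYTES_PER_DOUBLE = 8


def pairwise_count(n):
    """Number of distinct pairs among n observations: n * (n - 1) / 2."""
    return n * (n - 1) // 2


def pairwise_gib(n):
    """Approximate size in GiB of those pairs stored as doubles."""
    return pairwise_count(n) * BYTES_PER_DOUBLE / 2**30


if __name__ == "__main__":
    # The two raster sizes mentioned in the question.
    for n in (40_000, 2_500_000):
        print(f"{n:>9} px: {pairwise_count(n):,} pairs, "
              f"about {pairwise_gib(n):,.1f} GiB")
```

Under these assumptions the 40,000 px case already needs roughly 6 GiB for 
the dissimilarities alone, and the 2,500,000 px case several thousand times 
more, so any method that needs all pairs at once is unlikely to scale to 
the larger rasters.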

Hope this helps,

Roger

>
> Jarek
>

-- 
Roger Bivand
Economic Geography Section, Department of Economics, Norwegian School of
Economics and Business Administration, Helleveien 30, N-5045 Bergen,
Norway. voice: +47 55 95 93 55; fax +47 55 95 95 43
e-mail: Roger.Bivand at nhh.no


