[GRASS-stats] Spatial autocorrelation of multi-spectral, uni- and multi-temporal data sets

Thu Jan 6 09:56:36 EST 2011

On Sunday 26 of December 2010 19:00:32 Nikos Alexandris wrote:
> Greets to the statists,
> 
> I want to "describe" my multispectral (Landsat5_TM) composite datasets with
> respect to their between vs. within heterogeneity.

(Replying to myself and for the potential interest of anybody reading the 
list... )

In the end I went for the MRPP (Multi-Response Permutation Procedure) test(s) 
implemented in the R package "vegan"[1][2]. I sampled the major land cover 
classes (urban areas, vegetation, bare ground, water bodies, etc.), put the 
data in data.frames and ran the tests.

The data.frames look like:

> str ( samples_postfire_modis )
'data.frame':   1040 obs. of  6 variables:
 $ Band 1: int  1354 1458 1458 1458 1550 1145 1428 1573 1573 1657 ...
 $ Band 2: int  3088 2971 2971 2971 2902 2990 2942 2824 2824 2917 ...
 $ Band 5: int  3533 3506 3506 3506 3323 3535 3337 3239 3239 3552 ...
 $ Band 6: int  2778 2803 2803 2803 2974 2646 2674 2883 2883 3071 ...
 $ Band 7: int  2019 2146 2146 2146 2042 1719 2045 2114 2114 2373 ...
 $ Class : Factor w/ 5 levels "Urban","Vegetation",..: 1 1 1 1 1 1 1 1 1 1 ...
> str ( samples_bitemporal_modis )
'data.frame':   1040 obs. of  7 variables:
 $ Prefire B2 : int  3377 3425 3304 3179 3247 3247 3235 3043 3043 3197 ...
 $ Prefire B6 : int  2726 2683 2737 2991 2934 2934 2928 2984 2984 3199 ...
 $ Prefire B7 : int  1864 1932 2005 2185 2068 2068 2223 2331 2331 2314 ...
 $ Postfire B2: int  3088 2971 2971 2971 2902 2990 2942 2824 2824 2917 ...
 $ Postfire B6: int  2778 2803 2803 2803 2974 2646 2674 2883 2883 3071 ...
 $ Postfire B7: int  2019 2146 2146 2146 2042 1719 2045 2114 2114 2373 ...
 $ Class      : Factor w/ 5 levels "Urban","Vegetation",..: 1 1 1 1 1 1 1 1 1 1 ...
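
For anyone who wants to try the same kind of run without the real samples,
here is a minimal sketch. It assumes the vegan package is installed. The data
are synthetic, and the names `toy_samples`, `centres`, and the column names
B1..B7 are made up for illustration; only the layout (five band columns plus
a Class factor) follows the data.frames shown above.

```r
library(vegan)  # assumption: the vegan package is installed

set.seed(42)  # make the synthetic data reproducible

classes <- c("Urban", "Vegetation", "Bare ground", "Burned", "Water")
n_per_class <- 20

# Made-up class centres, one row per class, one column per band
centres <- matrix(c(1400, 2900, 3400, 2800, 2100,   # Urban
                     600, 3500, 3000, 1900,  900,   # Vegetation
                    2200, 3200, 4200, 3900, 3300,   # Bare ground
                     900, 1800, 2300, 2600, 2400,   # Burned
                     300,  200,  150,  100,   80),  # Water
                  ncol = 5, byrow = TRUE)

# Repeat each centre n_per_class times and add Gaussian noise
class_id <- rep(seq_along(classes), each = n_per_class)
bands <- centres[class_id, ] +
  matrix(rnorm(length(class_id) * 5, sd = 150), ncol = 5)

toy_samples <- data.frame(round(bands))
names(toy_samples) <- c("B1", "B2", "B5", "B6", "B7")
toy_samples$Class <- factor(classes[class_id], levels = classes)

# MRPP on the five band columns, grouped by land cover class
toy_mrpp <- mrpp(toy_samples[, 1:5], grouping = toy_samples[["Class"]],
                 permutations = 999, distance = "euclidean")
print(toy_mrpp)
```

The fields `toy_mrpp$A`, `toy_mrpp$delta` and `toy_mrpp$E.delta` hold the
numbers that print() reports.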

Some of the results look like this:

--%<---
Call:
mrpp(dat = samples_postfire_modis.smallsample.300[, 1:5], grouping = 
samples_postfire_modis.smallsample.300[["Class"]]) 

Dissimilarity index: euclidean 
Weights for groups:  n 

Class means and counts:

      Urban Vegetation Bare ground Burned Water
delta 1241  1029       1550        1228   855.2
n     97    81         63          53     6

Chance corrected within-group agreement A: 0.3956 
Based on observed delta 1239 and expected delta 2050 

Significance of delta: 0.001 
Based on  999  permutations
-->%---
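
As a sanity check on how these numbers fit together, the sketch below
recomputes the observed delta and A from the printed values, using the usual
MRPP definitions (observed delta as the class deltas weighted by class size,
and A = 1 - observed delta / expected delta). The variable names are only
illustrative, and because the inputs are the rounded values from the printout,
the result agrees with the reported A only up to rounding.

```r
# Values copied from the mrpp() printout above
class_delta <- c(Urban = 1241, Vegetation = 1029, "Bare ground" = 1550,
                 Burned = 1228, Water = 855.2)
class_n     <- c(Urban = 97, Vegetation = 81, "Bare ground" = 63,
                 Burned = 53, Water = 6)
expected_delta <- 2050

# Observed delta: class deltas weighted by class size ("Weights for groups: n")
observed_delta <- sum(class_n * class_delta) / sum(class_n)

# Chance-corrected within-group agreement
A <- 1 - observed_delta / expected_delta

round(observed_delta)  # 1239, as in the printout
round(A, 3)            # 0.396, close to the printed 0.3956
```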

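and, from meandist() on a vegdist() dissimilarity matrix (Bray-Curtis by 
default, unlike the Euclidean distance reported by mrpp() above):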

--%<---
> samples_postfire_modis.smallsample.300.vegdist <-
+   vegdist ( samples_postfire_modis.smallsample.300[,1:5] )

> samples_postfire_modis.smallsample.300.md <-
+   meandist ( samples_postfire_modis.smallsample.300.vegdist ,
+              samples_postfire_modis.smallsample.300[["Class"]] )

> summary(samples_postfire_modis.smallsample.300.md)

Mean distances:
                 Average
within groups  0.0950253
between groups 0.2023877
overall        0.1754765

Summary statistics:
                        Statistic
MRPP A weights n        0.4224480
MRPP A weights n-1      0.4263591
MRPP A weights n(n-1)   0.4584728
Classification strength 0.1010409
-->%---
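
To make the "within groups" vs "between groups" idea concrete, here is a
base-R sketch on four toy points in two groups. It is a simplification and
not a reimplementation of meandist(): it uses Euclidean distance and a plain
unweighted average over pairs, so it will not reproduce vegan's weighting
schemes. All names (`x`, `g`, `within`, `between`) are hypothetical.

```r
# Four toy points: group "a" sits near x = 0, group "b" near x = 10
x <- rbind(c(0, 0), c(0, 2), c(10, 0), c(10, 2))
g <- c("a", "a", "b", "b")

d <- as.matrix(dist(x))    # full Euclidean distance matrix
same <- outer(g, g, "==")  # TRUE where both points share a group
up <- upper.tri(d)         # count each pair only once

within  <- mean(d[up & same])   # average distance inside groups
between <- mean(d[up & !same])  # average distance across groups

within             # 2
between - within   # positive when the groups are well separated
```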

Running the above tests on samples with more than 3000 observations is a heavy 
load for a home machine (a Core2 Duo @ 2.53GHz with 6GB RAM here). In fact, 
the run for 18K observations times 6 variables took 2 1/2 days, and as long 
again for another data.frame of the same size (the number of pairwise 
distances grows quadratically with the number of observations...).
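
A rough back-of-the-envelope sketch of why the size matters: n observations
give n(n-1)/2 pairwise distances. The memory figure assumes the distances are
stored as 8-byte doubles, as in a standard R dist object; run time depends on
the implementation and is not estimated here. The labels small/medium/large
are only illustrative.

```r
# Number of pairwise distances, and memory for storing them as doubles
n_obs <- c(small = 300, medium = 3000, large = 18000)

n_pairs  <- n_obs * (n_obs - 1) / 2   # lower triangle of the distance matrix
size_gib <- n_pairs * 8 / 2^30        # 8 bytes per double

data.frame(n_obs, n_pairs, size_gib = round(size_gib, 3))
```

Going from 300 to 18000 observations multiplies the number of distances by
roughly 3600, which is why a subsample is so much cheaper to test.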

Wish I had access to some OSGeo super-computer for 30 mins to get this job 
done.

Greets, Nikos