[GRASS-stats] Testing i.pca (continued...)

Nikos Alexandris nikos.alexandris at felis.uni-freiburg.de
Wed Apr 29 13:32:11 EDT 2009


Hi all! Another i.pca -vs- prcomp() round :-)

# 1. There *must* be something in i.pca's code that prevents data
centering when using MODIS data???

  I tested... tested... & tested... (almost all tests below) and I just
can't understand WHY i.pca does NOT perform data centering when using
MODIS data (raw, rescaled at 0~255 or 0~254). All I get is:

  i.pca == prcomp(x, center=FALSE, scale=FALSE)


# 2. Note: there is _NO_ problem when using the spot or landsat5 data
from the spearfish location!

# 3. Can some programmer have a look at i.pca to check whether there is
a switch to (de-)activate data centering?

I don't want to update the wiki-page before I understand what is going
on. Any help?

Kindest regards, Nikos
=======================================================================

Quickly:

* using MODIS data I get:
-------------------------

 ** i.pca == prcomp(x, center=FALSE, scale=FALSE)

 ** if I center the data manually (as suggested by Agus), i.e.:

r.mapcalc "mod_b2_centered = mod_b2 - mean(mod_b2)"

  then I get:

i.pca(on centered data) == prcomp(x, center=TRUE, scale=FALSE)


 ** it also works if I further scale manually the data like:

r.mapcalc "mod_b2_centered_scaled = mod_b2_centered/stdev"



* using spot or landsat5 data I get:
------------------------------------

 ** i.pca == prcomp(x, center=TRUE, scale=FALSE)



##########################################
##########################################

Principal Component Analysis:
comparison between grass' i.pca and R's prcomp()

##########################################
##########################################

Testing... (Wednesday, April 29, 11:58 AM)

# 1
i.pca on

# raw postfire modis bands 2, 6 and 7
i.pca input=mod07_b2,mod07_b6,mod07_b7 out=pca_mod_b267

# eigenvalues + eigenvectors
Eigen values, (vectors), and [percent importance]:
PC1 6307563.04 (-0.6353,-0.6485,-0.4192) [98.71%]
PC2   78023.63 (-0.7124, 0.2828, 0.6422) [ 1.22%]
PC3    4504.60 (-0.2979, 0.7067,-0.6417) [ 0.07%]


# 2
prcomp() on

# raw postfire modis bands 2, 6 and 7

# center=FALSE, scale=FALSE
prcomp(mod07_b267, center=FALSE, scale=FALSE) ## This is the same as
with i.pca on raw(=untouched) data

Standard deviations:
[1] 4279.0259  475.9080  114.2998

Rotation:
                PC1        PC2        PC3
mod07_b2 -0.6353496  0.7124426 -0.2979203
mod07_b6 -0.6485310 -0.2828396  0.7066890
mod07_b7 -0.4192117 -0.6422051 -0.6417431

~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ 

# 3
i.pca on

# centered postfire modis bands 2, 6 and 7
i.pca input=mod07_b2_centered,mod07_b6_centered,mod07_b7_centered
out=pca_mod07_b267_centered

# output
Eigen values, (vectors), and [percent importance]:
PC1 270343.07 (-0.4403,-0.7222,-0.5335) [79.11%]
PC2  67140.50 (-0.8275, 0.0957, 0.5533) [19.65%]
PC3   4258.14 ( 0.3485,-0.6851, 0.6397) [ 1.25%]


# 4
prcomp() on

# raw postfire modis bands 2, 6 and 7 with center=TRUE, scale=FALSE
prcomp(mod07_b267, center=TRUE, scale=FALSE) ## This is the same as
using prcomp() on centered data

Standard deviations:
[1] 882.1814 438.7420 108.9791

Rotation:
               PC1         PC2        PC3
mod07_b2 0.4372107  0.83099407 -0.3439413
mod07_b6 0.7210155 -0.09527873  0.6863371
mod07_b7 0.5375718 -0.54806096 -0.6408165


# 5
princomp() on

# raw postfire modis bands 2, 6 and 7
print(loadings(princomp(mod07_b267)), digits=6, cutoff=0.001)

Loadings:
         Comp.1    Comp.2    Comp.3   
mod07_b2 -0.437211  0.830994  0.343941
mod07_b6 -0.721016 -0.095279 -0.686337
mod07_b7 -0.537572 -0.548061  0.640817

                 Comp.1   Comp.2   Comp.3
SS loadings    1.000000 1.000000 1.000000
Proportion Var 0.333333 0.333333 0.333333
Cumulative Var 0.333333 0.666667 1.000000

~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ 

# 6
prcomp() on

# centered postfire modis bands 2, 6 and 7 with center=FALSE,
scale=FALSE
prcomp(mod07_b267_centered, center=FALSE, scale=FALSE) ## In this
example the data are already centered. This is the same as using
prcomp(x, center=TRUE, scale=FALSE) on raw data or i.pca on centered
data

Standard deviations:
[1] 882.1844 438.7430 108.9810

Rotation:
                        PC1         PC2        PC3
mod07_b2_centered 0.4372131  0.83099137 -0.3439448
mod07_b6_centered 0.7210164 -0.09527912  0.6863360
mod07_b7_centered 0.5375686 -0.54806499 -0.6408157


# 7
prcomp() on

# centered postfire modis bands 2, 6 and 7 with center=TRUE,
scale=FALSE 
prcomp(mod07_b267_centered, center=TRUE, scale=FALSE) ## The parameter
"center=TRUE" has no effect in this example since the data used are
already centered 

Standard deviations:
[1] 882.1814 438.7420 108.9792

Rotation:
                        PC1         PC2        PC3
mod07_b2_centered 0.4372107  0.83099407 -0.3439413
mod07_b6_centered 0.7210155 -0.09527872  0.6863371
mod07_b7_centered 0.5375718 -0.54806097 -0.6408165

~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ 


### ### ###

Comments

Using already centered data does not make sense to use the parameter
"center=". Thus the only difference will be between _scaled_ and
_unscaled_ data prior to PCA.

### ### ###


~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ 

# 8
i.pca on

# scaled postfire modis bands 2, 6 and 7
i.pca input=mod07_b2_scaled,mod07_b6_scaled,mod07_b7_scaled
out=pca_mod07_b267_scaled

# eigenvalues + eigenvectors
Eigen values, (vectors), and [percent importance]:
PC1 19.09 (-0.6824,-0.5773,-0.4484) [98.57%]
PC2  0.26 ( 0.6786,-0.2724,-0.6821) [ 1.36%]
PC3  0.01 ( 0.2716,-0.7698, 0.5777) [ 0.07%]


# 9
prcomp() on

# scaled postfire modis bands 2, 6 and 7 with center=FALSE, scale=TRUE
prcomp(mod07_b267, center=FALSE, scale=TRUE)

Standard deviations:
[1] 1.71888790 0.20789109 0.04696487

Rotation:
                PC1         PC2        PC3
mod07_b2 -0.5747633  0.74019177 -0.3489459
mod07_b6 -0.5812894 -0.06916484  0.8107520
mod07_b7 -0.5759772 -0.66882910 -0.4700191

# get eigenvalues
summary(prcomp(mod07_b267, center=FALSE, scale=TRUE))
Importance of components:
                         PC1    PC2     PC3
...
Proportion of Variance 0.985 0.0144 0.00074
...


# 10
prcomp() on
# scaled postfire modis bands 2, 6 and 7
prcomp(mod07_b267_scaled, center=FALSE) ## This is the same as with
i.pca on manually scaled data

Standard deviations:
[1] 7.4449990 0.8749921 0.1925904

Rotation:
                       PC1        PC2        PC3
mod07_b2_scaled -0.6824314  0.6786305 -0.2715660
mod07_b6_scaled -0.5772537 -0.2724487  0.7697726
mod07_b7_scaled -0.4484034 -0.6820795 -0.5776695

~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ 

# 11
i.pca on

# re-scaled to 0 ~ 255 postfire modis bands 2, 6 and 7

###
# first r.rescale to=0,255
# then r.mapcalc to ensure that 0,255 values
###

i.pca
input=mod07_b2_rescaled255,mod07_b6_rescaled255,mod07_b7_rescaled255
out=pca_mod07_b267_rescaled255

# eigenvalues + eigenvectors
Eigen values, (vectors), and [percent importance]:
PC1 15236.52 (-0.5972,-0.6037,-0.5282) [98.52%]
PC2   216.16 ( 0.7340,-0.1457,-0.6634) [ 1.40%]
PC3    13.01 ( 0.3235,-0.7838, 0.5301) [ 0.08%]


# 12
# re-scaled to 0 ~ 254 postfire modis bands 2, 6 and 7
input=mod07_b2_rescaled254,mod07_b6_rescaled254,mod07_b7_rescaled254
out=pca_mod07_b267_rescaled254

# eigenvalues + eigenvectors
Eigen values, (vectors), and [percent importance]:
PC1 15117.39 (-0.5972,-0.6037,-0.5282) [98.52%]
PC2 214.43 ( 0.7341,-0.1458,-0.6632) [ 1.40%]
PC3 12.90 ( 0.3234,-0.7838, 0.5302) [ 0.08%]
~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ 

# 13
i.pca on

# centered and scaled postfire modis bands 2, 6 and 7
i.pca
input=mod07_b2_centered_scaled,mod07_b6_centered_scaled,mod07_b7_centered_scaled out=pca_mod07_b267_centered_scaled
## This is the same as performing prcomp(x, center=TRUE, scaled=TRUE)

# eigenvalues + eigenvectors
Eigen values, (vectors), and [percent importance]:
PC1 0.79 (-0.4907,-0.6521,-0.5779) [76.30%]
PC2 0.23 (-0.8120, 0.1015, 0.5748) [22.52%]
PC3 0.01 ( 0.3162,-0.7513, 0.5793) [ 1.18%]


# 14
prcomp() on

# center=TRUE, scale=TRUE
prcomp(mod07_b267, center=TRUE, scale=TRUE) ## This is the same as
performing i.pca on centered and scaled data

Standard deviations:
[1] 1.5137992 0.8210167 0.1853207

Rotation:
               PC1        PC2        PC3
mod07_b2 0.4896414  0.8145817 -0.3109792
mod07_b6 0.6516766 -0.1049365  0.7512030
mod07_b7 0.5792831 -0.5704779 -0.5822250

# get eigenvalues
summary(prcomp(mod07_b267, center=TRUE, scale=TRUE))
Importance of components:
                         PC1   PC2    PC3
...
Proportion of Variance 0.764 0.225 0.0115
...

# 15
princomp() on

# raw postfire modis bands 2, 6 and 7 based on the covariance matrix
print(loadings(princomp(mod07_b267, cor=TRUE)), digits=6, cutoff=0.001)
## This is the same as performing prcomp(x, center=TRUE, scale=TRUE) or
i.pca on manually centered and scaled data

Loadings:
         Comp.1    Comp.2    Comp.3   
mod07_b2 -0.489641  0.814582  0.310979
mod07_b6 -0.651677 -0.104936 -0.751203
mod07_b7 -0.579283 -0.570478  0.582225

                 Comp.1   Comp.2   Comp.3
SS loadings    1.000000 1.000000 1.000000
Proportion Var 0.333333 0.333333 0.333333
Cumulative Var 0.333333 0.666667 1.000000



More information about the grass-stats mailing list