[GRASS-stats] Testing i.pca (continued...)
Nikos Alexandris
nikos.alexandris at felis.uni-freiburg.de
Wed Apr 29 13:32:11 EDT 2009
Hi all! Another i.pca -vs- prcomp() round :-)
# 1. There *must* be something in i.pca's code that prevents data
centering when using MODIS data???
I tested... tested... & tested... (almost all tests below) and I just
can't understand WHY i.pca does NOT perform data centering when using
MODIS data (raw, rescaled at 0~255 or 0~254). All I get is:
i.pca == prcomp(x, center=FALSE, scale=FALSE)
# 2. Note: there is _NO_ problem when using the spot or landsat5 data
from the spearfish location!
# 3. Can some programmer have a look at i.pca to check whether there is
a switch to (de-)activate data centering?
I don't want to update the wiki-page before I understand what is going
on. Any help?
Kindest regards, Nikos
=======================================================================
Quickly:
* using MODIS data I get:
-------------------------
** i.pca == prcomp(x, center=FALSE, scale=FALSE)
** if I center the data manually (as suggested by Agus), i.e.:
r.mapcalc "mod_b2_centered = mod_b2 - mean(mod_b2)"
then I get:
i.pca(on centered data) == prcomp(x, center=TRUE, scale=FALSE)
** it also works if I further scale manually the data like:
r.mapcalc "mod_b2_centered_scaled = mod_b2_centered/stdev"
* using spot or landsat5 data I get:
------------------------------------
** i.pca == prcomp(x, center=TRUE, scale=FALSE)
##########################################
##########################################
Principal Component Analysis:
comparison between grass' i.pca and R's prcomp()
##########################################
##########################################
Testing... (Wednesday, April 29, 11:58 AM)
# 1
i.pca on
# raw postfire modis bands 2, 6 and 7
i.pca input=mod07_b2,mod07_b6,mod07_b7 out=pca_mod_b267
# eigenvalues + eigenvectors
Eigen values, (vectors), and [percent importance]:
PC1 6307563.04 (-0.6353,-0.6485,-0.4192) [98.71%]
PC2 78023.63 (-0.7124, 0.2828, 0.6422) [ 1.22%]
PC3 4504.60 (-0.2979, 0.7067,-0.6417) [ 0.07%]
# 2
prcomp() on
# raw postfire modis bands 2, 6 and 7
# center=FALSE, scale=FALSE
prcomp(mod07_b267, center=FALSE, scale=FALSE) ## This is the same as
with i.pca on raw(=untouched) data
Standard deviations:
[1] 4279.0259 475.9080 114.2998
Rotation:
PC1 PC2 PC3
mod07_b2 -0.6353496 0.7124426 -0.2979203
mod07_b6 -0.6485310 -0.2828396 0.7066890
mod07_b7 -0.4192117 -0.6422051 -0.6417431
~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~
# 3
i.pca on
# centered postfire modis bands 2, 6 and 7
i.pca input=mod07_b2_centered,mod07_b6_centered,mod07_b7_centered
out=pca_mod07_b267_centered
# output
Eigen values, (vectors), and [percent importance]:
PC1 270343.07 (-0.4403,-0.7222,-0.5335) [79.11%]
PC2 67140.50 (-0.8275, 0.0957, 0.5533) [19.65%]
PC3 4258.14 ( 0.3485,-0.6851, 0.6397) [ 1.25%]
# 4
prcomp() on
# raw postfire modis bands 2, 6 and 7 with center=TRUE, scale=FALSE
prcomp(mod07_b267, center=TRUE, scale=FALSE) ## This is the same as
using prcomp() on centered data
Standard deviations:
[1] 882.1814 438.7420 108.9791
Rotation:
PC1 PC2 PC3
mod07_b2 0.4372107 0.83099407 -0.3439413
mod07_b6 0.7210155 -0.09527873 0.6863371
mod07_b7 0.5375718 -0.54806096 -0.6408165
# 5
princomp() on
# raw postfire modis bands 2, 6 and 7
print(loadings(princomp(mod07_b267)), digits=6, cutoff=0.001)
Loadings:
Comp.1 Comp.2 Comp.3
mod07_b2 -0.437211 0.830994 0.343941
mod07_b6 -0.721016 -0.095279 -0.686337
mod07_b7 -0.537572 -0.548061 0.640817
Comp.1 Comp.2 Comp.3
SS loadings 1.000000 1.000000 1.000000
Proportion Var 0.333333 0.333333 0.333333
Cumulative Var 0.333333 0.666667 1.000000
~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~
# 6
prcomp() on
# centered postfire modis bands 2, 6 and 7 with center=FALSE,
scale=FALSE
prcomp(mod07_b267_centered, center=FALSE, scale=FALSE) ## In this
example the data are already centered. This is the same as using
prcomp(x, center=TRUE, scale=FALSE) on raw data or i.pca on centered
data
Standard deviations:
[1] 882.1844 438.7430 108.9810
Rotation:
PC1 PC2 PC3
mod07_b2_centered 0.4372131 0.83099137 -0.3439448
mod07_b6_centered 0.7210164 -0.09527912 0.6863360
mod07_b7_centered 0.5375686 -0.54806499 -0.6408157
# 7
prcomp() on
# centered postfire modis bands 2, 6 and 7 with center=TRUE,
scale=FALSE
prcomp(mod07_b267_centered, center=TRUE, scale=FALSE) ## The parameter
"center=TRUE" has no effect in this example since the data used are
already centered
Standard deviations:
[1] 882.1814 438.7420 108.9792
Rotation:
PC1 PC2 PC3
mod07_b2_centered 0.4372107 0.83099407 -0.3439413
mod07_b6_centered 0.7210155 -0.09527872 0.6863371
mod07_b7_centered 0.5375718 -0.54806097 -0.6408165
~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~
### ### ###
Comments
Using already centered data does not make sense to use the parameter
"center=". Thus the only difference will be between _scaled_ and
_unscaled_ data prior to PCA.
### ### ###
~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~
# 8
i.pca on
# scaled postfire modis bands 2, 6 and 7
i.pca input=mod07_b2_scaled,mod07_b6_scaled,mod07_b7_scaled
out=pca_mod07_b267_scaled
# eigenvalues + eigenvectors
Eigen values, (vectors), and [percent importance]:
PC1 19.09 (-0.6824,-0.5773,-0.4484) [98.57%]
PC2 0.26 ( 0.6786,-0.2724,-0.6821) [ 1.36%]
PC3 0.01 ( 0.2716,-0.7698, 0.5777) [ 0.07%]
# 9
prcomp() on
# scaled postfire modis bands 2, 6 and 7 with center=FALSE, scale=TRUE
prcomp(mod07_b267, center=FALSE, scale=TRUE)
Standard deviations:
[1] 1.71888790 0.20789109 0.04696487
Rotation:
PC1 PC2 PC3
mod07_b2 -0.5747633 0.74019177 -0.3489459
mod07_b6 -0.5812894 -0.06916484 0.8107520
mod07_b7 -0.5759772 -0.66882910 -0.4700191
# get eigenvalues
summary(prcomp(mod07_b267, center=FALSE, scale=TRUE))
Importance of components:
PC1 PC2 PC3
...
Proportion of Variance 0.985 0.0144 0.00074
...
# 10
prcomp() on
# scaled postfire modis bands 2, 6 and 7
prcomp(mod07_b267_scaled, center=FALSE) ## This is the same as with
i.pca on manually scaled data
Standard deviations:
[1] 7.4449990 0.8749921 0.1925904
Rotation:
PC1 PC2 PC3
mod07_b2_scaled -0.6824314 0.6786305 -0.2715660
mod07_b6_scaled -0.5772537 -0.2724487 0.7697726
mod07_b7_scaled -0.4484034 -0.6820795 -0.5776695
~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~
# 11
i.pca on
# re-scaled to 0 ~ 255 postfire modis bands 2, 6 and 7
###
# first r.rescale to=0,255
# then r.mapcalc to ensure that 0,255 values
###
i.pca
input=mod07_b2_rescaled255,mod07_b6_rescaled255,mod07_b7_rescaled255
out=pca_mod07_b267_rescaled255
# eigenvalues + eigenvectors
Eigen values, (vectors), and [percent importance]:
PC1 15236.52 (-0.5972,-0.6037,-0.5282) [98.52%]
PC2 216.16 ( 0.7340,-0.1457,-0.6634) [ 1.40%]
PC3 13.01 ( 0.3235,-0.7838, 0.5301) [ 0.08%]
# 12
# re-scaled to 0 ~ 254 postfire modis bands 2, 6 and 7
input=mod07_b2_rescaled254,mod07_b6_rescaled254,mod07_b7_rescaled254
out=pca_mod07_b267_rescaled254
# eigenvalues + eigenvectors
Eigen values, (vectors), and [percent importance]:
PC1 15117.39 (-0.5972,-0.6037,-0.5282) [98.52%]
PC2 214.43 ( 0.7341,-0.1458,-0.6632) [ 1.40%]
PC3 12.90 ( 0.3234,-0.7838, 0.5302) [ 0.08%]
~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~
# 13
i.pca on
# centered and scaled postfire modis bands 2, 6 and 7
i.pca
input=mod07_b2_centered_scaled,mod07_b6_centered_scaled,mod07_b7_centered_scaled out=pca_mod07_b267_centered_scaled
## This is the same as performing prcomp(x, center=TRUE, scaled=TRUE)
# eigenvalues + eigenvectors
Eigen values, (vectors), and [percent importance]:
PC1 0.79 (-0.4907,-0.6521,-0.5779) [76.30%]
PC2 0.23 (-0.8120, 0.1015, 0.5748) [22.52%]
PC3 0.01 ( 0.3162,-0.7513, 0.5793) [ 1.18%]
# 14
prcomp() on
# center=TRUE, scale=TRUE
prcomp(mod07_b267, center=TRUE, scale=TRUE) ## This is the same as
performing i.pca on centered and scaled data
Standard deviations:
[1] 1.5137992 0.8210167 0.1853207
Rotation:
PC1 PC2 PC3
mod07_b2 0.4896414 0.8145817 -0.3109792
mod07_b6 0.6516766 -0.1049365 0.7512030
mod07_b7 0.5792831 -0.5704779 -0.5822250
# get eigenvalues
summary(prcomp(mod07_b267, center=TRUE, scale=TRUE))
Importance of components:
PC1 PC2 PC3
...
Proportion of Variance 0.764 0.225 0.0115
...
# 15
princomp() on
# raw postfire modis bands 2, 6 and 7 based on the covariance matrix
print(loadings(princomp(mod07_b267, cor=TRUE)), digits=6, cutoff=0.001)
## This is the same as performing prcomp(x, center=TRUE, scale=TRUE) or
i.pca on manually centered and scaled data
Loadings:
Comp.1 Comp.2 Comp.3
mod07_b2 -0.489641 0.814582 0.310979
mod07_b6 -0.651677 -0.104936 -0.751203
mod07_b7 -0.579283 -0.570478 0.582225
Comp.1 Comp.2 Comp.3
SS loadings 1.000000 1.000000 1.000000
Proportion Var 0.333333 0.333333 0.333333
Cumulative Var 0.333333 0.666667 1.000000
More information about the grass-stats
mailing list