[GRASS-user] i.pca vs. r.covar/m.eigensystem/r.mapcalc

Nikos Alexandris nikos.alexandris at felis.uni-freiburg.de
Fri Nov 7 14:38:21 EST 2008


On Fri, 2008-11-07 at 09:54 -0800, Dylan Beaudette wrote: 
> On Friday 07 November 2008, Nikos Alexandris wrote:
> > On Wed, 2008-11-05 at 07:06 -0800, Dylan Beaudette wrote:
[...] 
> > I've bombed in another wall :-(
> >
> > > x.nas <- which(is.na(x at data@MOD2007_242_500_sur_refl_b02) &
> >
> > is.na(x at data@MOD2007_242_500_sur_refl_b06) & is.na(x at data
> > $MOD2007_242_500_sur_refl_b07))
> > Error in which(is.na(x at data@MOD2007_242_500_sur_refl_b02) &
> > is.na(x at data@MOD2007_242_500_sur_refl_b06) &  :
> >   trying to get slot "MOD2007_242_500_sur_refl_b02" from an object
> > (class "data.frame") that is not an S4 object
[...] 
> You have a syntax error here:
> x at data@MOD2007_242_500_sur_refl_b06
>           ^^^^
> 
> should be:
> x at data$MOD2007_242_500_sur_refl_b06
> 
> subtle but important. 
> 
> $ is used to access columns from a dataframe
> @ is used to access slots
> 
> Cheers,
> Dylan

Dylan, thank you. It works now... sort of!? Sorry for another long post.
Perhaps I should continue this thread in another ML.

#read raster maps
x <-
readRAST6(c('MOD2007_242_500_sur_refl_b02','MOD2007_242_500_sur_refl_b06','MOD2007_242_500_sur_refl_b07'))

#extract NAs
x.nas <- which(is.na(x at data$MOD2007_242_500_sur_refl_b02) & is.na(x at data
$MOD2007_242_500_sur_refl_b06) & is.na(x at data
$MOD2007_242_500_sur_refl_b07))

#vectors pointing to no-NAs
x.vals <- which( !is.na(x at data$MOD2007_242_500_sur_refl_b02) & !
is.na(x at data$MOD2007_242_500_sur_refl_b06) & !is.na(x at data
$MOD2007_242_500_sur_refl_b07))

#get no-NAs
x.no.na <- x at data[x.vals, ]

#pca
x.pca <- prcomp(x.no.na)

# output
x.pca

Standard deviations:
[1] 881.8701 438.7193 108.9755

Rotation:
                                   PC1         PC2        PC3
MOD2007_242_500_sur_refl_b02 0.4371624  0.83100302 -0.3439811
MOD2007_242_500_sur_refl_b06 0.7210191 -0.09520195  0.6863440
MOD2007_242_500_sur_refl_b07 0.5376063 -0.54806074 -0.6407877

# probably the "correct" function to use (and compare) is the princomp
and not prcomp
## read [1] [2] for details

# princomp on x.no.na ## now I am stuck here :-)
x.pca_corr <- princomp(x.no.na, cor = TRUE)

# x.pca_corr
Call:
princomp(x = x.no.na, cor = TRUE)

Standard deviations:
   Comp.1    Comp.2    Comp.3 
1.5137012 0.8211861 0.1853699 

 3  variables and  88101 observations.

# or "cor = FALSE"
x.pca_cova <- princomp(x.no.na, cor = FALSE)

# x.pca_cova
Call:
princomp(x = x.no.na, cor = FALSE)

Standard deviations:
  Comp.1   Comp.2   Comp.3 
881.8651 438.7169 108.9748 

 3  variables and  88101 observations.


# I would like to compare R's results with i.pca's but I don't know why
or how I can control the "print-out" of the eigenvalues

## If I get the eigenvalues I would...
#compare with i.pca reported eigenvalues## read first mails on this
thread :-)

Eigen values:
( -0.64 -0.65 -0.42 )
( -0.71 0.28 0.64 )
( -0.30 0.71 -0.64 )

My understanding is that i.pca reports in rows, so in mt exampl it is:
pc1 ( -0.64 -0.65 -0.42 )
pc2 ( -0.71 0.28 0.64 )
pc3 ( -0.30 0.71 -0.64 )


Some interesting stuff to check:

1. prcomp and princomp accept an argument "na.action"

a function which indicates what should happen when the data contain
'NA's.
The default is set by the 'na.action' setting of 'options', and is
'na.fail' if that
is unset. The 'factory-fresh' default is 'na.omit'.

## question: (something like na.action = na.omit would help?)

---


[1] taken from "?prcomp":
The calculation is done by a singular value decomposition of the
(centered and possibly scaled) data matrix, not by using 'eigen'
on the covariance matrix.  This is generally the preferred method
for numerical accuracy.  The 'print' method for these objects
prints the results in a nice format and the 'plot' method produces
a scree plot.

[2] "?princomp":
The calculation is done using 'eigen' on the correlation or
covariance matrix, as determined by 'cor'.  This is done for
compatibility with the S-PLUS result.  A preferred method of
calculation is to use 'svd' on 'x', as is done in 'prcomp'.




More information about the grass-user mailing list