[GRASS-stats] Re: [GRASS-user] Testing i.pca ~ prcomp(), m.eigensystem ~ princomp()

Wed Apr 1 12:57:05 EDT 2009

Edzer Pebesma wrote:
> Markus, a few notes:
>
> - if you do PCA on uncentered data, by computing the eigenvalues of the
> uncentered covariance matrix, this implies that bands with a larger mean
> will get more influence on the final PCs. I have so far not managed
> to find an argument why this would be desirable.
>   
Add it to the wiki? E.g. bands entered into a PCA should be centered so 
that they have the same mean, but normalization is also an option.
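
To illustrate the point about uncentered data, here is a minimal sketch, 
assuming two made-up bands with identical spread but very different means. 
The band values and helper names are hypothetical. It only compares the 
diagonal entries of the centered and uncentered matrices, it does not run 
a full PCA:

```python
from statistics import fmean

def second_moment(x, y):
    """Uncentered cross-moment: mean of x*y, means NOT subtracted."""
    return fmean(a * b for a, b in zip(x, y))

def covariance(x, y):
    """Population covariance (divide by n): the centered cross-moment."""
    mx, my = fmean(x), fmean(y)
    return fmean((a - mx) * (b - my) for a, b in zip(x, y))

# Hypothetical bands: same spread, band_b simply has an offset of 100.
band_a = [1.0, 2.0, 3.0, 4.0, 5.0]
band_b = [101.0, 102.0, 103.0, 104.0, 105.0]

var_a = covariance(band_a, band_a)
var_b = covariance(band_b, band_b)
m2_a = second_moment(band_a, band_a)
m2_b = second_moment(band_b, band_b)

# Centered: both bands contribute the same variance.
print("variance      :", var_a, var_b)
# Uncentered: each entry is variance + mean**2, so band_b dominates.
print("second moment :", m2_a, m2_b)
```

The uncentered entry equals the variance plus the squared mean, so the 
offset alone decides which band carries the most weight.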
> - if you do PCA on (band-mean)/sd(band), it means that you first
> normalize (scale) 
I think scale and normalize are two different things.
> each variable to mean zero and unit variance. This
> procedure is identical to doing PCA on the correlation matrix. It means
> that, unlike for unscaled variables, variables with larger variance will
> not get more influence on the PCA than others. For image analysis I can
> see a place for both; if bands with low variance indicate insignificant
> and perhaps noisy information, you may downweight them. 
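
The equivalence quoted above (PCA on standardized bands versus PCA on the 
correlation matrix) can be checked on a small scale. A minimal sketch, 
assuming two hypothetical bands and population (divide by n) statistics; 
it verifies that the covariance of the standardized bands equals the 
correlation of the raw bands:

```python
from statistics import fmean, pstdev

def pcov(x, y):
    """Population covariance of two equally long sequences."""
    mx, my = fmean(x), fmean(y)
    return fmean((a - mx) * (b - my) for a, b in zip(x, y))

def standardize(x):
    """(band - mean) / sd(band), using the population standard deviation."""
    m, s = fmean(x), pstdev(x)
    return [(v - m) / s for v in x]

def correlation(x, y):
    """Pearson correlation computed from covariance and standard deviations."""
    return pcov(x, y) / (pstdev(x) * pstdev(y))

# Hypothetical bands on very different scales.
band_a = [12.0, 15.0, 11.0, 19.0, 23.0]
band_b = [340.0, 410.0, 300.0, 520.0, 610.0]

z_a, z_b = standardize(band_a), standardize(band_b)
cov_z = pcov(z_a, z_b)           # off-diagonal entry after standardizing
r = correlation(band_a, band_b)  # off-diagonal entry of the correlation matrix

print("cov of standardized bands:", cov_z)
print("correlation of raw bands :", r)
print("variance of z_a          :", pcov(z_a, z_a))
```

Since every entry of the two matrices agrees, their eigenvalues and 
eigenvectors agree as well.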
Variance depends on the range of the values. I would rather use something 
like the coefficient of variation (stddev/mean) as a scale-independent 
indicator of the amount of information in a given band. A downscaled 
band (e.g. MODIS scale of 0.0001) still has the same information but 
lower variance.
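
A minimal sketch of that argument, assuming a hypothetical band and the 
0.0001 scale factor mentioned above. The values are made up. Note that the 
coefficient of variation is only meaningful for bands with a positive 
mean:

```python
from statistics import fmean, pstdev, pvariance

def coefficient_of_variation(x):
    """stddev / mean; assumes the mean is positive."""
    return pstdev(x) / fmean(x)

# Hypothetical raw band values and the same band after rescaling.
raw = [2100.0, 2500.0, 1800.0, 3200.0, 2900.0]
scale = 0.0001
scaled = [v * scale for v in raw]

# Variance shrinks by scale**2 when the band is multiplied by scale.
var_ratio = pvariance(scaled) / pvariance(raw)

# The coefficient of variation is unaffected by the rescaling.
cv_raw = coefficient_of_variation(raw)
cv_scaled = coefficient_of_variation(scaled)

print("variance ratio:", var_ratio, "expected:", scale ** 2)
print("CV raw / scaled:", cv_raw, cv_scaled)
```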
> - Only in the case of normalized variables, or equivalently PCA on
> correlations, does it make sense to select PCs with an eigenvalue larger
> than 1. The reasoning is fairly weak, but goes like this: if a PC has
> eigenvalue > 1, it explains more variance than any of the original
> variables, which all have variance 1.
>   
Sounds good to me: why should I use a component that explains less than 
any of the original bands? The whole purpose of a PCA is variable 
reduction, i.e. to get a new set of variables, each explaining the 
dataset better than any one of the original variables/bands. A PCA 
produces as many components as input variables, so some selection is 
usually necessary for further processing; the criterion could also be the 
percentage of explained variance. 
OTOH, sometimes only the first component is of interest. There may be 
exceptions in image processing, e.g. haze reduction (I would have to 
read up on image processing before saying anything more about where 
components with eigenvalue < 1 could be useful).
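
The two selection rules mentioned above (eigenvalue > 1, or a share of 
explained variance) can be sketched as follows. This is only an 
illustration: the eigenvalues are made up for a hypothetical six-band PCA 
on the correlation matrix (so they sum to 6), the function names are my 
own, and the 90% threshold is an arbitrary choice:

```python
from itertools import accumulate

# Hypothetical eigenvalues, sorted in descending order; they sum to the
# number of bands because the PCA is assumed to be on correlations.
eigenvalues = [3.6, 1.5, 0.5, 0.2, 0.15, 0.05]

def select_eigenvalue_gt_one(eigs):
    """Return 1-based indices of components with eigenvalue > 1."""
    return [i for i, ev in enumerate(eigs, start=1) if ev > 1.0]

def components_for_share(eigs, share):
    """Smallest number of leading components whose cumulative share of
    the total variance reaches `share` (eigs must be sorted descending)."""
    total = sum(eigs)
    for k, cum in enumerate(accumulate(eigs), start=1):
        if cum / total >= share:
            return k
    return len(eigs)

kept = select_eigenvalue_gt_one(eigenvalues)
k_90 = components_for_share(eigenvalues, 0.90)

print("components with eigenvalue > 1:", kept)
print("components needed for 90% of the variance:", k_90)
```

The two rules need not agree, which is one more reason to treat the 
eigenvalue > 1 rule as a rule of thumb only.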