[GRASS-user] Calculating eigen values and % variance explained after PCA analysis

Fri Feb 27 06:29:53 EST 2009

Wesley
> I downloaded and installed GRASS 6.4 and after much "wailing and
> gnashing of teeth" I got m.eigensystem to work. Below are some
> comments and questions.

Nice that it finally worked out. Hopefully my comments are useful to
you (and correct). You can have a look at the following links
[1][2][3][4].

> Over the last couple of days I have been running PCA analyses using
> the i.pca and r.covar -> m.eigensystem -> r.mapcalc. The analysis
> seeks to create a component surface where tree crowns are separated
> from understory and ground in a plantation forest. Inputs are three
> digital aerial photographs (red, green, blue), a top of canopy height
> model, and an intensity surface derived from lidar return intensity
> measures. Output from the PCA will be input into a tree counting method
> which (if all goes well) will use mathematical morphology to isolate
> tree crowns for counting purposes

Interesting stuff!

> My results are interesting and worth mentioning to the list. Firstly,
> the results from both the automated (i.pca) and the
> 'by-hand-method' (r.covar -> m.eigensystem -> r.mapcalc) differ. For
> example; the eigen values from the automated approach are as follows

> (-0.50 -0.53 -0.49 -0.47 -0.08)
> (-0.38 -0.30 -0.13 0.86 0.11)
> (-0.34 -0.35 0.86 -0.14 0.05)
> (0.70 -0.71 -0.01 0.06 0.03)
> (0.00 -0.03 0.07 0.13 -0.99)

> while the eigen values from the 'by-hand-method' are completely
> different, in fact I am a little confused with regards to the output
> from i.pca and the m.eigensystem. i.pca returns the n number of
> components plus the eigen values for each component (or are those
> vectors?).

Yes, those are the eigen_VECTORS_ (= loadings, in other words the amount
of information that each of the original dimensions contributes to the
resulting components). Each row corresponds to one principal component.
In your example above you "know" that the 1st component (1st row) is
composed of the original dimensions (each column), and each original
dimension has "contributed" according to the _loadings_:

So dimension 1 -> -0.50, dimension 2 -> -0.53, dimension 3 -> -0.49,
dimension 4 -> -0.47 and dimension 5 -> -0.08

If I understand PCA correctly, you can disregard the "signs" and
read the loadings as absolute values.
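
One way to read the loadings without their signs is to square them. The
sketch below does that for the first row quoted above. For a unit-length
eigenvector the squares sum to about 1, so each square can be read as a
share of that component. The function name `squared_shares` is made up
for this illustration, and the printed values are rounded to two
decimals, so the raw sum of squares is only approximately 1.

```python
# First eigenvector (row 1) as printed by i.pca in the quoted message.
pc1 = [-0.50, -0.53, -0.49, -0.47, -0.08]


def squared_shares(vector):
    """Return each squared loading as a fraction of the sum of squares."""
    squares = [v * v for v in vector]
    total = sum(squares)
    return [s / total for s in squares]


shares = squared_shares(pc1)
for dimension, share in enumerate(shares, start=1):
    # The sign of the loading no longer matters once it is squared.
    print(f"dimension {dimension}: {share:.1%} of PC1")
```

Read this way, the first four dimensions contribute almost equally to
the first component and the fifth contributes very little.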

> Would it be fair in saying that these are the coefficients which have
> been applied to the input imagery to attain the output components (in
> the same way the m.eigensystem works with r.mapcalc)?

Yes.
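
To make the "coefficients" idea concrete, here is a minimal sketch of
the weighted sum for a single cell, which is the same kind of
expression one would write per map in r.mapcalc. The eigenvector rows
are the ones quoted above. The cell values and the function name
`component_scores` are invented for the example, and the sketch ignores
any centering or rescaling that a particular tool may apply.

```python
# Rows of the eigenvector matrix printed by i.pca in the quoted message.
eigenvectors = [
    [-0.50, -0.53, -0.49, -0.47, -0.08],
    [-0.38, -0.30, -0.13, 0.86, 0.11],
    [-0.34, -0.35, 0.86, -0.14, 0.05],
    [0.70, -0.71, -0.01, 0.06, 0.03],
    [0.00, -0.03, 0.07, 0.13, -0.99],
]


def component_scores(cell, vectors):
    """Weighted sum of the input values, one score per eigenvector row."""
    return [sum(w * x for w, x in zip(row, cell)) for row in vectors]


# Hypothetical cell values: red, green, blue, canopy height, lidar intensity.
cell = [120.0, 135.0, 90.0, 14.5, 60.0]
scores = component_scores(cell, eigenvectors)
for i, score in enumerate(scores, start=1):
    print(f"PC{i}: {score:.3f}")
```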

> Output from the m.eigensystem approach only gives one eigen value per
> component (see below).
> Are the above values from i.pca not the eigen vectors?

That should be the case with i.pca as well, since eigen_VALUES_ (= the
variances of the original dimensions that are "kept" in each
component) are important for interpreting what exactly each of the
components represents. But i.pca simply does not report the eigen_VALUES_.
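
Once the eigenvalues are available (for example from m.eigensystem),
the percentage of variance explained by each component is its
eigenvalue divided by the sum of all eigenvalues. A minimal sketch
follows. The eigenvalues in it are invented placeholders, not results
from this thread, and `percent_variance` is a name chosen only for this
example.

```python
def percent_variance(eigenvalues):
    """Return the percentage of total variance carried by each eigenvalue."""
    total = sum(eigenvalues)
    return [100.0 * value / total for value in eigenvalues]


# Hypothetical eigenvalues for five components (NOT taken from this thread).
eigenvalues = [850.0, 100.0, 30.0, 15.0, 5.0]

percent = percent_variance(eigenvalues)
cumulative = 0.0
for i, share in enumerate(percent, start=1):
    cumulative += share
    print(f"PC{i}: {share:.1f}% (cumulative {cumulative:.1f}%)")
```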

At some point a C expert needs to have a look at the code (i.pca) and
correct the "bug" which prevents the eigen_VALUES_ from being
printed.

>  If this is the case then both methods still differ significantly. Is
> this possible, and which should I use.

Please have a look at my comments/questions in link [2]. i.pca follows
the "SVD" method. You performed the non-standardised PCA using the
covariance matrix. Note that you can also use the standardised method by
using the correlation matrix.
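
For the standardised variant, a correlation matrix can be derived from
a covariance matrix by dividing each entry by the product of the two
standard deviations. The sketch below shows only that conversion. The
3x3 matrix is an invented placeholder rather than real r.covar output,
and `covariance_to_correlation` is a hypothetical helper name. The
eigen decomposition of the result would then be done as before.

```python
import math


def covariance_to_correlation(cov):
    """Convert a square covariance matrix (list of rows) to correlations."""
    size = len(cov)
    # Standard deviations are the square roots of the diagonal variances.
    sd = [math.sqrt(cov[i][i]) for i in range(size)]
    return [
        [cov[i][j] / (sd[i] * sd[j]) for j in range(size)]
        for i in range(size)
    ]


# Hypothetical covariance matrix for three input maps.
cov = [
    [4.0, 2.0, 0.6],
    [2.0, 9.0, 1.5],
    [0.6, 1.5, 1.0],
]

corr = covariance_to_correlation(cov)
for row in corr:
    print(" ".join(f"{value:6.3f}" for value in row))
```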

> Qualitatively, the 'by-hand-method' seems to isolate the crowns very
> nicely in PC1 while the automated (i.pca) approach isolates crowns in
> PC3?? I rescaled the output in the i.pca method, would this contribute
> to the differences seen?
> 
> I am going to run more tests on the rest of my data and will see if
> these issues arise again. In the meantime if anyone of the list can
> offer some insight into the two different pca analysis examples I
> would greatly appreciate it.

I would be happy to hear more. It's a tool I also need.
Kindest regards, Nikos

[...]

---
Links:

# in grass-user mailing list

[1] # In these posts I didn't know much about PCA #
http://n2.nabble.com/i.pca--vs.--r.covar-m.eigensystem-r.mapcalc-td1885820.html#a1885821

[2] # this is the one I have sent you already #
http://n2.nabble.com/Comparison-between-"i.pca"-and-R's-"prcomp()"%3A-explanations-and-questions-td2283997.html#a2284070

# in grass-trac

[3] http://trac.osgeo.org/grass/ticket/341

[4] http://trac.osgeo.org/grass/ticket/430