[GRASS-user] Calculating eigen values and % variance explained after PCA analysis

Sun Mar 1 08:13:07 EST 2009

Nikos:
> >    * Present first the variance (=eigenvalues), because it's the
> >first thing you will look at to know "how much variance of the
> >original data is _expressed_ in each new component".
> >    * The importance refers to the eigenvalue, so it is better for
> >it to come right after it.

Hamish:
> to me it catches your eye more quickly if it is not buried in the
> middle.
> shrug. the important thing is that the numbers are correct & not
> confusing.

Yes, the important thing is the numbers. A clear output is also more
"functional", if I may say so.

> >    * Present the loadings (eigenvectors) for each new
> > component.

> we are doing that already, right?

Absolutely. Only the structure of the output remains to be settled. I
have no objections to whatever is decided, as long as the _numbers_ are
there. Nonetheless, I have presented my ideas about the output from a
user's perspective.

> >    * Column-wise or row-wise? The results can be either
> > presented column-wise, that is one column for each new component 
> > _or_  row-wise, as they are currently printed. I think row-wise just
> > looks better :-)

> maybe, but row-wise is slightly easier to code.

As for interpretation, I think the layout of the output is just a matter
of getting used to it. In fact, I think row-wise is easier to read than
column-wise. Anyway, this is of minor importance.
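
To make the two layouts concrete, here is a minimal Python sketch (not
i.pca code) that prints the same numbers both ways. The values are the
ones from the 'r.info -h' output quoted further down; the function names
and the "input N" row labels are hypothetical, and it assumes each
eigenvector holds one loading per input map.

```python
# Values copied from the r.info -h output quoted in this message.
eigenvalues = [1170.12, 152.49, 6.01]
eigenvectors = [
    [-0.63, -0.65, -0.43],
    [0.23, 0.37, -0.90],
    [0.75, -0.66, -0.08],
]


def row_wise(values, vectors):
    """One line per component: eigenvalue, then its loadings."""
    lines = []
    for i, (ev, vec) in enumerate(zip(values, vectors), start=1):
        loadings = " ".join(f"{x:5.2f}" for x in vec)
        lines.append(f"PC{i} {ev:9.2f} ( {loadings} )")
    return lines


def column_wise(values, vectors):
    """One column per component: eigenvalue row, then one row per input."""
    width = 10
    names = [f"PC{i}" for i in range(1, len(values) + 1)]
    lines = [" " * 12 + "".join(f"{name:>{width}}" for name in names)]
    lines.append(f"{'eigenvalue':<12}" + "".join(f"{ev:>{width}.2f}" for ev in values))
    # zip(*vectors) transposes the table so each row holds one input's loadings.
    for b, loadings in enumerate(zip(*vectors), start=1):
        label = f"input {b}"  # hypothetical label, not an i.pca name
        lines.append(f"{label:<12}" + "".join(f"{x:>{width}.2f}" for x in loadings))
    return lines


print("\n".join(row_wise(eigenvalues, eigenvectors)))
print()
print("\n".join(column_wise(eigenvalues, eigenvectors)))
```

The row-wise version needs only a single pass over the components, while
the column-wise version has to transpose the table first.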

> > "Some" examples... (only 2 for column-wise and
> > all the rest row-wise... playing around).

> fancy tables are hard for the module output because it uses
> G_message(), and G_message() condenses any whitespace (multiple
> spaces, tabs, ...) to a single space. thus formatting is lost.
>
> and i.pca's main output is maps, not eigen data, so I guess it makes
> sense to keep that text optional instead of sending it to stdout.
> Perhaps a new flag to print a summary report to stdout? (mmph, just
> cut&paste from history)
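
The effect of condensing whitespace on an aligned table row can be
imitated with a small Python sketch. This is only a stand-in for the
behaviour described above, not the G_message() implementation; the
function name is hypothetical and the sample row is taken from the
'r.info -h' output quoted below.

```python
import re


def condense_whitespace(text):
    """Collapse each run of spaces or tabs to one space (stand-in only)."""
    return re.sub(r"[ \t]+", " ", text)


row = "PC2    152.49 (  0.23  0.37 -0.90 ) [ 11.48% ]"
print(row)
print(condense_whitespace(row))
```

After condensing, the columns of different rows no longer line up, which
is why the module output looks less tidy than the map history.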

> for map history it's a bit better, but I can't end with a %.

> now 'r.info -h' output looks like:
> 
>    Eigen values, (vectors), and [percent importance]:
>    PC1   1170.12 ( -0.63 -0.65 -0.43 ) [ 88.07% ]
>    PC2    152.49 (  0.23  0.37 -0.90 ) [ 11.48% ]
>    PC3      6.01 (  0.75 -0.66 -0.08 ) [  0.45% ]
> 
> 
> module output is same but not as pretty due to G_message() issue.
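
For reference, the percent importance in that output is each eigenvalue
divided by the sum of all eigenvalues. A minimal Python sketch, assuming
exactly that definition and using the three eigenvalues printed above
(the function name is hypothetical, not part of i.pca):

```python
# Eigenvalues copied from the r.info -h output above.
eigenvalues = [1170.12, 152.49, 6.01]


def percent_importance(values):
    """Share of the total variance carried by each component, in percent."""
    total = sum(values)
    return [100.0 * v / total for v in values]


for i, (ev, pct) in enumerate(zip(eigenvalues, percent_importance(eigenvalues)), start=1):
    print(f"PC{i} {ev:9.2f} [ {pct:5.2f}% ]")
```

Rounded to two decimals this gives 88.07, 11.48 and 0.45, the same
figures as in the bracketed column.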

Well, whatever is practical and achievable. I would be happy to see more
suggestions on this. But I like that it now prints out the eigenvalues.

Best regards, Nikos