[GRASS-dev] PCA (i.pca) in G7: filtering and rescaling

Nikos Alexandris nik at nikosalexandris.net
Thu Dec 5 04:15:03 PST 2013


Nikos Alexandris:
> > ...we need those extra digits to make it easy rejecting last Principal
> > Component(s) prior to the backward PCA. Might be one, two or numerous (?)
> > depending on the dimensions.

Markus M:
> I think it rather depends on the amount of information encoded in each PC.

It does. PCA works on global statistics, so one has to run the transformation,
then study the visuals and the numbers, and then decide what to keep or how to
treat the components further.

In my very simple example, I want to decide whether to reject the last
component or the last two.  If the filtering option lets me do that, I am
happy :-).  To exemplify: currently I can't reject the last two components,
which account for 0.06 and 0.21 of the original data variance. I tested
yesterday, and the filter does not differentiate such subtle details, which
might be of importance (for a subsequent classification of high-resolution
images).
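
To make the idea concrete, here is a minimal NumPy sketch of a forward PCA,
rejection of the last components, and a backward PCA. It is not how i.pca is
implemented; the helper name pca_reject_last, the synthetic 4-band data and
the (n_pixels, n_bands) array layout are all assumptions for illustration.

```python
import numpy as np

def pca_reject_last(data, n_reject):
    """Forward PCA, drop the last n_reject components, then backward PCA.

    data: array of shape (n_pixels, n_bands). Hypothetical helper, not i.pca.
    Returns the per-component variance fractions and the restored bands.
    """
    mean = data.mean(axis=0)
    centered = data - mean                      # centering only, no scaling
    cov = np.cov(centered, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(cov)      # eigh returns ascending order
    order = np.argsort(eigvals)[::-1]           # largest variance first
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    var_fraction = eigvals / eigvals.sum()
    scores = centered @ eigvecs                 # forward transform
    keep = data.shape[1] - n_reject
    # Backward transform using only the components that were kept.
    restored = scores[:, :keep] @ eigvecs[:, :keep].T + mean
    return var_fraction, restored

# Synthetic stand-in for 4 bands: two real signals plus a little noise.
rng = np.random.default_rng(0)
base = rng.normal(size=(1000, 2))
mixing = rng.normal(size=(2, 4))
bands = base @ mixing + 0.01 * rng.normal(size=(1000, 4))

var_fraction, restored = pca_reject_last(bands, n_reject=2)
print(var_fraction)
```

Printing var_fraction with enough digits is what makes it possible to tell
the last components apart before deciding how many to reject.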

Will test Moritz' diff.  Thank you for that one.
 
> Alternatively, PC selection could also be based on the Eigenvalue,
> typically all PCs with an Eigenvalue >= 1 (centered and scaled input)
> would be used.

It depends. Typically the aim might be simply compressing data or reducing
salt-and-pepper noise. However, in change detection studies, where changes are
likely to appear in higher-order components, it's not uncommon to have several
components which account for <= 1 of the original variance and still are the
ones you really need.
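
A small synthetic sketch of that situation, assuming two dates of the same
band that differ only by a weak change signal. The function name kaiser_keep
and the data are made up for illustration; the rule applied is the
"Eigenvalue >= 1" selection on centered and scaled input mentioned above.

```python
import numpy as np

def kaiser_keep(data):
    """Eigenvalues of the correlation matrix and the count with value >= 1.

    data: array of shape (n_pixels, n_bands). Hypothetical helper.
    """
    corr = np.corrcoef(data, rowvar=False)      # centered and scaled input
    eigvals = np.linalg.eigvalsh(corr)[::-1]    # largest first
    return eigvals, int(np.sum(eigvals >= 1.0))

# Two dates of the same band; the second differs by a subtle change signal.
rng = np.random.default_rng(42)
date1 = rng.normal(size=2000)
change = 0.05 * rng.normal(size=2000)
date2 = date1 + change

eigvals, n_kept = kaiser_keep(np.column_stack([date1, date2]))
print(eigvals, n_kept)
```

In this toy case the second component is essentially the difference between
the two standardized dates, yet its eigenvalue is far below 1, so the rule
would discard exactly the component that carries the change.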

In fact, I would like to have fine control over centering as well. My
"feelings" about scaling are less enthusiastic. Especially in change
detection studies, subtle (though important) changes might disappear after
scaling between dimensions which already measure the same thing (same units,
no huge differences in value ranges) from a different "perspective". It always
depends on what the aim is, of course.
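
As a rough illustration of why separate control matters, the sketch below
computes the variance fractions with centering only and with centering plus
scaling. The function pca_variance_fractions and its center/scale switches
are hypothetical, not existing i.pca options, and the two bands are synthetic.

```python
import numpy as np

def pca_variance_fractions(data, center=True, scale=False):
    """Per-component variance fractions, largest first. Hypothetical helper."""
    x = np.asarray(data, dtype=float)
    if center:
        x = x - x.mean(axis=0)
    if scale:
        x = x / x.std(axis=0, ddof=1)
    # Second-moment matrix; equals the covariance matrix when center=True.
    moment = x.T @ x / (x.shape[0] - 1)
    eigvals = np.linalg.eigvalsh(moment)[::-1]
    return eigvals / eigvals.sum()

# Two independent synthetic bands in the same units, with different spread.
rng = np.random.default_rng(1)
band_a = rng.normal(scale=10.0, size=500)
band_b = rng.normal(scale=0.5, size=500)
bands = np.column_stack([band_a, band_b])

centered_only = pca_variance_fractions(bands, center=True, scale=False)
centered_scaled = pca_variance_fractions(bands, center=True, scale=True)
print(centered_only, centered_scaled)
```

The same data give very different variance fractions under the two settings,
so which components look "rejectable" depends on the preprocessing chosen.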

Nikos

