[GRASS-dev] PCA question

Nikos Alexandris nik at nikosalexandris.net
Tue Jun 26 06:31:17 PDT 2012


 Nikos:

> > Which resolution is to be enhanced?  The geometric?  Is it meant
> > to keep PC1 and mix it with the rest, or keep the Pan and throw
> > away PC1?

> > Principal Component 1 will contain the highest variance of
> > your input data -- which is, in fact, a composition of different
> > amounts of information originating from all input bands. If you
> > throw that away you are left with a dataset which is likely to
> > be useless!

Hamish:

> (not talking about pan-sharpening, but in general,)
> how about the situation where you have a map data

( we are talking about multi-dimensional data, right? )

> which is loudly dominated by a signal, and you want to try and remove that
> loud signal so that you can look at the subtle variations caused by a
> different source that the loud signal had been masking?

Yes, this _can_ be a perfect use-case.

Especially if the feature in question has values near zero in at least one 
of the input dimensions. This last statement is based on Pielou's flawless 
explanation of how PCA works [1] and on my own experience [2].

In that case PCA will "support" an attempt to separate/isolate the feature in 
question from the dominant variances.  The "loud" signal would be channeled 
into the first few PCs and the "subtle variations" _could_ then be more 
evident in some of the higher order components.

All in all, one has to look at the numbers -- drawing conclusions from the PC 
images is not safe!
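
A minimal sketch of what "looking at the numbers" could mean, assuming NumPy 
and a synthetic three-band data set.  The names loud and subtle, the mixing 
coefficients and the noise level are all made up for illustration; nothing 
here comes from real imagery.  It prints the share of variance per PC and the 
loadings, and checks which source each PC actually tracks.

```python
import numpy as np

rng = np.random.default_rng(0)
n_pixels = 10_000

# Hypothetical sources: a dominant ("loud") signal and a weak ("subtle") one.
loud = rng.normal(0.0, 10.0, n_pixels)
subtle = rng.normal(0.0, 1.0, n_pixels)

# Three synthetic bands; the subtle source is absent (zero) in band 1.
bands = np.column_stack([
    1.0 * loud,
    0.9 * loud + subtle,
    1.1 * loud - subtle,
]) + rng.normal(0.0, 0.1, (n_pixels, 3))

# Centered PCA via eigen-decomposition of the covariance matrix.
centered = bands - bands.mean(axis=0)
eigenvalues, eigenvectors = np.linalg.eigh(np.cov(centered, rowvar=False))
order = np.argsort(eigenvalues)[::-1]          # sort PCs by decreasing variance
eigenvalues = eigenvalues[order]
eigenvectors = eigenvectors[:, order]

explained = eigenvalues / eigenvalues.sum()    # share of variance per PC
scores = centered @ eigenvectors               # PC "images", one column per PC

# The numbers to inspect before trusting any PC image.
print("explained variance:", np.round(explained, 4))
print("loadings (rows = bands, columns = PCs):")
print(np.round(eigenvectors, 3))

# Which source does each PC follow?  (Sign of a PC is arbitrary, hence abs.)
corr_pc1_loud = abs(np.corrcoef(scores[:, 0], loud)[0, 1])
corr_pc2_subtle = abs(np.corrcoef(scores[:, 1], subtle)[0, 1])
print("PC1 vs loud:", round(corr_pc1_loud, 3))
print("PC2 vs subtle:", round(corr_pc2_subtle, 3))
```

In this toy case PC1 absorbs nearly all of the variance and follows the loud 
source, while the subtle source shows up in PC2.  With real data the split is 
rarely this clean, which is the reason to check the loadings.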
 
> is removing PC1 then back-inverting a suitable method for that sort of task? 

Short answer: yes, it can be, but back-inverting might not be necessary!

Longer story:  if the "subtle variations" (features of low(er) variance, 
rather homogeneous stuff) are, as expected, more evident (read: enhanced as 
compared to the original data set) in some of the higher order components, why 
bother to back-invert?  Supervised classification techniques can operate 
directly on selected PCs and attempt to extract whatever is of interest to you.
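
A rough sketch of classifying directly on a selected PC, assuming NumPy and 
synthetic data.  The two classes, the band mixing and the choice of PC2 are 
hypothetical, and the classifier is a plain minimum-distance-to-means rule 
standing in for whatever supervised method you prefer.

```python
import numpy as np

rng = np.random.default_rng(1)
n_pixels = 4_000

# Hypothetical ground truth: two classes that differ only in the subtle source.
labels = rng.integers(0, 2, n_pixels)
loud = rng.normal(0.0, 10.0, n_pixels)
subtle = 4.0 * labels + rng.normal(0.0, 1.0, n_pixels)

bands = np.column_stack([loud, 0.9 * loud + subtle, 1.1 * loud - subtle])
bands += rng.normal(0.0, 0.1, bands.shape)

# Forward PCA only; no back-inversion.
centered = bands - bands.mean(axis=0)
eigenvalues, eigenvectors = np.linalg.eigh(np.cov(centered, rowvar=False))
order = np.argsort(eigenvalues)[::-1]
scores = centered @ eigenvectors[:, order]

# Keep only the component where the subtle feature is expected (PC2 here).
selected = scores[:, [1]]

# First half of the pixels acts as training data, second half as test data.
train = np.arange(n_pixels) < n_pixels // 2
test = ~train

# Minimum-distance-to-means classifier in the selected PC space.
class_means = np.array(
    [selected[train & (labels == k)].mean(axis=0) for k in (0, 1)]
)
distances = np.linalg.norm(
    selected[test][:, None, :] - class_means[None, :, :], axis=2
)
predicted = distances.argmin(axis=1)

accuracy = (predicted == labels[test]).mean()
print("accuracy on held-out pixels:", round(float(accuracy), 3))
```

Which PCs to select is exactly the part that needs the loadings and 
eigenvalues; PC2 is the right choice here only because the data were built 
that way.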

More on the subject of back-inverting -- quoting from Dr. Koutsias' paper [3]:

"A critical issue in the back-transformation process is the
amount of information taken from each PC axis. The original
spectral pattern of the satellite image is modified to a degree that
depends on the amount of the information taken from each PC axis."

In this work (mapping burned areas), "back-transformation coefficients" in the 
range of 0 to 1 were worked out in order to 'grep' specific percentages (0 to 
100%) from each of the produced PCs and channel them back (via inverse PCA) to 
a data set _similar_ to the original one, though different to the extent of the 
removed information (the excluded PCs).
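
A minimal sketch of the idea, assuming NumPy.  The function name 
weighted_inverse_pca and the weights used below are hypothetical and are not 
the coefficients from the paper.  It also assumes that "taking a percentage" 
of a PC means multiplying its scores by a coefficient before the inverse 
transform, and it uses ordinary centered PCA rather than the non-centered 
variant of [2].

```python
import numpy as np

def weighted_inverse_pca(data, weights):
    """Forward PCA, scale each PC by a weight in [0, 1], then invert.

    data    : array of shape (n_pixels, n_bands)
    weights : one coefficient per PC, ordered by decreasing variance
              (1 keeps the PC fully, 0 drops it)
    """
    data = np.asarray(data, dtype=float)
    weights = np.asarray(weights, dtype=float)
    mean = data.mean(axis=0)
    centered = data - mean
    eigenvalues, eigenvectors = np.linalg.eigh(np.cov(centered, rowvar=False))
    order = np.argsort(eigenvalues)[::-1]
    eigenvectors = eigenvectors[:, order]
    scores = centered @ eigenvectors
    # Eigenvectors are orthonormal, so the inverse is simply the transpose.
    return (scores * weights) @ eigenvectors.T + mean

# Synthetic three-band example (made-up mixing, as an assumption).
rng = np.random.default_rng(0)
loud = rng.normal(0.0, 10.0, 5_000)
subtle = rng.normal(0.0, 1.0, 5_000)
bands = np.column_stack([loud, 0.9 * loud + subtle, 1.1 * loud - subtle])

unchanged = weighted_inverse_pca(bands, [1.0, 1.0, 1.0])  # keep everything
no_pc1 = weighted_inverse_pca(bands, [0.0, 1.0, 1.0])     # drop PC1 entirely
half_pc1 = weighted_inverse_pca(bands, [0.5, 1.0, 1.0])   # keep 50% of PC1

total_var = bands.var(axis=0).sum()
print("variance kept without PC1:", no_pc1.var(axis=0).sum() / total_var)
print("variance kept with half of PC1:", half_pc1.var(axis=0).sum() / total_var)
```

With all weights at 1 the original bands come back unchanged; any weight 
below 1 alters the spectral pattern, which is the "critical issue" the quote 
refers to.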

> or is there another more suitable method?

Dunno more... :-(
Kindest regards, Nikos

---
[1] Book: Pielou, E. C. The interpretation of ecological data: a primer on 
classification and ordination. Wiley, New York, 1984.

[2] Dissertation: Burned area mapping via non-centered PCA using Public 
Domain Data and Free Open Source Software. Institut für Forstökonomie, Fakultät 
für Forst- und Umweltwissenschaften, Albert-Ludwigs-Universität Freiburg, 
2011.

[3] Paper: Koutsias, N.; Mallinis, G. & Karteris, M. A forward/backward 
principal component analysis of Landsat-7 ETM+ data to enhance the spectral 
signal of burnt surfaces. ISPRS Journal of Photogrammetry and Remote Sensing, 
2009, 64, 37.

