[GRASS-dev] PCA question

Mon Jun 25 12:12:32 PDT 2012

Great news. Thanks.

Michael

On Jun 25, 2012, at 12:51 PM, Markus Metz wrote:

> On Mon, Jun 25, 2012 at 8:21 PM, Michael Barton <Michael.Barton at asu.edu> wrote:
>> On Jun 25, 2012, at 12:03 PM, Markus Metz wrote:
>>> 
>>> Here is an example based on on the landsat imagery in mapset landsat,
>>> North Carolina sample dataset:
>>> 
>>> i.pca without rescaling of the output:
>>> i.pca input=lsat7_2000_10,lsat7_2000_20,lsat7_2000_30
>>> output_prefix=lsat7_2000_pc rescale=0,0
>>> 
>>> The eigenvectors are
>>> 0.4494, 0.5128, 0.7315
>>> 0.6726, 0.3447,-0.6548
>>> 0.5879,-0.7863, 0.1901
>>> 
>>> this is a square 3x3 matrix. R gives the inverse matrix as
>>> 
>>> 0.4493372, 0.6726549, 0.5879235
>>> 0.5128129, 0.3446144,-0.7862660
>>> 0.7315070,-0.6548316, 0.1899992
>>> 
> [...]
>>> 
>>> Without using R, the inverse of a 3x3 matrix can be manually
>>> calculated as follows:
>>> 
>>> original matrix:
>>> a b c
>>> d e f
>>> g h k
>>> 
>>> inverse matrix:
>>> (ek - fh)   (ch - bk)   (bf - ce)
>>> (fg - dk)   (ak - cg)   (cd - af)
>>> (dh - eg)  (gb - ah)   (ae - bd)
>> 
>> Your example from R above *looks* like the inverse of matrix...
>> 
>> a b c
>> d e f
>> g h k
>> 
>> is (given rounding errors) ...
>> 
>> a d g
>> b e h
>> c f k
>> 
>> That is, transpose rows and columns.
>> Is this just a coincidence or a reasonable approximation?
> 
> Oops, right. A PCA uses an orthogonal transformation, in which case
> the inverse of the transformation matrix is identical to the transpose
> of the transformation matrix. Easy solution.
>> 
>>> 
>>> each entry in this inverse matrix needs to be multiplied with
>>> 
>>> 1 / (a(ek - fh) - b(kd - fg) + c(dh - eg))
>> 
>> by a(ek - fh)... , you mean a = factor score 1 of a cell, b = factor score 2 of a cell, c = factor score 3 of a cell (not eigenvectors a,b,c or the matrix), right?
> 
> No, a,b,c are here Eigenvector values. But this formula is the general
> case for the inverse of a 3x3 matrix and not needed here, because the
> inverse and the transpose are the same.
> 
> Markus M
> 
>> 
>>> 
>>> in order to get the transformation matrix to be applied to the pc scores.
>>> 
>>> The formula is more complex for matrices with more than 3 dimensions,
>>> e.g. for i.pca with 6 input bands.
>>> 
>>>> 
>>>> So my questions are:
>>>> 
>>>> 1) Do I have the inversion correct in terms of how it needs to be calculated in GRASS?
>>> 
>>> I don't think so. Using your formula and testing for the difference gives me
>>> min=-546.255208609669
>>> max=174.771853180844
>>> range=721.027061790513
>>> mean=79.9249815357317
>>> 
>>> a bit too much IMHO.
>> 
>> Right. Looks like I'll need to use r.univar to calculate the mean of each band to add back in. Thanks again. I will try this out manually and test before updating the pan sharpening routine.
>> 
>>> 
>>>> 2) Even though the mean of all bands seems to be subtracted from each factor score, does inverting the matrix mean that the mean of *each* band is added back to its transformed value? Adding either the mean of all original bands or 1 original band seems to produce values that are much higher than the original, and so need to be rescaled. Maybe this is OK.
>>> 
>>> The mean of each band is added back to each recalculated band. See
>>> above r.mapcalc formula for band 10.
>>> 
>>>> 3) I do not normalize or rescale in i.pca. This seems to make it easier to do the inverse PCA with fewer calculations. Is there any reason I *should* rescale and/or normalize?
>>> 
>>> Normalization applies to the input of i.pca and is by default not
>>> done. It needs to be done for heterogenous input data such as e.g.
>>> rainfall, temperature, NDVI, etc. Rescaling is automatically applied
>>> to the output of i.pca unless explicitly disabled with rescale=0,0.
>> 
>> Thanks. That is what I thought.
>> 
>> Michael
>> 
>>> 
>>> Markus M
>>> 
>>>> 
>>>> Thanks much
>>>> Michael
>>>> 
>>>> 
>>>> On Jun 25, 2012, at 10:11 AM, Markus Metz wrote:
>>>> 
>>>>> An inverse PCA can be regarded as the inverse of a transformation
>>>>> using matrix notation. PC scores are calculated with
>>>>> 
>>>>> b = A a
>>>>> 
>>>>> with A being the transformation matrix composed of the Eigenvectors, a
>>>>> being the vector of the original values and b the PC scores. What you
>>>>> now need is inverse of A, A^-1. The original values can then be
>>>>> retrieved with
>>>>> 
>>>>> A^-1 b = a
>>>>> 
>>>>> A^-1 is the inverse of the transformation matrix A which you can get
>>>>> in R with solve(A).
>>>>> 
>>>>> For a PCA, the original values are usually shifted to the mean and
>>>>> optionally scaled to stddev before computing the Eigenvectors. The
>>>>> mean shift is always performed by i.pca, scaling is optional. That
>>>>> means that A^-1 b gives the original values shifted to the mean and
>>>>> maybe scaled, and the mean of each original band needs to be added to
>>>>> get the original values used as input to i.pca. With scaling applied,
>>>>> the shifted values need to be multiplied by the stddev for each
>>>>> original band.
>>>>> 
>>>>> HTH,
>>>>> 
>>>>> Markus M
>>>>> 
>>>>> 
>>>>> On Tue, Jun 19, 2012 at 12:46 AM, Michael Barton <Michael.Barton at asu.edu> wrote:
>>>>>> The constant (i.e., the band mean) was in the pca primer that someone sent me a link to in this discussion. Using the Eigenvectors resulting from i.pca, I can achieve the results of i.pca using my formula below. This is essentially the same as your formula minus the constant--which doesn't really make much (of any) difference in the final result.
>>>>>> 
>>>>>> However, my question is about performing an *inverse pca*--getting back to the original values from the principal components maps. The idea of pca sharpening is that you perform a pca, then do an inverse pca substituting the pan band for pc1 to enhance the resolution. The equations I show below seem to work, but what I've read indicates that it is not possible to *exactly* get the original values back; you can only approximate them.
>>>>>> 
>>>>>> Michael
>>>>>> 
>>>>>> 
>>>>>> On Jun 17, 2012, at 10:48 AM, Duccio Rocchini wrote:
>>>>>> 
>>>>>>> Dear all,
>>>>>>> first, sorry for the delay...
>>>>>>> Here are my 2 cents to be added to the discussion. I re-took in my
>>>>>>> hands the John Jensen book.
>>>>>>> Accordingly
>>>>>>> 
>>>>>>> new brightness values1,1,1 = a1,1*BV1,1,1  +a2,1*BV1,1,2..... + an,1*BV1,1,m
>>>>>>> 
>>>>>>> where a=eigenvector and BV=original brightness value.
>>>>>>> 
>>>>>>> I found no evidence for the mean term: "- ((b1+b2+b3)/3)"
>>>>>>> 
>>>>>>> Michael: do you have a proof/reference for that?
>>>>>>> 
>>>>>>> P.S. thanks for involving me in this discussion which is really stimulating!
>>>>>>> 
>>>>>>> Duccio
>>>>>>> 
>>>>>>> 2012/6/7 Michael Barton <michael.barton at asu.edu>:
>>>>>>>> 
>>>>>>>> I think I've figured it out.
>>>>>>>> 
>>>>>>>> If (ev1-1, ev1-2, ev1-3) are the eigenvectors of the first principal component for 3 imagery bands (b1, b2, b3), the corresponding factor scores of the PC1, PC2, and PC3 maps (fs1, fs2, fs3) are calculated as:
>>>>>>>> 
>>>>>>>> fs1 = (ev1-1*b1) + (ev1-2*b2) + (ev1-3*b3) - ((b1+b2+b3)/3)
>>>>>>>> fs2 = (ev2-1*b1) + (ev2-2*b2) + (ev2-3*b3) - ((b1+b2+b3)/3)
>>>>>>>> fs3 = (ev3-1*b1) + (ev3-2*b2) + (ev3-3*b3) - ((b1+b2+b3)/3)
>>>>>>>> 
>>>>>>>> So to do an inverse PCA on the same data you need to do the following:
>>>>>>>> 
>>>>>>>> b1' = (fs1/ev1-1) + (fs2/ev2-1) + (fs3/ev3-1)
>>>>>>>> b2' = (fs1/ev1-2) + (fs2/ev2-2) + (fs3/ev3-2)
>>>>>>>> b3' = (fs1/ev1-3) + (fs2/ev2-3) + (fs3/ev3-3)
>>>>>>>> 
>>>>>>>> Adding the constant back on doesn't really seem to matter because you need to rescale b1' to b1, b2' to b2, and b3' to b3 anyway.
>>>>>>>> 
>>>>>>>> Michael
>>>>>>>> 
>>>>>>>> On Jun 7, 2012, at 1:55 AM, Markus Neteler wrote:
>>>>>>>> 
>>>>>>>>> Hi Duccio,
>>>>>>>>> 
>>>>>>>>> On Wed, Jun 6, 2012 at 11:39 PM, Michael Barton <michael.barton at asu.edu> wrote:
>>>>>>>>>> On Jun 6, 2012, at 2:20 PM, Markus Neteler wrote:
>>>>>>>>>>> On Wed, Jun 6, 2012 at 5:09 PM, Michael Barton <michael.barton at asu.edu> wrote:
>>>>>>>>> ...
>>>>>>>>>>>> I'm working on a pan sharpening script that will combine your i.fusion.brovey with options to do other pan sharpening methods. An IHS transformation is easy. I think that a PCA sharpening is doable too if I can find an equation to rotate the PCA channels back into unrotated space--since i.pca does provide the eigenvectors.
>>>>>>>>>>> 
>>>>>>>>>>> Maybe there is material in (see m.eigenvector)
>>>>>>>>>>> http://grass.osgeo.org/wiki/Principal_Components_Analysis
>>>>>>>>>> 
>>>>>>>>>> This has a lot of good information and ALMOST has what I need. Everything I read suggests that there is a straightforward way to get the original values from the factor scores if you have the eigenvectors. But no one I've yet found provides the equation or algorithm to do it.
>>>>>>>>> 
>>>>>>>>> @Duccio: any idea about this by chance?
>>>>>>>>> 
>>>>>>>>> thanks
>>>>>>>>> Markus
>>>>>>>> 
>>>>>>>> _____________________
>>>>>>>> C. Michael Barton
>>>>>>>> Visiting Scientist, Integrated Science Program
>>>>>>>> National Center for Atmospheric Research &
>>>>>>>> University Corporation for Atmospheric Research
>>>>>>>> 303-497-2889 (voice)
>>>>>>>> 
>>>>>>>> Director, Center for Social Dynamics & Complexity
>>>>>>>> Professor of Anthropology, School of Human Evolution & Social Change
>>>>>>>> Arizona State University
>>>>>>>> www: http://www.public.asu.edu/~cmbarton, http://csdc.asu.edu
>>>>>>>> 
>>>>>>> 
>>>>>>> 
>>>>>>> 
>>>>>>> --
>>>>>>> Duccio Rocchini, PhD
>>>>>>> 
>>>>>>> http://gis.cri.fmach.it/rocchini/
>>>>>>> 
>>>>>>> Fondazione Edmund Mach
>>>>>>> Research and Innovation Centre
>>>>>>> Department of Biodiversity and Molecular Ecology
>>>>>>> GIS and Remote Sensing Unit
>>>>>>> Via Mach 1, 38010 San Michele all'Adige (TN) - Italy
>>>>>>> Phone +39 0461 615 570
>>>>>>> ducciorocchini at gmail.com
>>>>>>> duccio.rocchini at fmach.it
>>>>>>> skype: duccio.rocchini
>>>>>> 
>>>>>> _____________________
>>>>>> C. Michael Barton
>>>>>> Visiting Scientist, Integrated Science Program
>>>>>> National Center for Atmospheric Research &
>>>>>> University Consortium for Atmospheric Research
>>>>>> 303-497-2889 (voice)
>>>>>> 
>>>>>> Director, Center for Social Dynamics & Complexity
>>>>>> Professor of Anthropology, School of Human Evolution & Social Change
>>>>>> Arizona State University
>>>>>> www: http://www.public.asu.edu/~cmbarton, http://csdc.asu.edu
>>>>>> 
>>>>>> 
>>>>>> 
>>>>>> 
>>>>>> 
>>>>>> _______________________________________________
>>>>>> grass-dev mailing list
>>>>>> grass-dev at lists.osgeo.org
>>>>>> http://lists.osgeo.org/mailman/listinfo/grass-dev
>>>> 
>>>> _____________________
>>>> C. Michael Barton
>>>> Visiting Scientist, Integrated Science Program
>>>> National Center for Atmospheric Research &
>>>> University Corporation for Atmospheric Research
>>>> 303-497-2889 (voice)
>>>> 
>>>> Director, Center for Social Dynamics & Complexity
>>>> Professor of Anthropology, School of Human Evolution & Social Change
>>>> Arizona State University
>>>> www: http://www.public.asu.edu/~cmbarton, http://csdc.asu.edu
>>>> 
>> 
>> _____________________
>> C. Michael Barton
>> Visiting Scientist, Integrated Science Program
>> National Center for Atmospheric Research &
>> University Corporation for Atmospheric Research
>> 303-497-2889 (voice)
>> 
>> Director, Center for Social Dynamics & Complexity
>> Professor of Anthropology, School of Human Evolution & Social Change
>> Arizona State University
>> www: http://www.public.asu.edu/~cmbarton, http://csdc.asu.edu
>> 

_____________________
C. Michael Barton
Visiting Scientist, Integrated Science Program
National Center for Atmospheric Research &
University Corporation for Atmospheric Research
303-497-2889 (voice)

Director, Center for Social Dynamics & Complexity 
Professor of Anthropology, School of Human Evolution & Social Change
Arizona State University
www: http://www.public.asu.edu/~cmbarton, http://csdc.asu.edu