[GRASS-dev] PCA question

Mon Jun 25 11:51:41 PDT 2012

On Mon, Jun 25, 2012 at 8:21 PM, Michael Barton <Michael.Barton at asu.edu> wrote:
> On Jun 25, 2012, at 12:03 PM, Markus Metz wrote:
>>
>> Here is an example based on on the landsat imagery in mapset landsat,
>> North Carolina sample dataset:
>>
>> i.pca without rescaling of the output:
>> i.pca input=lsat7_2000_10,lsat7_2000_20,lsat7_2000_30
>> output_prefix=lsat7_2000_pc rescale=0,0
>>
>> The eigenvectors are
>> 0.4494, 0.5128, 0.7315
>> 0.6726, 0.3447,-0.6548
>> 0.5879,-0.7863, 0.1901
>>
>> this is a square 3x3 matrix. R gives the inverse matrix as
>>
>> 0.4493372, 0.6726549, 0.5879235
>> 0.5128129, 0.3446144,-0.7862660
>> 0.7315070,-0.6548316, 0.1899992
>>
[...]
>>
>> Without using R, the inverse of a 3x3 matrix can be manually
>> calculated as follows:
>>
>> original matrix:
>> a b c
>> d e f
>> g h k
>>
>> inverse matrix:
>> (ek - fh)   (ch - bk)   (bf - ce)
>> (fg - dk)   (ak - cg)   (cd - af)
>> (dh - eg)  (gb - ah)   (ae - bd)
>
> Your example from R above *looks* like the inverse of matrix...
>
> a b c
> d e f
> g h k
>
> is (given rounding errors) ...
>
> a d g
> b e h
> c f k
>
> That is, transpose rows and columns.
> Is this just a coincidence or a reasonable approximation?

Oops, right. A PCA uses an orthogonal transformation, in which case
the inverse of the transformation matrix is identical to the transpose
of the transformation matrix. Easy solution.
>
>>
>> each entry in this inverse matrix needs to be multiplied with
>>
>> 1 / (a(ek - fh) - b(kd - fg) + c(dh - eg))
>
> by a(ek - fh)... , you mean a = factor score 1 of a cell, b = factor score 2 of a cell, c = factor score 3 of a cell (not eigenvectors a,b,c or the matrix), right?

No, a,b,c are here Eigenvector values. But this formula is the general
case for the inverse of a 3x3 matrix and not needed here, because the
inverse and the transpose are the same.

Markus M

>
>>
>> in order to get the transformation matrix to be applied to the pc scores.
>>
>> The formula is more complex for matrices with more than 3 dimensions,
>> e.g. for i.pca with 6 input bands.
>>
>>>
>>> So my questions are:
>>>
>>> 1) Do I have the inversion correct in terms of how it needs to be calculated in GRASS?
>>
>> I don't think so. Using your formula and testing for the difference gives me
>> min=-546.255208609669
>> max=174.771853180844
>> range=721.027061790513
>> mean=79.9249815357317
>>
>> a bit too much IMHO.
>
> Right. Looks like I'll need to use r.univar to calculate the mean of each band to add back in. Thanks again. I will try this out manually and test before updating the pan sharpening routine.
>
>>
>>> 2) Even though the mean of all bands seems to be subtracted from each factor score, does inverting the matrix mean that the mean of *each* band is added back to its transformed value? Adding either the mean of all original bands or 1 original band seems to produce values that are much higher than the original, and so need to be rescaled. Maybe this is OK.
>>
>> The mean of each band is added back to each recalculated band. See
>> above r.mapcalc formula for band 10.
>>
>>> 3) I do not normalize or rescale in i.pca. This seems to make it easier to do the inverse PCA with fewer calculations. Is there any reason I *should* rescale and/or normalize?
>>
>> Normalization applies to the input of i.pca and is by default not
>> done. It needs to be done for heterogenous input data such as e.g.
>> rainfall, temperature, NDVI, etc. Rescaling is automatically applied
>> to the output of i.pca unless explicitly disabled with rescale=0,0.
>
> Thanks. That is what I thought.
>
> Michael
>
>>
>> Markus M
>>
>>>
>>> Thanks much
>>> Michael
>>>
>>>
>>> On Jun 25, 2012, at 10:11 AM, Markus Metz wrote:
>>>
>>>> An inverse PCA can be regarded as the inverse of a transformation
>>>> using matrix notation. PC scores are calculated with
>>>>
>>>> b = A a
>>>>
>>>> with A being the transformation matrix composed of the Eigenvectors, a
>>>> being the vector of the original values and b the PC scores. What you
>>>> now need is inverse of A, A^-1. The original values can then be
>>>> retrieved with
>>>>
>>>> A^-1 b = a
>>>>
>>>> A^-1 is the inverse of the transformation matrix A which you can get
>>>> in R with solve(A).
>>>>
>>>> For a PCA, the original values are usually shifted to the mean and
>>>> optionally scaled to stddev before computing the Eigenvectors. The
>>>> mean shift is always performed by i.pca, scaling is optional. That
>>>> means that A^-1 b gives the original values shifted to the mean and
>>>> maybe scaled, and the mean of each original band needs to be added to
>>>> get the original values used as input to i.pca. With scaling applied,
>>>> the shifted values need to be multiplied by the stddev for each
>>>> original band.
>>>>
>>>> HTH,
>>>>
>>>> Markus M
>>>>
>>>>
>>>> On Tue, Jun 19, 2012 at 12:46 AM, Michael Barton <Michael.Barton at asu.edu> wrote:
>>>>> The constant (i.e., the band mean) was in the pca primer that someone sent me a link to in this discussion. Using the Eigenvectors resulting from i.pca, I can achieve the results of i.pca using my formula below. This is essentially the same as your formula minus the constant--which doesn't really make much (of any) difference in the final result.
>>>>>
>>>>> However, my question is about performing an *inverse pca*--getting back to the original values from the principal components maps. The idea of pca sharpening is that you perform a pca, then do an inverse pca substituting the pan band for pc1 to enhance the resolution. The equations I show below seem to work, but what I've read indicates that it is not possible to *exactly* get the original values back; you can only approximate them.
>>>>>
>>>>> Michael
>>>>>
>>>>>
>>>>> On Jun 17, 2012, at 10:48 AM, Duccio Rocchini wrote:
>>>>>
>>>>>> Dear all,
>>>>>> first, sorry for the delay...
>>>>>> Here are my 2 cents to be added to the discussion. I re-took in my
>>>>>> hands the John Jensen book.
>>>>>> Accordingly
>>>>>>
>>>>>> new brightness values1,1,1 = a1,1*BV1,1,1  +a2,1*BV1,1,2..... + an,1*BV1,1,m
>>>>>>
>>>>>> where a=eigenvector and BV=original brightness value.
>>>>>>
>>>>>> I found no evidence for the mean term: "- ((b1+b2+b3)/3)"
>>>>>>
>>>>>> Michael: do you have a proof/reference for that?
>>>>>>
>>>>>> P.S. thanks for involving me in this discussion which is really stimulating!
>>>>>>
>>>>>> Duccio
>>>>>>
>>>>>> 2012/6/7 Michael Barton <michael.barton at asu.edu>:
>>>>>>>
>>>>>>> I think I've figured it out.
>>>>>>>
>>>>>>> If (ev1-1, ev1-2, ev1-3) are the eigenvectors of the first principal component for 3 imagery bands (b1, b2, b3), the corresponding factor scores of the PC1, PC2, and PC3 maps (fs1, fs2, fs3) are calculated as:
>>>>>>>
>>>>>>> fs1 = (ev1-1*b1) + (ev1-2*b2) + (ev1-3*b3) - ((b1+b2+b3)/3)
>>>>>>> fs2 = (ev2-1*b1) + (ev2-2*b2) + (ev2-3*b3) - ((b1+b2+b3)/3)
>>>>>>> fs3 = (ev3-1*b1) + (ev3-2*b2) + (ev3-3*b3) - ((b1+b2+b3)/3)
>>>>>>>
>>>>>>> So to do an inverse PCA on the same data you need to do the following:
>>>>>>>
>>>>>>> b1' = (fs1/ev1-1) + (fs2/ev2-1) + (fs3/ev3-1)
>>>>>>> b2' = (fs1/ev1-2) + (fs2/ev2-2) + (fs3/ev3-2)
>>>>>>> b3' = (fs1/ev1-3) + (fs2/ev2-3) + (fs3/ev3-3)
>>>>>>>
>>>>>>> Adding the constant back on doesn't really seem to matter because you need to rescale b1' to b1, b2' to b2, and b3' to b3 anyway.
>>>>>>>
>>>>>>> Michael
>>>>>>>
>>>>>>> On Jun 7, 2012, at 1:55 AM, Markus Neteler wrote:
>>>>>>>
>>>>>>>> Hi Duccio,
>>>>>>>>
>>>>>>>> On Wed, Jun 6, 2012 at 11:39 PM, Michael Barton <michael.barton at asu.edu> wrote:
>>>>>>>>> On Jun 6, 2012, at 2:20 PM, Markus Neteler wrote:
>>>>>>>>>> On Wed, Jun 6, 2012 at 5:09 PM, Michael Barton <michael.barton at asu.edu> wrote:
>>>>>>>> ...
>>>>>>>>>>> I'm working on a pan sharpening script that will combine your i.fusion.brovey with options to do other pan sharpening methods. An IHS transformation is easy. I think that a PCA sharpening is doable too if I can find an equation to rotate the PCA channels back into unrotated space--since i.pca does provide the eigenvectors.
>>>>>>>>>>
>>>>>>>>>> Maybe there is material in (see m.eigenvector)
>>>>>>>>>> http://grass.osgeo.org/wiki/Principal_Components_Analysis
>>>>>>>>>
>>>>>>>>> This has a lot of good information and ALMOST has what I need. Everything I read suggests that there is a straightforward way to get the original values from the factor scores if you have the eigenvectors. But no one I've yet found provides the equation or algorithm to do it.
>>>>>>>>
>>>>>>>> @Duccio: any idea about this by chance?
>>>>>>>>
>>>>>>>> thanks
>>>>>>>> Markus
>>>>>>>
>>>>>>> _____________________
>>>>>>> C. Michael Barton
>>>>>>> Visiting Scientist, Integrated Science Program
>>>>>>> National Center for Atmospheric Research &
>>>>>>> University Corporation for Atmospheric Research
>>>>>>> 303-497-2889 (voice)
>>>>>>>
>>>>>>> Director, Center for Social Dynamics & Complexity
>>>>>>> Professor of Anthropology, School of Human Evolution & Social Change
>>>>>>> Arizona State University
>>>>>>> www: http://www.public.asu.edu/~cmbarton, http://csdc.asu.edu
>>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> --
>>>>>> Duccio Rocchini, PhD
>>>>>>
>>>>>> http://gis.cri.fmach.it/rocchini/
>>>>>>
>>>>>> Fondazione Edmund Mach
>>>>>> Research and Innovation Centre
>>>>>> Department of Biodiversity and Molecular Ecology
>>>>>> GIS and Remote Sensing Unit
>>>>>> Via Mach 1, 38010 San Michele all'Adige (TN) - Italy
>>>>>> Phone +39 0461 615 570
>>>>>> ducciorocchini at gmail.com
>>>>>> duccio.rocchini at fmach.it
>>>>>> skype: duccio.rocchini
>>>>>
>>>>> _____________________
>>>>> C. Michael Barton
>>>>> Visiting Scientist, Integrated Science Program
>>>>> National Center for Atmospheric Research &
>>>>> University Consortium for Atmospheric Research
>>>>> 303-497-2889 (voice)
>>>>>
>>>>> Director, Center for Social Dynamics & Complexity
>>>>> Professor of Anthropology, School of Human Evolution & Social Change
>>>>> Arizona State University
>>>>> www: http://www.public.asu.edu/~cmbarton, http://csdc.asu.edu
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> _______________________________________________
>>>>> grass-dev mailing list
>>>>> grass-dev at lists.osgeo.org
>>>>> http://lists.osgeo.org/mailman/listinfo/grass-dev
>>>
>>> _____________________
>>> C. Michael Barton
>>> Visiting Scientist, Integrated Science Program
>>> National Center for Atmospheric Research &
>>> University Corporation for Atmospheric Research
>>> 303-497-2889 (voice)
>>>
>>> Director, Center for Social Dynamics & Complexity
>>> Professor of Anthropology, School of Human Evolution & Social Change
>>> Arizona State University
>>> www: http://www.public.asu.edu/~cmbarton, http://csdc.asu.edu
>>>
>
> _____________________
> C. Michael Barton
> Visiting Scientist, Integrated Science Program
> National Center for Atmospheric Research &
> University Corporation for Atmospheric Research
> 303-497-2889 (voice)
>
> Director, Center for Social Dynamics & Complexity
> Professor of Anthropology, School of Human Evolution & Social Change
> Arizona State University
> www: http://www.public.asu.edu/~cmbarton, http://csdc.asu.edu
>