[GRASS-user] Pansharpening using PCA in R

Mon Apr 5 17:13:24 EDT 2010

Georg Kaspar wrote:
> Hi,
> I'm trying to apply pansharpening to a set of 4 quickbird channels by 
> exchanging the first principal component with the panchromatic channel. 
> Since an inverse PCA is not possible using GRASS (alone) 

Hi Georg!

Theoretically, it takes to multiply the transpose eigenvector matrix
with a back-transformation coefficient. If you use all PC's then you
would get the original data back (right?). If, instead, you do something
with your data before the back-transformation, then you'll get a "new"
spectral space.

Not sure though ;-)

> I used the 
> following workflow:
> 
> R:

> 1. import data into R using spgrass6
> 2. eliminate NAs

* Dunno if it helps here, but some time ago I used the complete.cases()
function which will return a vector with the "TRUE" value for all rows
that are complete (=no NA's). I remember that it was very handy.

> 3. calculate covariance matrix using cov
> 4. calculate eigenvalues and eigenvectors using eigen

In R, you can perform PCA using "princomp()" (the so-called
"eigenvector" solution) or using "prcomp()" (the SVD solution) which is
meant to give higher numerical accuracy.

Then you can export the PC's back to grass.

> GRASS:
> 
> 1. calculate pc-layers by multiplying initial layers with eigenvector * 
> sqrt(eigenvalue)

> 2. since the pc-layers seem to be inverted, create an inverted version 
> of the panchromatic channel rescaled to the range of first pc.

If you don't mind could you explain more on this ("inverted" ? ).

> 3. set region to resolution of panchromatic channel

> 4. calculate pansharpened channels by inverting first step and replacing 
> 1st pc with pan-channel

> Problem is, that the results do not seem to be pansharpened at all...
> Would be nice if someone could have a look at the attached workflow to 
> see if there are any obvious mistakes...

> Best regards & thanks,
> Georg

---

> plain text document attachment (1_pca_r.txt)
> library(spgrass6)
> G = gmeta6()
> # import data from grass
> qbird = readRAST6(c('szene1_mul.1', 'szene1_mul.2', 'szene1_mul.3', 'szene1_mul.4'))

> # work around NAs
> qbird.nas = which(is.na(qbird at data$szene1_mul.1) & is.na(qbird at data$szene1_mul.2) & is.na(qbird at data$szene1_mul.3) & is.na(qbird at data$szene1_mul.4))
> qbird.values = which(!is.na(qbird at data$szene1_mul.1) & !is.na(qbird at data$szene1_mul.2) & !is.na(qbird at data$szene1_mul.3) & !is.na(qbird at data$szene1_mul.4))
> qbird.nonas = qbird at data[qbird.values, ]

--%<---
Here you can use complete.cases(), e.g.:

# make a list first!
qbird.bands <- c(
	'szene1_mul.1',
	'szene1_mul.2',
	'szene1_mul.3',
	'szene1_mul.4')

# import the data
qbird.original <- readRAST6 ( qbird.bands, NODATA=-999999 )

# use complete.cases()
qbird.na_mask <- complete.cases ( qbird )

# get values & check structure
qbird <- qbird.original at data[ qbird.na_mask, ]
str( qbird )

# you may want to add new columns to the sp object for later use:
qbird.original at data$pc1 <- NA
qbird.original at data$pc2 <- NA
qbird.original at data$pc3 <- NA
qbird.original at data$pc4 <- NA

> # calculate covariance matrix
> qbird.cov = cov(qbird.nonas)
> # calculate eigenvalues and eigenvectors of covariance matrix
> eigen(qbird.cov)
> $values
> [1] 20456.53596  6349.30655   132.17551    29.54617
> 
> $vectors
>             [,1]       [,2]        [,3]        [,4]
> [1,]  0.14080443 -0.6333085  0.74920435 -0.13336911
> [2,]  0.11099650 -0.6563785 -0.47295220  0.57720293
> [3,]  0.07533621 -0.3617352 -0.46332379 -0.80548320
> [4,] -0.98090690 -0.1929644  0.01844232 -0.01569314
> plain text document attachment (2_pca_grass.txt)

--%<---
# Perform PCA based on the "Eigenvector solution" using the covariance

# princomp()
qbird.eig.cov <- princomp ( qbird )

# check results / keep only 5 digits
print ( qbird$loadings, digits=5, cutoff=0.001 )
summary ( qbird.eig.cov )

# fill-in the PC's
qbird.original at data$pc1[qbird.na_mask] <- qbird.eig.cov
$scores[,"Comp.1"]
#...and so on for pc2, pc3 and pc4

# write grass raster maps
writeRAST6 ( qbird.original, zcol=5 , vname="qbird.eig.cov.pc.1",
overwrite = TRUE )

> # set region to match extend and resolution of multispectral data
> g.region rast=szene1_mul.1

You should have done this prior to importing grass-maps in R. I think
_this_ mistake pushed me close to my pain-limit not long ago :D

> # calculate pc channels based on pca results in R
> r.mapcalc "szene1_pca.1 = 20.13874 * szene1_mul.1 + 15.87542 * szene1_mul.2 + 10.77506 * szene1_mul.3 - 140.29553 * szene1_mul.4"
> r.mapcalc "szene1_pca.2 = -50.46363 * szene1_mul.1 - 52.30191 * szene1_mul.2 - 28.82398 * szene1_mul.3 - 15.37589 * szene1_mul.4"
> r.mapcalc "szene1_pca.3 = 8.6134232 * szene1_mul.1 - 5.4374184 * szene1_mul.2 - 5.3267228 * szene1_mul.3 + 0.2120269 * szene1_mul.4"
> r.mapcalc "szene1_pca.4 = -0.72494637 * szene1_mul.1 + 3.13746694 * szene1_mul.2 - 4.37831615 * szene1_mul.3 - 0.08530223 * szene1_mul.4"

PC's already there if you create with princomp() or prcomp() and export
them in grass (as in the example above).

Below I can't really tell the Truth :-)

> # set region to match resolution of panchromatic data 
> g.region res=0.6

just curious:
what does "r.info -r PanchromaticRaster" say?
why not use "g.region rast=PanchromaticRaster -pa" ?

> # since 1st pc seems to be inverted when compared to panchromatic channel, an inverted copy of the panchromatic channel is created
> r.mapcalc "szene1_pan_inv=szene1_pan*-1+2047"

Can you use "*-" in r.mapcalc?
Nikos

> # data range of panchromatic channel is set to match the 1st pc
> r.info -r szene1_pca.1
> 	min=-219802.73799
> 	max=-4871.05689
> r.rescale in=szene1_pan_inv out=szene1_pan_inv_re to=-219802.73799,-4871.05689

> # calculate pansharpened channels
> r.mapcalc "szene1_mul_pan.1=20.13874 * szene1_pan_inv_re - 90.57980 * szene1_pca.2 + 107.15596 * szene1_pca.3 - 19.07530 * szene1_pca.4"
> r.mapcalc "szene1_mul_pan.2=8.844483 * szene1_pan_inv_re - 52.301905 * szene1_pca.2 - 37.686031 * szene1_pca.3 + 45.992993 * szene1_pca.4"
> r.mapcalc "szene1_mul_pan.3=0.8661224 * szene1_pan_inv_re - 4.1587827 * szene1_pca.2 - 5.3267228 * szene1_pca.3 - 9.2604477 * szene1_pca.4"
> r.mapcalc "szene1_mul_pan.4=-5.33185610 * szene1_pan_inv_re - 1.04888494 * szene1_pca.2 + 0.10024578 * szene1_pca.3 - 0.08530223 * szene1_pca.4"