[GRASS-dev] r.regression.linear

Helena Mitasova hmitaso at unity.ncsu.edu
Mon Oct 22 23:40:51 EDT 2007


I just found that r.regression.linear has various terminology problems
that may cause confusion about what it actually computes. One can look
at the script to check the exact equations, but how many users actually
do that?

Do we have somebody with a good knowledge of English statistics
terminology who could fix it (I don't trust mine very much)?

For example,

1. help says:
R: sumXY - sumX*sumY/tot

but the script computes the "correlation coefficient" (e.g. J.H. Zar:
Biostatistical Analysis), which does not imply dependence of Y on X;
it is rather a measure of the intensity of association between X and Y:

R = (sumXY - sumX*sumY/tot)/((sumsqX - sumX^2/tot)*(sumsqY - sumY^2/tot))^0.5;

so should the help say

R: correlation coefficient

rather than list the initial part of the equation used?
(for linear regression it should be R^2)
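
The gap between the help text and the script in item 1 can be sketched in Python. This is only an illustration, not the script itself: the function names and the sample data are hypothetical, and only the variable names (tot, sumX, sumY, sumsqX, sumsqY, sumXY) are taken from the equations quoted above.

```python
import math

def regression_sums(xs, ys):
    """Accumulate the sums named in the quoted equations (hypothetical helper)."""
    return {
        "tot": len(xs),
        "sumX": sum(xs),
        "sumY": sum(ys),
        "sumsqX": sum(x * x for x in xs),
        "sumsqY": sum(y * y for y in ys),
        "sumXY": sum(x * y for x, y in zip(xs, ys)),
    }

def help_text_value(s):
    """What the help text lists for R: only the numerator of the equation."""
    return s["sumXY"] - s["sumX"] * s["sumY"] / s["tot"]

def correlation_coefficient(s):
    """What the quoted equation computes: the correlation coefficient R."""
    var_x_term = s["sumsqX"] - s["sumX"] ** 2 / s["tot"]
    var_y_term = s["sumsqY"] - s["sumY"] ** 2 / s["tot"]
    return help_text_value(s) / math.sqrt(var_x_term * var_y_term)

# Made-up sample data with a perfect linear relation y = 2x
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.0, 4.0, 6.0, 8.0]
s = regression_sums(xs, ys)
print(help_text_value(s))          # numerator only
print(correlation_coefficient(s))  # R
print(correlation_coefficient(s) ** 2)  # R^2, as used for linear regression
```

For perfectly linear data the numerator depends on the scale of the inputs, while R stays within [-1, 1], which is why labelling the output "correlation coefficient" would be less confusing.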

2.
map1 should be the map for the x-variable (if we talk about linear
regression, the map for the independent variable x)
map2 should be the map for the y-variable (if we talk about linear
regression, the map for the dependent variable y)

instead of map for x coefficient and map for y coefficient

3. It says median, but it actually computes the mean?

4. What is this called? F = R^2/(1-R^2/tot-2)
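
A small Python sketch of how the expression in item 4 evaluates. The first function follows the expression exactly as written, using ordinary operator precedence. The second is the standard F statistic for simple linear regression, F = R^2/((1-R^2)/(n-2)); that this is what the expression was meant to be is an assumption, not something confirmed by the script. The function names and the sample values are hypothetical.

```python
def f_as_written(R, tot):
    """Evaluate R^2/(1-R^2/tot-2) literally: division binds tighter than subtraction."""
    return R ** 2 / (1 - R ** 2 / tot - 2)

def f_statistic(R, tot):
    """Standard F for simple linear regression (assumed intent),
    with 1 and tot-2 degrees of freedom."""
    return R ** 2 / ((1 - R ** 2) / (tot - 2))

# Made-up values: R = 0.8 from tot = 12 cells
print(f_as_written(0.8, 12))  # negative, so not usable as an F value
print(f_statistic(0.8, 12))   # about 17.78
```

If the second form is indeed what was intended, the expression would need extra parentheses around (1-R^2) and (tot-2).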

There may be more, so it would be great if somebody with stats
expertise could look at it - it is quite a short and simple (but
useful) script.

Helena

Helena Mitasova
Dept. of Marine, Earth and Atm. Sciences
1125 Jordan Hall, NCSU Box 8208,
Raleigh NC 27695
http://skagit.meas.ncsu.edu/~helena/






