[GRASS-SVN] r68248 - grass-addons/grass7/raster/r.vif

svn_grass at osgeo.org
Mon Apr 11 07:33:47 PDT 2016


Author: pvanbosgeo
Date: 2016-04-11 07:33:47 -0700 (Mon, 11 Apr 2016)
New Revision: 68248

Modified:
   grass-addons/grass7/raster/r.vif/r.vif.html
   grass-addons/grass7/raster/r.vif/r.vif.py
Log:
r.vif addon: correct how output is printed to the console; improve the help page

Modified: grass-addons/grass7/raster/r.vif/r.vif.html
===================================================================
--- grass-addons/grass7/raster/r.vif/r.vif.html	2016-04-11 12:07:20 UTC (rev 68247)
+++ grass-addons/grass7/raster/r.vif/r.vif.html	2016-04-11 14:33:47 UTC (rev 68248)
@@ -1,20 +1,24 @@
 <h2>DESCRIPTION</h2>
 
-The <em>r.vif</em> module computes a stepwise variance inflation factor 
-(VIF) and the square root of the VIF. The VIF quantifies how much 
+The <em>r.vif</em> module computes the variance inflation factor 
+(<a href="https://en.wikipedia.org/wiki/Variance_inflation_factor">VIF</a>)
+[1] and the square root of the VIF. The VIF quantifies how much 
 the variance (the square of the estimate's standard deviation) of an 
 estimated regression coefficient is increased because of 
-collinearity. The square root of VIF is a measure of how much larger 
+<a href="https://en.wikipedia.org/wiki/Multicollinearity">multicollinearity</a>.
+The square root of VIF is a measure of how much larger 
 the standard error is, compared with what it would be if that 
 variable were uncorrelated with the other predictor variables in the 
 model. 
 
-<p>By default the VIF is calculated for each variable. If the user 
-sets a VIF threshold value (maxvif) the VIF will be calculated again 
-after removing the variable with the highest VIF. This will be 
-repeated till the VIF is smaller than maxvif. This can thus be used 
-to select a sub-set of variables for e.g., multiple regression 
-analysis. 
+<p>By default the VIF is calculated for each variable. If the user
+sets a VIF threshold value (maxvif), a stepwise variable selection
+procedure [2] is used: after computing the VIF for each
+explanatory variable, the variable with the highest VIF is removed.
+Next, the VIF values are computed again for the reduced set of
+variables. This is repeated until the highest VIF is smaller than
+maxvif. The procedure can thus be used to select a subset of
+variables for, e.g., multiple regression analysis.
 
 <p>The user can optionally select one or more variables to be 
 retained in the stepwise selection. For example, the user selects 
@@ -85,11 +89,6 @@
 Statistics are written to results2.txt
 </pre></div>
 
-<h2>SEE ALSO</h2>
-
-This add-on depends on <em><a href="http://grass.osgeo.org/grass70/manuals/r.regression.multi.html">
-r.regression.multi</a></em>
-
 <h2>Citation</h2>
 
 Suggested citation:
@@ -100,6 +99,14 @@
 10.1007/s10021-015-9938-x.
 
 
+<h2>References</h2>
+[1] Graham, M.H. 2003. Confronting
+multicollinearity in ecological multiple regression. Ecology 84:
+2809–2815.
+<br>[2] Craney, T.A., &amp; Surles, J.G. 2002. Model-Dependent
+Variance Inflation Factor Cutoff Values. Quality Engineering 14:
+391–403.
+
 <h2>AUTHOR</h2>
 
 Paulo van Breugel, paulo at ecodiv.org
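
The stepwise selection described in the revised help page can be sketched roughly as follows. This is an illustration only, not the code of r.vif.py: `stepwise_vif`, `compute_vif`, `retain` and `toy_vif` are hypothetical names, and the VIF computation itself (done through r.regression.multi in the add-on) is abstracted into a callback. The stop condition (highest VIF smaller than maxvif) and the exclusion of retained variables from removal follow the text and diff above.

```python
def stepwise_vif(names, compute_vif, maxvif, retain=()):
    """Drop the highest-VIF variable until every removable VIF is below maxvif.

    compute_vif is a hypothetical callback: it takes a list of variable
    names and returns a dict mapping each name to its VIF.
    """
    names = list(names)
    while True:
        vifs = compute_vif(names)
        # Variables listed in `retain` are never candidates for removal.
        removable = [n for n in names if n not in retain]
        if not removable:
            return names, vifs
        worst = max(removable, key=lambda n: vifs[n])
        if vifs[worst] < maxvif:
            return names, vifs
        names.remove(worst)


def toy_vif(names):
    """Made-up VIF values: 'a' and 'b' are collinear while both are present."""
    if "a" in names and "b" in names:
        table = {"a": 12.0, "b": 11.0}
        return {n: table.get(n, 2.0) for n in names}
    return {n: 1.5 for n in names}


selected, final_vifs = stepwise_vif(["a", "b", "c"], toy_vif, maxvif=10)
kept, _ = stepwise_vif(["a", "b", "c"], toy_vif, maxvif=10, retain=("a",))
```

With the made-up values, the first call removes 'a'; the second call keeps 'a' because it is retained, and removes 'b' instead.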

Modified: grass-addons/grass7/raster/r.vif/r.vif.py
===================================================================
--- grass-addons/grass7/raster/r.vif/r.vif.py	2016-04-11 12:07:20 UTC (rev 68247)
+++ grass-addons/grass7/raster/r.vif/r.vif.py	2016-04-11 14:33:47 UTC (rev 68248)
@@ -120,12 +120,16 @@
     #=======================================================================
 
     # Calculate VIF and write results to text file
+    name_lengths = []
+    for i in IPF:
+        name_lengths.append(len(i))
+    nlength = max(name_lengths)
     if MXVIF =='':
         text_file = open(OPF, "w")
-        text_file.write("variable\tvif\tsqrtvif\n")
+        text_file.write("variable,vif,sqrtvif\n")
+        grass.info('{0[0]:{1}s} {0[1]:8s} {0[2]:8s}'.format(['variable', 'vif', 'sqrtvif'], nlength))
         for k in xrange(len(IPF)):
             MAPy = IPF[k]
-            nMAPy = IPFn[k]
             MAPx = IPF[:]
             del MAPx[k]
             vifstat = grass.read_command("r.regression.multi",
@@ -134,13 +138,15 @@
             vifstat = vifstat.split('\n')
             vifstat = [i.split('=') for i in vifstat]
             if float(vifstat[1][1]) > 0.9999999999:
-                rsqr = 0.9999999999
+                vif = float("inf")
+                sqrtvif = float("inf")
             else:
                 rsqr = float(vifstat[1][1])
-            vif = 1 / (1 - rsqr)
-            sqrtvif = math.sqrt(vif)
-            text_file.write(nMAPy + "\t" + str(round(vif, 3)) + "\t" + str(round(sqrtvif, 3)) + "\n")
-            print("VIF " + MAPy + " = " + str(vif))
+                vif = 1 / (1 - rsqr)
+                sqrtvif = math.sqrt(vif)
+            RES = [MAPy, vif, sqrtvif]
+            text_file.write('{0[0]}, {0[1]}, {0[2]}\n'.format(RES))
+            grass.info('{0[0]:{1}s} {0[1]:8.2f} {0[2]:8.2f}'.format(RES, nlength))
         text_file.close()
     else:
         text_file = open(OPF, "w")
@@ -153,9 +159,9 @@
             grass.info("----------------------------------------")
             rvif = np.zeros(len(IPF))
             text_file.write("variable\tvif\tsqrtvif\n")
+            grass.info('{0[0]:{1}s} {0[1]:8s} {0[2]:8s}'.format(['variable', 'vif', 'sqrtvif'], nlength))
             for k in xrange(len(IPF)):
                 MAPy = IPF[k]
-                nMAPy = IPFn[k]
                 MAPx = IPF[:]
                 del MAPx[k]
                 vifstat = grass.read_command("r.regression.multi",
@@ -164,17 +170,19 @@
                 vifstat = vifstat.split('\n')
                 vifstat = [i.split('=') for i in vifstat]
                 if float(vifstat[1][1]) > 0.9999999999:
-                    rsqr = 0.9999999999
+                    vif = float("inf")
+                    sqrtvif = float("inf")
                 else:
                     rsqr = float(vifstat[1][1])
-                vif = 1 / (1 - rsqr)
-                sqrtvif = math.sqrt(vif)
-                text_file.write(nMAPy + "\t" + str(round(vif, 3)) + "\t" + str(round(sqrtvif, 3)) + "\n")
+                    vif = 1 / (1 - rsqr)
+                    sqrtvif = math.sqrt(vif)
+                RES = [MAPy, vif, sqrtvif]
+                text_file.write('{0[0]}, {0[1]}, {0[2]}\n'.format(RES))
+                grass.info('{0[0]:{1}s} {0[1]:8.2f} {0[2]:8.2f}'.format(RES, nlength))
                 if IPFn[k] in IPRn:
                     rvif[k] = -9999
                 else:
                     rvif[k] = vif
-                print("VIF " + MAPy + " = " + str(vif))
 
             rvifmx = max(rvif)
             if rvifmx >= MXVIF:
@@ -198,7 +206,8 @@
         grass.info(', '.join(IPFn))
         grass.info("with as maximum VIF: " + str(rvifmx))
     grass.info("")
-    grass.info("Statistics are written to " + OPF)
+    if options['file'] != '':
+        grass.info("Statistics are written to " + OPF + "\n")
     grass.info("")
 
 if __name__ == "__main__":
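
For readers skimming the diff, the new per-variable computation and console formatting can be sketched in isolation as below. This is a minimal standalone illustration, assuming the same R-squared cutoff and format widths as in the patch; `vif_from_rsqr` and `format_row` are hypothetical helper names that do not exist in r.vif.py.

```python
import math


def vif_from_rsqr(rsqr, cutoff=0.9999999999):
    """Return (vif, sqrtvif) for a given R^2; both are infinite above the cutoff."""
    if rsqr > cutoff:
        return float("inf"), float("inf")
    vif = 1.0 / (1.0 - rsqr)
    return vif, math.sqrt(vif)


def format_row(name, vif, sqrtvif, width):
    """Left-align the name in `width` columns, then two 8-wide, 2-decimal numbers."""
    return "{0:{3}s} {1:8.2f} {2:8.2f}".format(name, vif, sqrtvif, width)


vif, sqrtvif = vif_from_rsqr(0.75)
row = format_row("bio1", vif, sqrtvif, 6)
inf_row = format_row("bio2", *vif_from_rsqr(1.0), width=6)
```

Reporting an infinite VIF, rather than capping R-squared at the cutoff as the previous revision did, makes a perfectly collinear variable stand out in the output instead of showing an arbitrary large number.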


