[GRASS-SVN] r66809 - grass-addons/grass7/vector/v.class.mlR

Wed Nov 11 10:19:42 PST 2015

Author: mlennert
Date: 2015-11-11 10:19:42 -0800 (Wed, 11 Nov 2015)
New Revision: 66809

Modified:
   grass-addons/grass7/vector/v.class.mlR/v.class.mlR.html
   grass-addons/grass7/vector/v.class.mlR/v.class.mlR.py
Log:
Adding some robustness and information messages and a bit more explanation to the manual


Modified: grass-addons/grass7/vector/v.class.mlR/v.class.mlR.html
===================================================================

--- grass-addons/grass7/vector/v.class.mlR/v.class.mlR.html	2015-11-11 17:57:59 UTC (rev 66808)
+++ grass-addons/grass7/vector/v.class.mlR/v.class.mlR.html	2015-11-11 18:19:42 UTC (rev 66809)
@@ -1,42 +1,56 @@
 <h2>DESCRIPTION</h2>
 
+<p>
 <em>v.class.mlR</em> uses machine learning functions in R to classify
 features in a vector map using training features in a second map for 
 supervised learning.
 
+<p>
 At the current stage it is just a quick and dirty hack to allow students to 
 do such classification in the framework of a course. It is meant as a very
 simplistic alternative to v.class.ml which can be a bit overwhelming for
 newbies.
 
+<p>
 Currently, only support vector machine classification is implemented, using
 the e1071 CRAN contrib package (which is automatically installed if necessary).
 The user has to choose between different kernel types. The module then goes
 through tuning across a range of possible parameters using 10-fold cross-
-validation. Optionally, the user can fix certain parameters so that they 
-will be excluded from tuning.
+validation. Optionally, the user can determine fixed values for certain 
+parameters so that they will be excluded from tuning. This speeds up the tuning
+process.
 
 <h2>NOTES</h2>
 
+<p>
 The module automatically excludes columns that contain empty values in the map
 of all features.
 
+<p>
+Running the same model with exactly the same specifications generally leads
+to differing classification errors. This is due to the cross-validation for
+which training and validation sets are drawn randomly. The user can run the
+same model call several times to get an idea of the variance of the error.
+
+<p>
 The module can be used in a tool chain together with <a href="i.segment.html">i.segment</a>
 and the addon <em>i.segment.stats</em> for object-based classification of 
 satellite imagery.
 
 <h2>TODO</h2>
 
+<p>
 If the module is deemed to deserve a longer life-span, then it should possibly
 be recoded to use rpy2 instead of simple text batch files for R (although this
 latter solution is quite useful in a computer lab where rpy2 is not installed).
 
+<p>
 Other classifiers should be included.
 
 <h2>EXAMPLE</h2>
 
 <div class="code"><pre>
-v.class.mlR segments training=basins_training2 classcol=classe outcol=classe_watershed
+v.class.mlR segments training=training_areas classcol=class outcol=class_linear kernel=linear
 </pre></div>
 
 <h2>SEE ALSO</h2>
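
The new NOTES paragraph above says that repeated runs give differing errors because the cross-validation folds are drawn randomly. A minimal sketch of that effect follows. It does not use the module or e1071: the toy 1-D nearest-class-mean classifier, the synthetic data and every function name are assumptions made only for illustration.

```python
import random
import statistics


def k_fold_indices(n, k, rng):
    """Shuffle the indices 0..n-1 and deal them into k folds."""
    idx = list(range(n))
    rng.shuffle(idx)
    return [idx[i::k] for i in range(k)]


def nearest_mean_error(points, labels, folds):
    """Cross-validated error rate of a toy nearest-class-mean classifier."""
    errors = 0
    for fold in folds:
        held_out = set(fold)
        # Estimate the class means from the training part only.
        sums, counts = {}, {}
        for i, (x, y) in enumerate(zip(points, labels)):
            if i not in held_out:
                sums[y] = sums.get(y, 0.0) + x
                counts[y] = counts.get(y, 0) + 1
        means = {y: sums[y] / counts[y] for y in sums}
        # Predict each held-out point as the class with the closest mean.
        for i in fold:
            predicted = min(means, key=lambda y: abs(points[i] - means[y]))
            errors += predicted != labels[i]
    return errors / len(points)


def repeated_cv_errors(points, labels, k=10, runs=5, seed=0):
    """Repeat k-fold cross-validation, redrawing the folds on every run."""
    rng = random.Random(seed)
    return [
        nearest_mean_error(points, labels, k_fold_indices(len(points), k, rng))
        for _ in range(runs)
    ]


# Synthetic, overlapping two-class data (hypothetical, not from the module).
data_rng = random.Random(42)
points = [data_rng.gauss(0.0, 1.0) for _ in range(60)]
points += [data_rng.gauss(1.5, 1.0) for _ in range(60)]
labels = ["a"] * 60 + ["b"] * 60

errors = repeated_cv_errors(points, labels)
print("error per run:", errors)
print("spread (population std dev):", statistics.pstdev(errors))
```

Only the fold assignment changes between runs here, so any spread in the printed errors comes from the random partitioning alone. That is the same source of variation the manual describes for the tuning step.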

Modified: grass-addons/grass7/vector/v.class.mlR/v.class.mlR.py
===================================================================
--- grass-addons/grass7/vector/v.class.mlR/v.class.mlR.py	2015-11-11 17:57:59 UTC (rev 66808)
+++ grass-addons/grass7/vector/v.class.mlR/v.class.mlR.py	2015-11-11 18:19:42 UTC (rev 66809)
@@ -1,7 +1,7 @@
 #!/usr/bin/env python
 ############################################################################
 #
-# MODULE:       v.class.Re1071svm.py
+# MODULE:       v.class.mlR
 # AUTHOR:       Moritz Lennert
 # PURPOSE:      Provides supervised machine learning based classification
 #               (using support vector machine from R package e1071)
@@ -44,36 +44,49 @@
 #% required: yes
 #%end
 #%option
+#% key: classifier
+#% type: string
+#% description: Classifier to use
+#% required: yes
+#% options: svm
+#% answer: svm
+#%end
+#%option
 #% key: kernel
 #% type: string
 #% description: Kernel to use
 #% required: yes
 #% options: linear,polynomial,radial,sigmoid
 #% answer: linear
+#% guisection: svm
 #%end
 #%option
 #% key: cost
 #% type: double
 #% description: cost value
 #% required: no
+#% guisection: svm
 #%end
 #%option
 #% key: degree
 #% type: double
 #% description: degree value (for polynomial kernel)
 #% required: no
+#% guisection: svm
 #%end
 #%option
 #% key: gamma
 #% type: double
 #% description: gamma value (for all kernels except linear)
 #% required: no
+#% guisection: svm
 #%end
 #%option
 #% key: coeff0
 #% type: double
 #% description: coeff0 value (for polynomial and sigmoid kernels)
 #% required: no
+#% guisection: svm
 #%end
 
 import atexit
@@ -130,19 +143,24 @@
     r_file = open(r_commands, 'w')
 
     install = "if(!is.element('e1071', installed.packages()[,1])) "
-    install += "{install.packages('e1071', "
+    install += "{cat('\n\nInstalling e1071 package from CRAN\n\n')\n"
+    install += "install.packages('e1071', "
     install += "repos='https://mirror.ibcp.fr/pub/CRAN/')}"
     r_file.write(install)
     r_file.write("\n")
     r_file.write('library(e1071)')
     r_file.write("\n")
+    r_file.write("cat('\nRunning R to tune and apply model...\n')")
+    r_file.write("\n")
     r_file.write('features<-read.csv("%s", sep="|", header=TRUE)' % feature_vars)
     r_file.write("\n")
     r_file.write("features<-features[sapply(features, function(x) !any(is.na(x)))]") 
     r_file.write("\n")
     r_file.write('training<-read.csv("%s", sep="|", header=TRUE)' % training_vars)
     r_file.write("\n")
-    r_file.write("training = data.frame(training[names(features)], classe=training$%s)" % classcol)
+    data_string = "training = data.frame(training[names(training)[names(training)"
+    data_string += "%%in%% names(features)]], classe=training$%s)" % classcol
+    r_file.write(data_string)
     r_file.write("\n")
     r_file.write("training$%s <- as.factor(training$%s)" % (classcol, classcol))
     r_file.write("\n")
@@ -193,7 +211,7 @@
     r_file.write("\n")
     r_file.write("cat(model$best.performance)")
     r_file.write("\n")
-    r_file.write("cat('\n\nSpecification of best model:')")
+    r_file.write("cat('\n\nTuning call and specification of best model:')")
     r_file.write("\n")
     r_file.write("print(model$best.model)")
     r_file.write("\n")
@@ -205,7 +223,6 @@
     r_file.write("\n")
     r_file.close()
 
-    grass.message('Running R to tune and apply "best" model')
     subprocess.call(['Rscript', r_commands])
 
     f = open(model_output_desc, 'w')
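
The data_string change above relies on Python %-formatting, where a literal percent sign has to be written as %%, to emit R's %in% operator. A minimal sketch of how that line is assembled, taken out of the module for illustration (the function name build_training_command is hypothetical, the two string literals are copied from the diff):

```python
def build_training_command(classcol):
    """Return the R line that keeps only training columns also in features."""
    # The first literal is not %-formatted, so it is written as-is.
    command = "training = data.frame(training[names(training)[names(training)"
    # The % operator applies to this literal only: '%%in%%' collapses to
    # R's %in% operator and '%s' is replaced by the class column name.
    command += "%%in%% names(features)]], classe=training$%s)" % classcol
    return command


print(build_training_command("class"))
# training = data.frame(training[names(training)[names(training)%in% names(features)]], classe=training$class)
```

Because the features data frame has already had its columns with missing values dropped at this point in the R script, selecting by name with %in% keeps the training columns in step with whatever survived that filter.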