[GRASS-SVN] r68173 - grass-addons/grass7/raster/r.randomforest

Mon Mar 28 10:37:45 PDT 2016

Author: spawley
Date: 2016-03-28 10:37:45 -0700 (Mon, 28 Mar 2016)
New Revision: 68173

Modified:
   grass-addons/grass7/raster/r.randomforest/r.randomforest.html
Log:
adding r2 to randomforest output

Modified: grass-addons/grass7/raster/r.randomforest/r.randomforest.html
===================================================================

--- grass-addons/grass7/raster/r.randomforest/r.randomforest.html	2016-03-28 17:33:41 UTC (rev 68172)
+++ grass-addons/grass7/raster/r.randomforest/r.randomforest.html	2016-03-28 17:37:45 UTC (rev 68173)
@@ -12,10 +12,8 @@
 
 <br><br>Random forest can also be run in regression mode by setting the <i>mode</i> to the regression option. You also can increase the generalization ability of the classifier by increasing minsplit, which represents the minimum number of samples required in order to split a node. The balanced and class_probabilities options are ignored for regression.
 
-<br><br>The module also offers the potential to save and load a random forests model. The model is saved as a list of filenames for each numpy array. This list can involve a large number of files, so it makes sense to save each model in a separate directory.
+<br><br>The module can also save and load a random forest model. The model is saved as a set of files, one per numpy array, beginning with a .pkl file (the .pkl extension is added automatically). This can involve a large number of files, so it makes sense to save each model in a separate directory. To load the model, select the .pkl file that was saved. Saving and loading a model is useful because it allows a model to be built on one imagery group (i.e. one set of predictor variables) and then used for prediction on other imagery groups. This approach is commonly employed in species prediction modelling and landslide susceptibility modelling, where a classification or regression model is built with one set of predictors (e.g. present-day climatic variables) and predictions are then performed on other imagery groups containing forecasted climatic variables. The names of the GRASS rasters in the imagery groups do not matter because scikit-learn saves the model as a series of numpy arrays. However, the new imagery group must contain the same number of rasters, and they must be in the same order as in the imagery group on which the model was built. For example, the new imagery group may contain a raster named 'mean_precipitation_2050' in place of the 'mean_precipitation_2016' raster in the imagery group that was used to build the model. 
 
-<br><br> Saving a model represents a useful feature because it allows a model to be built on one imagery group (ie. set of predictor variables), and then the prediction can be performed on other imagery groups. This approach is commonly employed in species prediction modelling, or landslide susceptibility modelling, where a classification or regression model is built with one set of predictors (e.g. which include present-day climatic variables) and then predictions can be performed on other imagery groups containing forecasted climatic variables. The names of the GRASS rasters in the imagery groups do not matter because scikit learn saves the model as a series of numpy arrays. However, the new imagery group must contain the same number of rasters, and they should be in the same order as in the imagery group upon which the model was built. As an example, the new imagery group may have a raster named 'mean_precipitation_2050' which substitutes the 'mean_precipitation_2016' in the imagery group that was used to build the model. 
-
 <h2>NOTES</h2>
 
 <em><b>r.randomforest</b></em> uses the scikit-learn machine learning python package, and the pandas package. These python packages need to be installed within your GRASS python environment for <em><b>r.randomforest</b></em> to work. For linux users, both of these packages should be available through the linux package manager in most distributions. For windows users, the easiest way of installing the packages is by using the precompiled binaries from <a href="http://www.lfd.uci.edu/~gohlke/pythonlibs/">Christoph Gohlke</a> and by using the <a href="https://grass.osgeo.org/download/software/ms-windows/">Osgeo4W</a> installation method of GRASS, where the python setuptools can also be installed. You can then use 'easy_install pip' to install the pip package manager. Then, you can download the NumPy-1.10+MKL, scikit-learn and pandas .whl files and install them using 'pip install packagename.whl'.
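
The save-and-load workflow described in the revised paragraph can be illustrated outside GRASS with a minimal scikit-learn sketch. It is not the module's own code. It assumes a recent scikit-learn with the standalone joblib package, and that the .pkl files are written with joblib-style persistence, which the commit does not confirm. The arrays, the file name rf_model.pkl and the three-predictor layout are hypothetical stand-ins for imagery groups.

```python
import os
import tempfile

import joblib
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.RandomState(0)

# Hypothetical "present-day" imagery group: 200 pixels x 3 predictor rasters.
X_present = rng.rand(200, 3)
y = 2.0 * X_present[:, 0] + X_present[:, 1] + 0.1 * rng.rand(200)

# Fit in regression mode; min_samples_split is scikit-learn's name for the
# minimum number of samples required to split a node.
model = RandomForestRegressor(n_estimators=50, min_samples_split=2,
                              random_state=0)
model.fit(X_present, y)

# Save the model in its own directory (assumed joblib persistence).
model_dir = tempfile.mkdtemp()
model_path = os.path.join(model_dir, "rf_model.pkl")
joblib.dump(model, model_path)

# Later: load the .pkl file and predict on a different "imagery group".
loaded = joblib.load(model_path)

# Hypothetical "forecast" group: same number of predictors, same column order.
X_future = rng.rand(50, 3)
predictions = loaded.predict(X_future)

# A group with a different number of predictors is rejected by scikit-learn.
try:
    loaded.predict(rng.rand(50, 2))
    mismatch_rejected = False
except ValueError:
    mismatch_rejected = True
```

The model only sees column positions, so raster names play no role here. A wrong number of predictors raises an error, but predictors supplied in the wrong order would be accepted silently and give misleading predictions, which is why the order of rasters in the new imagery group matters.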