[GRASS-SVN] r70965 - grass-addons/grass7/raster/r.learn.ml

svn_grass at osgeo.org
Thu Apr 27 09:19:37 PDT 2017


Author: spawley
Date: 2017-04-27 09:19:37 -0700 (Thu, 27 Apr 2017)
New Revision: 70965

Modified:
   grass-addons/grass7/raster/r.learn.ml/r.learn.ml.html
   grass-addons/grass7/raster/r.learn.ml/r.learn.ml.py
   grass-addons/grass7/raster/r.learn.ml/r_learn_utils.py
Log:
r.learn.ml: added KNeighborsClassifier

Modified: grass-addons/grass7/raster/r.learn.ml/r.learn.ml.html
===================================================================
--- grass-addons/grass7/raster/r.learn.ml/r.learn.ml.html	2017-04-27 06:50:59 UTC (rev 70964)
+++ grass-addons/grass7/raster/r.learn.ml/r.learn.ml.html	2017-04-27 16:19:37 UTC (rev 70965)
@@ -7,6 +7,8 @@
 	
 	<li><em>LinearDiscriminantAnalysis</em> and <em>QuadraticDiscriminantAnalysis</em> are classifiers with linear and quadratic decision surfaces. These classifiers do not take any parameters and are inherently multiclass. They can only be used for classification.</li>
 	
+	<li><em>KNeighborsClassifier</em> is a simple classification method that predicts the label of a sample from a predefined number of its closest training samples. Two hyperparameters are exposed in the GUI: <em>n_neighbors</em> sets the number of neighbors used to decide the predicted label, and <em>weights</em> specifies whether these neighbors have equal weights or are weighted by the inverse of their distance.</li>
+	
 	<li><em>GaussianNB</em> is the Gaussian Naive Bayes algorithm and can be used for classification only. Naive Bayes is a supervised learning algorithm based on applying Bayes theorem with the naive assumption of independence between every pair of features. This classifier does not take any parameters.</li>
 	
 	<li>The <em>DecisionTreeClassifier</em> and <em>DecisionTreeRegressor</em> map observations to a response variable using a hierarchy of splits and branches. The termini of these branches, termed leaves, represent the prediction of the response variable. Decision trees are non-parametric, can model non-linear relationships between a response and predictor variables, and are insensitive to the scaling of the predictors.</li>
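
Note on the KNeighborsClassifier entry above: the difference between the two <em>weights</em> settings can be illustrated with a small pure-Python sketch. This is not the scikit-learn implementation; the function name knn_predict and the toy data are hypothetical, and the handling of exact matches (zero distance) is one possible choice, labeled in the comments.

```python
import math
from collections import defaultdict


def knn_predict(train_X, train_y, query, n_neighbors=5, weights="uniform"):
    """Predict a label for `query` by voting among its nearest training samples."""
    # Euclidean distance from the query to every training sample
    dists = [(math.dist(x, query), label) for x, label in zip(train_X, train_y)]
    nearest = sorted(dists, key=lambda d: d[0])[:n_neighbors]

    # Assumption: an exact match would give an infinite inverse-distance
    # weight, so in 'distance' mode only zero-distance neighbors get to vote.
    if weights == "distance" and any(d == 0 for d, _ in nearest):
        nearest = [(d, label) for d, label in nearest if d == 0]
        weights = "uniform"

    votes = defaultdict(float)
    for d, label in nearest:
        # 'uniform': one vote per neighbor; 'distance': vote scaled by 1/d
        votes[label] += 1.0 if weights == "uniform" else 1.0 / d
    return max(votes, key=votes.get)


# Toy data: one 'a' sample close to the query, two 'b' samples further away
X = [(0.0, 0.0), (2.0, 0.0), (0.0, 2.0)]
y = ["a", "b", "b"]
query = (0.1, 0.0)

uniform_label = knn_predict(X, y, query, n_neighbors=3, weights="uniform")
distance_label = knn_predict(X, y, query, n_neighbors=3, weights="distance")
```

With equal weights the two distant 'b' samples outvote the single nearby 'a' sample, while inverse-distance weighting lets the nearby sample dominate.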

Modified: grass-addons/grass7/raster/r.learn.ml/r.learn.ml.py
===================================================================
--- grass-addons/grass7/raster/r.learn.ml/r.learn.ml.py	2017-04-27 06:50:59 UTC (rev 70964)
+++ grass-addons/grass7/raster/r.learn.ml/r.learn.ml.py	2017-04-27 16:19:37 UTC (rev 70965)
@@ -60,7 +60,7 @@
 #% label: Classifier
 #% description: Supervised learning model to use
 #% answer: RandomForestClassifier
-#% options: LogisticRegression,LinearDiscriminantAnalysis,QuadraticDiscriminantAnalysis,GaussianNB,DecisionTreeClassifier,DecisionTreeRegressor,RandomForestClassifier,RandomForestRegressor,ExtraTreesClassifier,ExtraTreesRegressor,GradientBoostingClassifier,GradientBoostingRegressor,SVC,EarthClassifier,EarthRegressor,XGBClassifier,XGBRegressor
+#% options: LogisticRegression,LinearDiscriminantAnalysis,QuadraticDiscriminantAnalysis,KNeighborsClassifier,GaussianNB,DecisionTreeClassifier,DecisionTreeRegressor,RandomForestClassifier,RandomForestRegressor,ExtraTreesClassifier,ExtraTreesRegressor,GradientBoostingClassifier,GradientBoostingRegressor,SVC,EarthClassifier,EarthRegressor,XGBClassifier,XGBRegressor
 #% guisection: Classifier settings
 #% required: no
 #%end
@@ -144,7 +144,24 @@
 #% multiple: yes
 #% guisection: Classifier settings
 #%end
+#%option integer
+#% key: n_neighbors
+#% label: Number of neighbors to use
+#% description: Number of neighbors to use
+#% answer: 5
+#% multiple: yes
+#% guisection: Classifier settings
+#%end
 #%option string
+#% key: weights
+#% label: Weight function
+#% description: Weight function for kNN prediction
+#% answer: uniform
+#% options: uniform,distance
+#% multiple: yes
+#% guisection: Classifier settings
+#%end
+#%option string
 #% key: grid_search
 #% label: Resampling method to use for hyperparameter optimization
 #% description: Resampling method to use for hyperparameter optimization
@@ -392,7 +409,9 @@
         'subsample': options['subsample'],
         'max_depth': options['max_depth'],
         'max_features': options['max_features'],
-        'max_degree': options['max_degree']
+        'max_degree': options['max_degree'],
+        'n_neighbors': options['n_neighbors'],
+        'weights': options['weights']
         }
 
     # cross validation
@@ -459,14 +478,17 @@
     hyperparams_type['C'] = float
     hyperparams_type['learning_rate'] = float
     hyperparams_type['subsample'] = float
+    hyperparams_type['weights'] = str
     param_grid = deepcopy(hyperparams_type)
     param_grid = dict.fromkeys(param_grid, None)
 
     for key, val in hyperparams.iteritems():
         # split any comma separated strings and add them to the param_grid
-        if ',' in val: param_grid[key] = [hyperparams_type[key](i) for i in val.split(',')]
+        if ',' in val:
+            param_grid[key] = [hyperparams_type[key](i) for i in val.split(',')]
         # else convert the single strings to int or float
-        else: hyperparams[key] = hyperparams_type[key](val)
+        else:
+            hyperparams[key] = hyperparams_type[key](val)
 
     if hyperparams['max_depth'] == 0: hyperparams['max_depth'] = None
     if hyperparams['max_features'] == 0: hyperparams['max_features'] = 'auto'
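
Note on the hunk above: the loop turns comma-separated option strings into lists of candidate values for tuning and converts single values to their declared type. A minimal Python 3 sketch of that idea follows. The function name split_hyperparams is hypothetical, the int type for n_neighbors is an assumption (the hunk only shows the str entry for weights being added), and unlike the committed code the sketch returns a new dictionary instead of updating hyperparams in place.

```python
def split_hyperparams(hyperparams, hyperparams_type):
    """Separate single-valued settings from comma-separated tuning grids."""
    fixed = {}
    # every known hyperparameter starts without a tuning grid
    param_grid = dict.fromkeys(hyperparams_type, None)

    for key, val in hyperparams.items():
        convert = hyperparams_type[key]
        if "," in val:
            # several values given: keep them as candidates for tuning
            param_grid[key] = [convert(i) for i in val.split(",")]
        else:
            # single value: convert the string to its declared type
            fixed[key] = convert(val)
    return fixed, param_grid


# Hypothetical option values as they would arrive from the parser (all strings)
raw = {"n_neighbors": "3,5,7", "weights": "uniform", "C": "1.0"}
types = {"n_neighbors": int, "weights": str, "C": float}  # int is assumed here

fixed, grid = split_hyperparams(raw, types)
```

Here n_neighbors ends up in the grid as three integer candidates, while weights and C are treated as fixed settings.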

Modified: grass-addons/grass7/raster/r.learn.ml/r_learn_utils.py
===================================================================
--- grass-addons/grass7/raster/r.learn.ml/r_learn_utils.py	2017-04-27 06:50:59 UTC (rev 70964)
+++ grass-addons/grass7/raster/r.learn.ml/r_learn_utils.py	2017-04-27 16:19:37 UTC (rev 70965)
@@ -436,6 +436,7 @@
     from sklearn.ensemble import GradientBoostingClassifier
     from sklearn.ensemble import GradientBoostingRegressor
     from sklearn.svm import SVC
+    from sklearn.neighbors import KNeighborsClassifier
 
     # convert balanced boolean to scikit learn method
     if weights is True:
@@ -559,6 +560,9 @@
             'GaussianNB': GaussianNB(),
             'LinearDiscriminantAnalysis': LinearDiscriminantAnalysis(),
             'QuadraticDiscriminantAnalysis': QuadraticDiscriminantAnalysis(),
+            'KNeighborsClassifier': KNeighborsClassifier(n_neighbors=p['n_neighbors'],
+                                                         weights=p['weights'],
+                                                         n_jobs=n_jobs)
         }
 
     # define classifier
@@ -575,7 +579,8 @@
         or estimator == 'QuadraticDiscriminantAnalysis' \
         or estimator == 'EarthClassifier' \
         or estimator == 'XGBClassifier' \
-        or estimator == 'SVC':
+        or estimator == 'SVC' \
+        or estimator == 'KNeighborsClassifier':
         mode = 'classification'
     else:
         mode = 'regression'
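
Note on the hunk above: each new classifier has to be appended to the chain of equality tests by hand. One possible alternative, sketched here in Python, is a set membership test. The names CLASSIFIERS and model_mode are hypothetical, and the set assumes that every entry in the module's options list that is not a Regressor should run in classification mode.

```python
# Classifier names taken from the `options:` list in r.learn.ml.py (rev 70965);
# treating all of them as classification-mode estimators is an assumption.
CLASSIFIERS = frozenset([
    "LogisticRegression",
    "LinearDiscriminantAnalysis",
    "QuadraticDiscriminantAnalysis",
    "KNeighborsClassifier",
    "GaussianNB",
    "DecisionTreeClassifier",
    "RandomForestClassifier",
    "ExtraTreesClassifier",
    "GradientBoostingClassifier",
    "SVC",
    "EarthClassifier",
    "XGBClassifier",
])


def model_mode(estimator):
    """Return 'classification' for known classifiers, else 'regression'."""
    return "classification" if estimator in CLASSIFIERS else "regression"


knn_mode = model_mode("KNeighborsClassifier")
rf_mode = model_mode("RandomForestRegressor")
```

Adding a further classifier would then only require a new entry in the set.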


