[GRASS-dev] RandomForest classifier for imagery groups add-on

Paulo van Breugel p.vanbreugel at gmail.com
Sat Mar 26 10:42:47 PDT 2016


Hi Steve

Great news! I gave it a quick try (on Ubuntu 14.04, GRASS 7 master). 
Size of the input raster layers: 1578 rows, 1436 columns

*1st try - input full map, classes 1/0*
I had to stop it because it took too much time. Stopping it did not stop
the Python processes, however; I had to kill them manually.

*2nd try - input random sample of 100 points, classes 1 (n=12) and 0
(n=88), with b flag*
r.randomforest -b igroup=predictors@SampleSize roi=test2
output=test2_output ntrees=500 mfeatures=-1 minsplit=2 randst=1 lines=100
Group <predictors> references the following raster maps:
Traceback (most recent call last):
  File "/home/paulo/.grass7/addons/scripts/r.randomforest", line 335, in <module>
    main()
  File "/home/paulo/.grass7/addons/scripts/r.randomforest", line 243, in main
    class_weight = "balanced", max_features = mfeatures, min_samples_split = minsplit, random_state = randst)
TypeError: __init__() got an unexpected keyword argument 'class_weight'
Removing raster <tmp_jNyNcqZa>
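
A likely reading of this traceback is that the installed scikit-learn
predates the `class_weight` argument of `RandomForestClassifier`, which
the -b flag appears to pass as "balanced". As far as is known here, that
argument arrived in scikit-learn 0.16 and Ubuntu 14.04 packages an older
release; both points are assumptions, not something the traceback states.
A minimal sketch of a guard a script could use is below. The helper names
(`accepts_keyword`, `build`) and the two stand-in classes are hypothetical
and are not taken from r.randomforest.

```python
import inspect


def accepts_keyword(cls, name):
    """Return True if the constructor of cls accepts keyword argument `name`."""
    params = inspect.signature(cls.__init__).parameters
    if name in params:
        return True
    # A **kwargs catch-all accepts any keyword as well.
    return any(p.kind is inspect.Parameter.VAR_KEYWORD for p in params.values())


def build(cls, **kwargs):
    """Instantiate cls, silently dropping keyword arguments it does not support."""
    supported = {k: v for k, v in kwargs.items() if accepts_keyword(cls, k)}
    return cls(**supported)


# Hypothetical stand-ins for an older and a newer estimator class.
class OldForest:
    def __init__(self, n_estimators=10, max_features="auto"):
        self.n_estimators = n_estimators
        self.max_features = max_features


class NewForest:
    def __init__(self, n_estimators=10, max_features="auto", class_weight=None):
        self.n_estimators = n_estimators
        self.max_features = max_features
        self.class_weight = class_weight


old = build(OldForest, n_estimators=500, class_weight="balanced")
new = build(NewForest, n_estimators=500, class_weight="balanced")
```

With a real installation, the same check could be applied to
`sklearn.ensemble.RandomForestClassifier` before passing `class_weight`,
or the script could report that balancing needs a newer scikit-learn.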

*3rd try - input random sample of 100 points, classes 1 (n=12) and 0
(n=88), without b flag*
r.randomforest igroup=predictors@SampleSize roi=test2
output=test2_output ntrees=500 mfeatures=-1 minsplit=2 randst=1 lines=100
Group <predictors> references the following raster maps:
Our OOB prediction of accuracy is: 89.0%
                   Raster  Importance
0   bio1_wc30s@SampleSize    0.183670
1   bio2_wc30s@SampleSize    0.139914
2   bio3_wc30s@SampleSize    0.105035
3   bio4_wc30s@SampleSize    0.106413
4  bio13_wc30s@SampleSize    0.087399
5  bio14_wc30s@SampleSize    0.146495
6     dm_wc30s@SampleSize    0.104575
7   llds_wc30s@SampleSize    0.126499
Removing raster <tmp_RhTllKlA>
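
To put these numbers in context, the short sketch below ranks the
importance scores printed above and compares the reported OOB accuracy
with the accuracy of always predicting the majority class. With 88 of
100 samples in class 0, that baseline is already 88%, so an OOB accuracy
of 89% may say little about how well presences are predicted. The
function and variable names are made up for illustration; the values are
copied from the output above.

```python
# Importance scores as printed by the 3rd run (mapset suffix omitted).
importances = {
    "bio1_wc30s": 0.183670,
    "bio2_wc30s": 0.139914,
    "bio3_wc30s": 0.105035,
    "bio4_wc30s": 0.106413,
    "bio13_wc30s": 0.087399,
    "bio14_wc30s": 0.146495,
    "dm_wc30s": 0.104575,
    "llds_wc30s": 0.126499,
}

# Sort predictors from most to least important.
ranked = sorted(importances.items(), key=lambda item: item[1], reverse=True)
total_importance = sum(importances.values())


def majority_baseline_accuracy(class_counts):
    """Accuracy obtained by always predicting the most frequent class."""
    return max(class_counts.values()) / sum(class_counts.values())


class_counts = {1: 12, 0: 88}  # sample sizes from the 2nd and 3rd runs
baseline = majority_baseline_accuracy(class_counts)
oob_accuracy = 0.89  # "Our OOB prediction of accuracy is: 89.0%"

for name, score in ranked:
    print(f"{name:12s} {score:.6f}")
print(f"importances sum to {total_importance:.6f}")
print(f"majority baseline {baseline:.2%}, OOB accuracy {oob_accuracy:.2%}")
```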

*Questions*
* I am using it for species distribution modeling (presence/absence
input map), but I would prefer to use regression mode. Is there a way to
force the tool to use regression mode?
* Are you planning to implement other classification methods? If this
works, it seems it shouldn't be too hard to replace the random forest
method with any of the other methods in scipy. I have for some time been
thinking about using scipy, but my programming skills are not up to
standard. Perhaps it would be easier to use your add-on as a template?
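
For reference, both questions map onto scikit-learn fairly directly,
since its estimators share the same `fit`/`predict` interface. The sketch
below uses synthetic data (12 presences, 88 absences, 8 predictors; all
values invented) and shows two ways to get a continuous output from 0/1
labels: a `RandomForestRegressor` fitted to the labels as numbers, and
the class-1 probability from a `RandomForestClassifier`. It is not how
r.randomforest is written, and whether the add-on exposes either option
is not known here. It uses 100 trees rather than the 500 from the runs
above, only to keep it quick.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier, RandomForestRegressor

# Synthetic stand-in for the sampled pixels (values are made up).
rng = np.random.RandomState(1)
X = rng.rand(100, 8)          # 100 samples, 8 predictors
y = np.zeros(100, dtype=int)  # 88 absences ...
y[:12] = 1                    # ... and 12 presences
X[:12, 0] += 1.0              # make the first predictor informative

# Regression mode: treat the 0/1 labels as a numeric response.
reg = RandomForestRegressor(n_estimators=100, min_samples_split=2, random_state=1)
reg.fit(X, y)
suitability_reg = reg.predict(X)

# Classification mode, but returning the probability of class 1
# instead of the hard 0/1 label.
clf = RandomForestClassifier(n_estimators=100, min_samples_split=2, random_state=1)
clf.fit(X, y)
suitability_clf = clf.predict_proba(X)[:, 1]
```

Swapping in another method would, in principle, mean replacing the
estimator class while leaving the `fit` and `predict` calls unchanged.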

Cheers,

Paulo




On Sat, Mar 26, 2016 at 5:40 PM, Steven Pawley
<dr.stevenpawley at gmail.com> wrote:

    Hello developers,

    I would like to draw your attention to a new GRASS add-on,
    r.randomforest, which uses the scikit-learn and pandas Python
    packages to classify GRASS rasters. Similar to existing GRASS
    classification methods, it uses an imagery group and a raster of
    labelled pixels as the inputs for the classification. It reads the
    rasters row by row and passes them to the classifier in bundles
    whose size is set by a user-specified row increment. This keeps
    memory requirements low while still allowing efficient
    classification: the scikit-learn implementation is multithreaded
    by default, and predicting one row at a time results in too much
    stop-start behaviour. The feature importance scores and out-of-bag
    error are displayed in the command window.
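
The row bundling described here can be sketched as a generator of row
ranges. This is an illustration only: `row_blocks` is a hypothetical
name, and the numbers are the raster height (1578 rows) and `lines=100`
setting from the test runs in the reply above.

```python
def row_blocks(n_rows, lines):
    """Yield (start, stop) row index pairs covering n_rows in blocks of `lines` rows."""
    if lines < 1:
        raise ValueError("lines must be >= 1")
    for start in range(0, n_rows, lines):
        # The last block is shorter when n_rows is not a multiple of lines.
        yield start, min(start + lines, n_rows)


blocks = list(row_blocks(1578, 100))
print(len(blocks), blocks[0], blocks[-1])
```

Each block would be read from the rasters, reshaped into a table of
pixels by predictors, and passed to the classifier in one call, so a
larger `lines` value trades memory for fewer, larger prediction calls.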

    I would appreciate testing. You need to have scikit-learn and
    pandas installed in your Python environment, which is easy on Linux
    and OS X; instructions for Windows are provided in the tool.

    I have another add-on that I will upload soon, r.roc, which
    generates ROC and AUROC for prediction models.

    Steve
