<html>

  <head>

    <meta content="text/html; charset=utf-8" http-equiv="Content-Type">

  </head>

  <body smarttemplateinserted="true" text="#000000" bgcolor="#FFFFFF">

    <br>

    <div class="moz-cite-prefix">On 27-03-16 16:58, Steven Pawley wrote:<br>

    </div>

    <blockquote

cite="mid:5B30AE9B1DA9F412.28C3AF84-E344-4225-BC58-ABB5C28FCEDA@mail.outlook.com"

      type="cite">

      <div id="compose" style="padding-left: 16px; padding-right: 16px;

        padding-bottom: 8px;" contenteditable="true">

        <div>Hello Paulo,</div>

        <div><br>

        </div>

        <div>Many thanks for this. I updated the module last night to
          include the ability to force regression mode, as well as
          some more error checking for valid combinations of
          input parameters. Classification mode also checks that the
          input labelled pixels are CELL type. I'm not yet outputting all of
          the appropriate uncertainty measures, like RSQ, for
          regression mode, but I'll add those in.</div>

      </div>

    </blockquote>

    Great, I'll check it out.

    <blockquote

cite="mid:5B30AE9B1DA9F412.28C3AF84-E344-4225-BC58-ABB5C28FCEDA@mail.outlook.com"

      type="cite">

      <div id="compose" style="padding-left: 16px; padding-right: 16px;

        padding-bottom: 8px;" contenteditable="true">

        <div><br>

        </div>

        <div>It is interesting that you had better performance when
          using regression. I will have to check that for my application
          using scikit-learn. In R, using the randomForest package, the
          results were pretty much identical, but my classes were
          already balanced, and I think class imbalance is one factor that can lead to
          significant differences between binary classification
          probabilities and regression.<br>

        </div>

      </div>

    </blockquote>

    It was a study by somebody else; I can't remember which one right
    now, but it will come back to me. But yes, the fact that sampling for
    species distribution modeling is often highly unbalanced (with
    a large number of pseudo-absences) is likely to play a role. <br>
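    To make the imbalance concrete, here is a minimal sketch of the
    "balanced" class weighting heuristic, n_samples / (n_classes * class_count),
    which is the formula scikit-learn documents for class_weight="balanced".
    The function name and the presence/pseudo-absence counts are made up
    for illustration. <br>

```python
from collections import Counter


def balanced_class_weights(labels):
    """Weight each class by n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {cls: n_samples / (n_classes * n) for cls, n in counts.items()}


# Illustrative sample: 100 presences (1) against 900 pseudo-absences (0).
labels = [1] * 100 + [0] * 900
weights = balanced_class_weights(labels)
```

    With these counts the rare presence class gets a weight nine times
    that of the pseudo-absences, so both classes contribute equally in
    total. <br>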

    <blockquote

cite="mid:5B30AE9B1DA9F412.28C3AF84-E344-4225-BC58-ABB5C28FCEDA@mail.outlook.com"

      type="cite">

      <div id="compose" style="padding-left: 16px; padding-right: 16px;

        padding-bottom: 8px;" contenteditable="true">

        <div><br>

          Yes, I will definitely use this as a template to include other
          methods. I only recently switched my work from R to Python, but
          I am just submitting a paper based on R that uses a range of
          classifiers such as randomForest, GLM, GAM, and MARS, for which it was
          useful to evaluate the differences.</div>

      </div>

    </blockquote>

    It sometimes seems there are almost as many different conclusions
    about the best method as there are publications (OK, I might be
    exaggerating a bit here), so comparing different models is very
    useful. So I am very glad you are doing this (as I said, I have looked at
    scipy before and how it could be implemented in GRASS, but my Python
    skills are just not up to it). <br>

    <blockquote

cite="mid:5B30AE9B1DA9F412.28C3AF84-E344-4225-BC58-ABB5C28FCEDA@mail.outlook.com"

      type="cite">

      <div id="compose" style="padding-left: 16px; padding-right: 16px;

        padding-bottom: 8px;" contenteditable="true">

        <div><br>

        </div>

        <div>Steve<br>

          <br>

        </div>

      </div>

      <div class="gmail_quote">_____________________________<br>

        From: Paulo van Breugel <<a moz-do-not-send="true" dir="ltr"

          href="mailto:p.vanbreugel@gmail.com"

          x-apple-data-detectors="true"

          x-apple-data-detectors-type="link"

          x-apple-data-detectors-result="0">p.vanbreugel@gmail.com</a>><br>

        Sent: Sunday, March 27, 2016 3:11 AM<br>

        Subject: Re: [GRASS-dev] RandomForest classifier for imagery

        groups add-on<br>

        To: Vaclav Petras <<a moz-do-not-send="true" dir="ltr"

          href="mailto:wenzeslaus@gmail.com"

          x-apple-data-detectors="true"

          x-apple-data-detectors-type="link"

          x-apple-data-detectors-result="2">wenzeslaus@gmail.com</a>>,

        Steven Pawley <<a moz-do-not-send="true" dir="ltr"

          href="mailto:dr.stevenpawley@gmail.com"

          x-apple-data-detectors="true"

          x-apple-data-detectors-type="link"

          x-apple-data-detectors-result="3">dr.stevenpawley@gmail.com</a>><br>

        Cc: <<a moz-do-not-send="true" dir="ltr"

          href="mailto:grass-dev@lists.osgeo.org"

          x-apple-data-detectors="true"

          x-apple-data-detectors-type="link"

          x-apple-data-detectors-result="4">grass-dev@lists.osgeo.org</a>><br>

        <br>

        <br>


        Hi Steve <br>

        <br>

        Yes, your use case will not differ methodologically from
        species modeling based on presence/absence. One reason I was
        asking for the regression randomForest is that in one article
        (can't remember the title, will look it up) it was found that
        the regression approach yielded better results, even though the
        response variable is binary. On your help page, you write that
        r.randomforest performs random forest classification and
        regression, and that the regression mode can be used by setting the
        mode to the regression option. But I am not seeing that option?

        <br>

        <br>

        Great that you are planning other methods as well. Given model
        uncertainties (quite an issue in species distribution modeling),
        having multiple methods is really a plus, especially as it
        allows one to build consensus models [1] and combine them to
        create uncertainty maps. <br>
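        As a minimal sketch of that idea, assuming each model yields a
        predicted probability per raster cell: the consensus is taken here
        as the unweighted mean across models (one of several possible
        consensus methods; weighted means and medians are alternatives) and
        the uncertainty as the standard deviation across models. The model
        names and values are invented for illustration. <br>

```python
from statistics import mean, pstdev

# Hypothetical per-cell probabilities from three models for four cells.
predictions = {
    "random_forest": [0.9, 0.2, 0.6, 0.1],
    "glm": [0.8, 0.3, 0.2, 0.1],
    "gam": [0.7, 0.1, 0.7, 0.1],
}


def consensus_and_uncertainty(model_predictions):
    """Return per-cell mean and population standard deviation across models."""
    # Regroup the values so each tuple holds all model outputs for one cell.
    per_cell = list(zip(*model_predictions.values()))
    consensus = [mean(values) for values in per_cell]
    uncertainty = [pstdev(values) for values in per_cell]
    return consensus, uncertainty


consensus, uncertainty = consensus_and_uncertainty(predictions)
```

        Cells where the models agree (the last one here) get an uncertainty
        near zero, while cells where they disagree (the third one) stand
        out in the uncertainty map. <br>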

        <br>

        Cheers, <br>

        <br>

        Paulo <br>

        <br>

        [1] Marmion, M., Parviainen, M., Luoto, M., Heikkinen, R.K.,

        & Thuiller, W. 2009. Evaluation of consensus methods in

        predictive species distribution modelling. <i>Diversity and

          Distributions</i> 15: 59–69. <br>

        <br>


        <br>

        <div class="moz-cite-prefix"> On 27-03-16 00:47, Steven Pawley

          wrote: <br>

        </div>

        <blockquote>

          <div id="compose" style="padding-left: 20px; padding-right:

            20px; padding-bottom: 8px;">

            <div> Hi Vaclav and Paulo, </div>

            <div> <br>

            </div>

            <div> Thanks for those pointers re. the lazy import technique and
              documentation. I have a RandomForest diagram to explain
              the process, as well as some examples, so I'll update the
              documentation next week. </div>

            <div> <br>

            </div>

            <div> Paulo, thanks for running a few tests. It looks like there
              is an error with the class_weight parameter; I'll look
              into that. </div>

            <div> <br>

            </div>

            <div> In terms of species distribution modelling, I have

              been using the tool for landslide susceptibility

              modelling, which I believe is methodologically similar to

              SDM in terms of having a binary response variable. I have

              been doing this for the area of Alberta, using an 8000 x

              14000 pixel and 17 band stack of predictors. In the case

              of a binary response variable, the usual approach is to

              run random forest in classification mode, i.e. with fully

              grown trees, but use the class probabilities to represent

              the 'species' or 'landslide' index. </div>
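            A minimal sketch of that index, assuming each tree casts a 0/1
            vote per pixel and the class probability is the fraction of
            trees voting for the positive class. This holds for a forest of
            fully grown trees with pure leaves; scikit-learn itself averages
            the per-tree class probabilities, which reduces to the same
            thing in that case. The function name and the votes below are
            invented for illustration. <br>

```python
def positive_class_index(tree_votes):
    """Fraction of trees that voted for the positive class (1) at one pixel."""
    return sum(tree_votes) / len(tree_votes)


# Hypothetical votes from a five-tree forest for three pixels.
votes_per_pixel = [
    [1, 1, 1, 1, 0],
    [0, 0, 1, 0, 0],
    [1, 0, 1, 0, 1],
]
index = [positive_class_index(votes) for votes in votes_per_pixel]

# A hard classification would simply threshold the index, here at 0.5.
labels = [1 if value >= 0.5 else 0 for value in index]
```

            Keeping the continuous index rather than the thresholded labels
            is what makes the output usable as a susceptibility or
            suitability map. <br>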

            <div> <br>

            </div>

            <div> I am planning to implement other methods from the scikit-learn
              package, which represents a trivial change to the
              module once the bugs are ironed out. I will probably look
              to create modules for SVM and logistic regression, and
              maybe nearest neighbours classification. I am certainly open
              to any suggestions. </div>

            <div> <br>

            </div>

            <div> Steve </div>

          </div>

          <div class="gmail_quote"> _____________________________ <br>

            From: Vaclav Petras < <a moz-do-not-send="true"

              dir="ltr" href="mailto:wenzeslaus@gmail.com">wenzeslaus@gmail.com</a>>

            <br>

            Sent: Saturday, March 26, 2016 11:21 AM <br>

            Subject: Re: [GRASS-dev] RandomForest classifier for imagery

            groups add-on <br>

            To: Steven Pawley < <a moz-do-not-send="true" dir="ltr"

              href="mailto:dr.stevenpawley@gmail.com">dr.stevenpawley@gmail.com</a>>

            <br>

            Cc: < <a moz-do-not-send="true" dir="ltr"

              href="mailto:grass-dev@lists.osgeo.org">grass-dev@lists.osgeo.org</a>>

            <br>

            <br>

            <br>

            <div dir="ltr">

              <div class="gmail_extra"> <br>

                <div class="gmail_quote"> On Sat, Mar 26, 2016 at 12:40

                  PM, Steven Pawley <span dir="ltr"><<a
                      moz-do-not-send="true"
                      class="moz-txt-link-abbreviated"
                      href="mailto:dr.stevenpawley@gmail.com">dr.stevenpawley@gmail.com</a>></span>

                  wrote: <br>

                  <blockquote class="gmail_quote" style="margin:0px 0px

                    0px 0.8ex;border-left:1px solid

                    rgb(204,204,204);padding-left:1ex"> I would like to

                    draw your attention to a new GRASS add-on,

                    r.randomforest, which uses the scikit-learn and

                    pandas Python packages to classify GRASS rasters. </blockquote>

                </div>

                <br>

              </div>

              <div class="gmail_extra"> Thanks, this looks good. Please
                consider adding an image to the documentation to better
                promote the module [1] and also an example which would
                work with the NC SPM dataset [2]. For the addon to
                generate documentation on the server and work well on
                a few other special occasions, it is advantageous to
                employ the lazy import technique for the non-standard
                dependencies; see for example <a moz-do-not-send="true"
                  href="http://v.class.ml">v.class.ml</a> and
                v.class.mlpy [3]. <br>

                <br>

              </div>
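              A minimal sketch of the lazy import technique, assuming a
              Python addon whose non-standard dependency is scikit-learn.
              The helper name lazy_import and the error message are
              hypothetical and not taken from v.class.mlpy or the GRASS
              API; the point is only that the import happens inside a
              function called at run time, so the script can still be
              loaded (for example, to generate documentation) on a machine
              without the dependency. <br>

```python
import importlib


def lazy_import(module_name):
    """Import a non-standard dependency only when it is actually needed.

    Hypothetical helper: returns the module, or raises a RuntimeError with
    a readable message if the package is not installed.
    """
    try:
        return importlib.import_module(module_name)
    except ImportError:
        raise RuntimeError(
            "Python package '%s' is required to run this module" % module_name
        )


def main():
    # The dependency is resolved here, at run time, not at the top of the
    # file, so loading the script itself never needs the package.
    ensemble = lazy_import("sklearn.ensemble")
    return ensemble.RandomForestClassifier
```

              In a real addon the error would more likely be reported
              through the GRASS messaging functions than raised as a plain
              exception. <br>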

              <div class="gmail_extra"> Vaclav <br>

              </div>

              <div class="gmail_extra"> <br>

                [1] <a moz-do-not-send="true"

                  href="https://trac.osgeo.org/grass/wiki/Submitting/Docs#Images">https://trac.osgeo.org/grass/wiki/Submitting/Docs#Images</a>

                <br>

                [2] <a moz-do-not-send="true"

                  href="https://grass.osgeo.org/download/sample-data/">https://grass.osgeo.org/download/sample-data/</a>

                <br>

                [3] <a moz-do-not-send="true"

                  href="https://trac.osgeo.org/grass/changeset/66482/">https://trac.osgeo.org/grass/changeset/66482/</a>

                <br>

              </div>

            </div>

            <br>

            <br>

          </div>

        </blockquote>

        <br>

        <br>

        <br>

      </div>

    </blockquote>

    <br>

  </body>

</html>