[GRASS-SVN] r67104 - in grass-addons/grass7/vector: . v.mrmr

svn_grass at osgeo.org
Sun Dec 13 11:22:27 PST 2015


Author: spawley
Date: 2015-12-13 11:22:27 -0800 (Sun, 13 Dec 2015)
New Revision: 67104

Added:
   grass-addons/grass7/vector/v.mrmr/
   grass-addons/grass7/vector/v.mrmr/Makefile
   grass-addons/grass7/vector/v.mrmr/v.mrmr.html
   grass-addons/grass7/vector/v.mrmr/v.mrmr.py
Log:


Added: grass-addons/grass7/vector/v.mrmr/Makefile
===================================================================
--- grass-addons/grass7/vector/v.mrmr/Makefile	                        (rev 0)
+++ grass-addons/grass7/vector/v.mrmr/Makefile	2015-12-13 19:22:27 UTC (rev 67104)
@@ -0,0 +1,7 @@
+MODULE_TOPDIR = ../..
+
+PGM = v.mrmr
+
+include $(MODULE_TOPDIR)/include/Make/Script.make
+
+default: script


Property changes on: grass-addons/grass7/vector/v.mrmr/Makefile
___________________________________________________________________
Added: svn:eol-style
   + native

Added: grass-addons/grass7/vector/v.mrmr/v.mrmr.html
===================================================================
--- grass-addons/grass7/vector/v.mrmr/v.mrmr.html	                        (rev 0)
+++ grass-addons/grass7/vector/v.mrmr/v.mrmr.html	2015-12-13 19:22:27 UTC (rev 67104)
@@ -0,0 +1,42 @@
+<h2>NAME</h2>
+
+<em><b>v.mrmr</b></em> is a simple GUI for exporting data to the Minimum Redundancy Maximum Relevance (mRMR) feature selection command-line tool (Peng et al., 2005).
+
+<h2>PARAMETERS</h2>
+
+<b>table</b>=<i>name</i> <b>[required]</b>
+<dd>Name of input vector map</dd>
+
+<b>layer</b>=<i>integer</i> <b>[required]</b>
+<dd>Layer number of attribute table to be used in the feature selection</dd>
+
+<b>threshold</b>=<i>double</i>
+<dd>Discretization threshold if attribute table contains continuous data</dd>
+
+<b>nfeatures</b>=<i>integer</i> <b>[required]</b>
+<dd>Number of features to be selected</dd>
+
+<b>nsamples</b>=<i>integer</i> <b>[required]</b>
+<dd>Number of samples to be used in the feature selection</dd>
+
+<b>maxvar</b>=<i>integer</i> <b>[required]</b>
+<dd>Number of attributes to be used in the feature selection</dd>
+
+<h2>DESCRIPTION</h2>
+mRMR is designed to select features that have the maximal statistical "dependency" on the classification variable, while simultaneously minimizing the redundancy among the selected features. 
+
+<br><br> The command-line tool must be installed separately, in a location that is on the system PATH. Binaries are available for Windows; on Linux and OS X the tool needs to be compiled. Installation instructions are provided on <a href="http://penglab.janelia.org/proj/mRMR/">Peng's website</a>.
+
+<br><br> The module requires the data in the vector attribute table to be arranged in a specific order. The classification variable (i.e., the class labels) needs to be in the first column, not counting the cat attribute, which is not exported. The class labels also need to be in numerical form, i.e., 1, 2, 3, ... rather than 'forest' or 'urban'.
+
+<br><br>The algorithm outputs a tab-separated list of attributes, ranked with the most important feature first. The <i>method</i> parameter allows a choice between the Mutual Information Difference (MID) and Mutual Information Quotient (MIQ) feature evaluation criteria, which combine the relevance and the redundancy of a feature as a difference and as a quotient, respectively. The algorithm also shows the ranking of the features based on the conventional maximum relevance method. Additional user options include <i>nfeatures</i>, which specifies the number of features to select; <i>nsamples</i>, which limits the number of samples on which the feature selection is based; and <i>maxvar</i>, which limits the number of attributes. The latter two can reduce the computation for very large datasets. <i>threshold</i> is the discretization threshold applied to continuous variables, i.e., mean +/- threshold * standard deviation. <i>layer</i> is the attribute layer to be used in the feature selection process.
+
+<h2>EXAMPLE</h2>
+v.mrmr.py table=vector_layer layer=1 threshold=1.0 nfeatures=50 nsamples=10000 maxvar=10000 method=MID
+
+<h2>REFERENCES</h2>
+Peng, H.; Long, F.; Ding, C., "Feature selection based on mutual information criteria of max-dependency, max-relevance, and min-redundancy," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 27, no. 8, pp. 1226-1238, Aug. 2005.
+
+<h2>AUTHOR</h2>
+Steven Pawley
+<br><i>Last changed: Saturday 12 December 2015</i>
\ No newline at end of file
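
The threshold rule described in the manual text above (mean +/- threshold * standard deviation) can be illustrated with a minimal sketch. This is not code from the mrmr tool: the three-state coding (-1, 0, 1), the use of the population standard deviation, and the function name `discretize` are assumptions made only for illustration.

```python
import statistics


def discretize(values, threshold=1.0):
    """Map continuous values to -1, 0 or 1 around mean +/- threshold * std.

    Hypothetical helper: the three-state coding is an assumption, not
    taken from the mrmr source.
    """
    mean = statistics.mean(values)
    std = statistics.pstdev(values)  # population standard deviation
    lower = mean - threshold * std
    upper = mean + threshold * std
    states = []
    for v in values:
        if v < lower:
            states.append(-1)
        elif v > upper:
            states.append(1)
        else:
            states.append(0)
    return states
```

A smaller threshold pushes more values into the outer states; a larger one keeps most values in the middle state.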


Property changes on: grass-addons/grass7/vector/v.mrmr/v.mrmr.html
___________________________________________________________________
Added: svn:eol-style
   + native
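
To make the difference between the MID and MIQ criteria concrete, here is a small pure-Python sketch of a greedy mRMR ranking on already-discretized data. It is an illustration under stated assumptions, not the implementation used by the mrmr tool: the function names, the dictionary-based input, and the small constant that guards against division by zero in MIQ are all hypothetical.

```python
from collections import Counter
from math import log2


def mutual_information(x, y):
    """Mutual information (in bits) between two discrete sequences."""
    n = len(x)
    px, py, pxy = Counter(x), Counter(y), Counter(zip(x, y))
    return sum((c / n) * log2(c * n / (px[a] * py[b]))
               for (a, b), c in pxy.items())


def mrmr_rank(features, labels, nfeatures, method="MID"):
    """Greedy ranking; features maps attribute name -> discrete values."""
    relevance = {name: mutual_information(col, labels)
                 for name, col in features.items()}
    # Start with the single most relevant feature.
    selected = [max(relevance, key=relevance.get)]
    while len(selected) < min(nfeatures, len(features)):
        best_name, best_score = None, None
        for name, col in features.items():
            if name in selected:
                continue
            # Mean mutual information with the features already chosen.
            redundancy = sum(mutual_information(col, features[s])
                             for s in selected) / len(selected)
            if method == "MID":
                score = relevance[name] - redundancy
            else:  # MIQ; the epsilon is an assumption to avoid dividing by 0
                score = relevance[name] / (redundancy + 1e-12)
            if best_score is None or score > best_score:
                best_name, best_score = name, score
        selected.append(best_name)
    return selected
```

For example, with `labels = [0, 1, 2, 3]` and features `a = [0, 0, 1, 1]`, `b = [0, 0, 1, 1]`, `c = [0, 1, 0, 1]`, the duplicate `b` is ranked last by both criteria, because it is fully redundant with `a` while `c` adds new information.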

Added: grass-addons/grass7/vector/v.mrmr/v.mrmr.py
===================================================================
--- grass-addons/grass7/vector/v.mrmr/v.mrmr.py	                        (rev 0)
+++ grass-addons/grass7/vector/v.mrmr/v.mrmr.py	2015-12-13 19:22:27 UTC (rev 67104)
@@ -0,0 +1,121 @@
+#!/usr/bin/env python
+#
+##############################################################################
+#
+# MODULE:       Minimum Redundancy Maximum Relevance Feature Selection
+#
+# AUTHOR(S):    Steven Pawley
+#
+##############################################################################
+#%module
+#% description: Perform Minimum Redundancy Maximum Relevance Feature Selection on a GRASS Attribute Table 
+#%end
+
+#%option G_OPT_V_INPUT
+#% description: Vector features
+#% key: table
+#% required : yes
+#%end
+
+#%option G_OPT_V_FIELD
+#% key: layer
+#% required : yes
+#%end
+
+#%option
+#% description: Discretization threshold
+#% key: threshold
+#% type: double
+#% answer: 1.0
+#% required : no
+#% guisection: Options
+#%end
+
+#%option
+#% description: Number of features (attributes)
+#% key: nfeatures
+#% type: integer
+#% answer: 50
+#% required : yes
+#% guisection: Options
+#%end
+
+#%option
+#% description: Maximum number of samples
+#% key: nsamples
+#% type: integer
+#% answer: 1000
+#% required : yes
+#% guisection: Options
+#%end
+
+#%option
+#% description: Maximum number of variables/attributes
+#% key: maxvar
+#% type: integer
+#% answer: 10000
+#% required : yes
+#% guisection: Options
+#%end
+
+#%option
+#% description: Feature selection method
+#% key: method
+#% type: string
+#% options: MID,MIQ
+#% answer: MID
+#% required : yes
+#% guisection: Options
+#%end
+
+import atexit
+import os
+import shutil
+import subprocess
+import sys
+import tempfile
+
+import grass.script as grass
+
+# env = grass.gisenv()
+# gisdbase = env['GISDBASE']
+# location = env['LOCATION_NAME']
+# mapset = env['MAPSET']
+# path = os.path.join(gisdbase, location, mapset, 'sqlite.db')
+
+tmpdir = tempfile.mkdtemp()
+tmptable = "mrmrdat.csv"
+
+def cleanup():
+    shutil.rmtree(tmpdir)
+    return 0
+
+def main():
+    table = options['table']
+    layer = options['layer']
+    threshold = options['threshold']
+    nfeatures = options['nfeatures']
+    maxvar = options['maxvar']
+    nsamples = options['nsamples']
+    method = options['method']
+
+    os.chdir(tmpdir)
+    # Export the attribute table to CSV; the 's' flag skips the 'cat' column.
+    grass.run_command("v.out.ogr",
+                      input=table,
+                      layer=layer,
+                      type='auto',
+                      output=os.path.join(tmpdir, tmptable),
+                      format='CSV',
+                      flags='s')
+
+    # Pass the arguments as a list so that no shell quoting is needed.
+    mrmrcmd = ['mrmr', '-i', tmptable, '-m', method, '-t', threshold,
+               '-n', nfeatures, '-s', nsamples, '-v', maxvar]
+
+    return subprocess.call(mrmrcmd)
+
+if __name__ == "__main__":
+    options, flags = grass.parser()
+    atexit.register(cleanup)
+    sys.exit(main())


Property changes on: grass-addons/grass7/vector/v.mrmr/v.mrmr.py
___________________________________________________________________
Added: svn:executable
   + *
Added: svn:eol-style
   + native
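
As a side note on how the script calls the external tool, the following sketch builds the mrmr argument list separately from running it, and checks whether the executable is on the PATH first. It assumes Python 3 (for `shutil.which`); the helper names `build_mrmr_command` and `mrmr_available` are hypothetical and are not part of the commit. The flags (-i, -m, -t, -n, -s, -v) and the defaults are the ones that appear in the script above.

```python
import shutil


def build_mrmr_command(csv_name, method="MID", threshold="1.0",
                       nfeatures="50", nsamples="1000", maxvar="10000"):
    """Return the mrmr invocation as an argument list (hypothetical helper)."""
    if method not in ("MID", "MIQ"):
        raise ValueError("method must be 'MID' or 'MIQ'")
    # str() keeps the list usable whether values arrive as numbers or strings.
    return ["mrmr", "-i", csv_name, "-m", method,
            "-t", str(threshold), "-n", str(nfeatures),
            "-s", str(nsamples), "-v", str(maxvar)]


def mrmr_available(executable="mrmr"):
    """True if the executable can be found on the PATH."""
    return shutil.which(executable) is not None
```

The returned list can be handed directly to `subprocess.call`, which avoids building a shell string from user-supplied option values.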


