[GRASS-SVN] r66859 - in grass-addons/grass7/raster: . r.mregression.series
svn_grass at osgeo.org
svn_grass at osgeo.org
Wed Nov 18 07:10:39 PST 2015
Author: DmitryKolesov
Date: 2015-11-18 07:10:39 -0800 (Wed, 18 Nov 2015)
New Revision: 66859
Added:
grass-addons/grass7/raster/r.mregression.series/
grass-addons/grass7/raster/r.mregression.series/Makefile
grass-addons/grass7/raster/r.mregression.series/r.mregression.series.html
grass-addons/grass7/raster/r.mregression.series/r.mregression.series.py
Log:
Add r.mregression.series module
Added: grass-addons/grass7/raster/r.mregression.series/Makefile
===================================================================
--- grass-addons/grass7/raster/r.mregression.series/Makefile (rev 0)
+++ grass-addons/grass7/raster/r.mregression.series/Makefile 2015-11-18 15:10:39 UTC (rev 66859)
@@ -0,0 +1,7 @@
+MODULE_TOPDIR =../..
+
+PGM = r.mregression.series
+
+include $(MODULE_TOPDIR)/include/Make/Script.make
+
+default: script
Added: grass-addons/grass7/raster/r.mregression.series/r.mregression.series.html
===================================================================
--- grass-addons/grass7/raster/r.mregression.series/r.mregression.series.html (rev 0)
+++ grass-addons/grass7/raster/r.mregression.series/r.mregression.series.html 2015-11-18 15:10:39 UTC (rev 66859)
@@ -0,0 +1,101 @@
+<h2>DESCRIPTION</h2>
+
+<em>r.mregression.series</em> is a module to calculate multiple
+linear regression parameters between several time series, e.g. NDVI and
+elevation, precipitation.
+<p>
+The module makes each output cell value a function of the values
+assigned to the corresponding cells in the input raster map series.
+
+<h2>DESCRIPTION</h2>
+
+The module assumes a simple linear regression of the form
+<div class="code"><pre>
+ Y = b1 * X1 + b2 * X2 + ... + bn * Xn
+</pre></div>
+<p>
+
+<h2>NOTES</h2>
+
+The number of maps in <em>X</em> and <em>Y</em> must be
+identical.
+<p>
+The list of inputs for each cell (including NULLs) is passed to the
+regression function. The functin compute the parameters over the
+non-NULL values, producing a NULL result only if there aren't enough
+non-NULL values for computing.
+<p>
+Linear regression requires an equal number of <em>X</em> and
+<em>Y</em> maps. If the different time series have irregular
+time intervals, NULL raster maps can be inserted into time series to
+make time intervals equal.
+
+<h2>EXAMPLES</h2>
+The most important paramether is <em>samples</em>, it provides the list
+of <em>X</em> and <em>Y</em> maps. The parameter is the name of csv file
+of the next structure: the first line is a header, other lines provide names
+of the <em>X</em> and <em>Y</em> maps. The header contains the names of the
+input and output variables.
+<p>
+For example the csv file for regression between NDVI and (elevation,
+precipitation)
+<div class="code"><pre>
+ NDVI = b1*Elevation + b2*Precipitation
+</pre></div>
+could be the next file:
+<div class="code"><pre>
+y,elevation,precipipation
+ndvi_1,elev_1,precip_1
+ndvi_2,elev_2,precip_2
+...
+ndvi_n,elev_n,precip_n
+</pre></div>
+"ndvi_t" are names of the NDVI rasters, "precip_t" are names
+of precipitation rasters. The names of the first and the
+second variables are "elevation" and "precipitation" accordingly.
+<p>
+The second paramether is <em>result_prefix</em>. It is used for
+coefficient names construction. For example if result_prefix="coef.",
+the names of the regression coefficient names are "coef.elevation"
+and "coef.precipitation".
+
+<div class="code"><pre>
+r.mregression.series samples=settings result_prefix="coef."
+</pre></div>
+
+<p>
+If the regression model includes the intercept
+<div class="code"><pre>
+ NDVI = b0 + b1*Elevation + b2*Precipitation
+</pre></div>
+then the constant map should be used:
+<div class="code"><pre>
+r.mapcalc "ones = 1.0"
+</pre></div>
+and the csv file is:
+<div class="code"><pre>
+y,offset,elevation,precipipation
+ndvi_1,ones,elev_1,precip_1
+ndvi_2,ones,elev_2,precip_2
+...
+ndvi_n,ones,elev_n,precip_n
+</pre></div>
+Then the command
+<div class="code"><pre>
+r.mregression.series samples=settings result_prefix="coef."
+</pre></div>
+produces three raster maps: "coef.offset", "coef.elevation", "coef.precipitation".
+
+<h2>SEE ALSO</h2>
+
+<em><a href="addons/r.regression.series.html">r.regression.series</a></em>,
+<em><a href="r.series.html">r.series</a></em>,
+<em><a href="r.regression.line.html">r.regression.line</a></em>,
+<em><a href="g.list.html">g.list</a></em>,
+<em><a href="g.region.html">g.region</a></em>
+
+<h2>AUTHOR</h2>
+
+Dmitry Kolesov
+
+<p><i>Last changed: $Date: 2014-10-29 02:11:08 +0300 (Ср., 29 окт. 2014) $</i>
Added: grass-addons/grass7/raster/r.mregression.series/r.mregression.series.py
===================================================================
--- grass-addons/grass7/raster/r.mregression.series/r.mregression.series.py (rev 0)
+++ grass-addons/grass7/raster/r.mregression.series/r.mregression.series.py 2015-11-18 15:10:39 UTC (rev 66859)
@@ -0,0 +1,265 @@
+#!/usr/bin/env python
+# -*- coding: utf-8 -*-
+#
+############################################################################
+#
+# MODULE: r.mregression.series
+#
+# AUTHOR(S): Dmitry Kolesov <kolesov.dm at gmail.com>
+#
+# PURPOSE: multiple regression between two time series
+#
+# COPYRIGHT: (C) 2015 by the GRASS Development Team
+#
+# This program is free software under the GNU General Public
+# License (>=v2). Read the file COPYING that comes with GRASS
+# for details.
+#
+#############################################################################
+
+#%Module
+#% description: Program calculates multiple regression between two series: Y = a1*X1 + ... + an*Xn
+#% overwrite: yes
+#%End
+#%option
+#% key: samples
+#% type: string
+#% gisprompt: file with settings in csv format
+#% description: File contains list of input and output rasters
+#% required : yes
+#% multiple: no
+#%end
+#%option
+#% key: result_prefix
+#% type: string
+#% gisprompt: prefix for names of result rasters
+#% description: Prefix for names of result raster (rasters of regression coefficients)
+#% required : yes
+#% multiple: no
+#%end
+
+
+import os
+import sys
+
+import csv
+import numpy as np
+from numpy.linalg.linalg import LinAlgError
+
+
+if "GISBASE" not in os.environ:
+ sys.stderr.write("You must be in GRASS GIS to run this program.\n")
+ sys.exit(1)
+
+import grass.script as grass
+from grass.pygrass import raster
+from grass.pygrass.gis.region import Region
+
+try:
+ import statsmodels.api as sm
+except ImportError:
+ grass.error("Can't import statsmodels. Install scipy package.")
+ sys.exit(1)
+
+
+CNULL = -2147483648 # null value for CELL maps
+FNULL = np.nan # null value for FCELL and DCELL maps
+
+
+def get_val_or_null(map, row, col):
+ """
+ Return map value of the cell or FNULL (if the cell is null)
+ """
+ value = map.get(row, col)
+ if map.mtype == "CELL" and value == CNULL:
+ value = FNULL
+ return value
+
+
+def ols(y, x):
+ """Ordinary least squares.
+
+ :param x: MxN matrix of data points
+ :param x: numpy.array
+ :param y: vector of M output values
+ :param y: numpy.array
+ :return: vector b of coefficients (x * b = y)
+ """
+ # if a X sample has nan value,
+ # then the fitting procedure crashes.
+ # Delete no-data samples from the data
+ factor_count = x.shape[1]
+ nan_idx = np.logical_or(np.isnan(y), np.isnan(x).any(axis=1))
+ if nan_idx.any():
+ y = y[~nan_idx]
+ x = x[~nan_idx]
+ if x.shape[0] < x.shape[1]:
+ # The system can't be solved
+ return [FNULL for i in range(factor_count)]
+
+ model = sm.OLS(y, x)
+ try:
+ results = model.fit()
+ coefs = results.params
+ except LinAlgError:
+ coefs = [FNULL for i in range(factor_count)]
+ return coefs
+
+
+def get_sample_names(filename, delimiter=','):
+ """
+ Analyse settings file, returns
+ """
+ with open(filename) as settings:
+ reader = csv.reader(settings, delimiter=delimiter)
+ headers = reader.next()
+ inputs = []
+ outputs = []
+ for row in reader:
+ outputs.append(row[0])
+ inputs.append(row[1:])
+
+ return headers, outputs, inputs
+
+
+class DataModel(object):
+ def __init__(self, headers, Y, X, prefix, restype='FCELL'):
+ """Linear Least Square model Y = X * b + e
+ X are [[],.., []] of raster names
+ Y are [] of raster names
+
+ b are rasters of the regression coeficients, the rasters
+ have type restype
+ e are the errors of the model (the class doesn't compute e)
+
+ header is [] of the variable names.
+ names of the 'b' rasters are constructed as
+ prefix+variable_name (see header)
+ """
+ if len(set(headers)) != len(headers):
+ grass.error('The names of the variables are not unique!')
+
+ self.mtype = restype
+
+ self.x_headers = headers[1:] # Names of the coefficient
+ self.b_names = [prefix + name for name in self.x_headers]
+ self.y_names = Y # Names of Y rasters
+ self.x_names = X # Names of X rasters
+
+ self.sample_count = len(self.y_names)
+ self.factor_count = len(self.x_names[0])
+
+ self._y_rasters = []
+ self._x_rasters = []
+ self._b_rasters = []
+ self._init_rasters()
+
+ def x(self, s, f):
+ """Return input map (X) from sample number s and factor number f.
+ """
+ return self._x_rasters[s][f]
+
+ def y(self, s):
+ """Return output map (Y) from sample number s.
+ """
+ return self._y_rasters[s]
+
+ def b(self, s):
+ return self._b_rasters[s]
+
+ def _init_rasters(self):
+ for name in self.y_names:
+ map = raster.RasterSegment(name)
+ if not map.exist():
+ raise ValueError("Raster map %s doesn't exist" % (name, ))
+ self._y_rasters.append(map)
+
+ for names in self.x_names:
+ maps = []
+ for name in names:
+ map = raster.RasterSegment(name)
+ if not map.exist():
+ raise ValueError("Raster map %s doesn't exist" % (name, ))
+ maps.append(map)
+ # Check count of X samples
+ assert len(maps) == self.factor_count
+
+ self._x_rasters.append(maps)
+
+ # Rasters of the regression coefitients
+ for i in range(self.factor_count):
+ name = self.b_names[i]
+ map = raster.RasterSegment(name)
+ self._b_rasters.append(map)
+
+ def open_rasters(self, overwrite):
+ for i in range(self.sample_count):
+ map = self.y(i)
+ map.open()
+
+ for j in range(self.factor_count):
+ map = self.b(j)
+ map.open('w', mtype=self.mtype, overwrite=overwrite)
+ for i in range(self.sample_count):
+ map = self.x(i, j)
+ map.open()
+
+ def close_rasters(self):
+ for i in range(self.sample_count):
+ map = self.y(i)
+ map.close()
+
+ for j in range(self.factor_count):
+ map = self.b(j)
+ map.close()
+ for i in range(self.sample_count):
+ map = self.x(i, j)
+ map.close()
+
+ def get_sample(self, row, col):
+ """Return X and Y matrices for one pixel sample
+ """
+ X = np.empty((self.sample_count, self.factor_count))
+ Y = np.empty(self.sample_count)
+ for snum in range(self.sample_count):
+ y = self.y(snum)
+ Y[snum] = get_val_or_null(y, row, col)
+ for fnum in range(self.factor_count):
+ x = self.x(snum, fnum)
+ X[snum, fnum] = get_val_or_null(x, row, col)
+
+ return Y, X
+
+ def ols(self, overwrite=None):
+ try:
+ reg = Region()
+ self.open_rasters(overwrite=overwrite)
+ rows, cols = reg.rows, reg.cols
+ for r in range(rows):
+ for c in range(cols):
+ Y, X = self.get_sample(r, c)
+ coefs = ols(Y, X)
+ for i in range(self.factor_count):
+ b = self.b(i)
+ b.put(r, c, coefs[i])
+ finally:
+ self.close_rasters()
+ print 'ols'
+
+
+def main(options, flags):
+ samples = options['samples']
+ res_pref = options['result_prefix']
+ if not os.path.isfile(samples):
+ sys.stderr.write("File '%s' doesn't exist.\n" % (samples, ))
+ sys.exit(1)
+
+ headers, outputs, inputs = get_sample_names(samples)
+
+ model = DataModel(headers, outputs, inputs, res_pref)
+ model.ols(overwrite=grass.overwrite())
+ sys.exit(0)
+
+if __name__ == "__main__":
+ options, flags = grass.parser()
+ main(options, flags)
More information about the grass-commit
mailing list