[GRASS-SVN] r55398 - in grass-addons/grass7/raster: . r.edm.eval

svn_grass at osgeo.org svn_grass at osgeo.org
Sat Mar 16 03:29:21 PDT 2013


Author: pvanbosgeo
Date: 2013-03-16 03:29:21 -0700 (Sat, 16 Mar 2013)
New Revision: 55398

Added:
   grass-addons/grass7/raster/r.edm.eval/
   grass-addons/grass7/raster/r.edm.eval/r.edm.eval
   grass-addons/grass7/raster/r.edm.eval/r.edm.eval.html
Log:
Adding an shell script that allows the evaluation of environmental distribution models. The script requires R (it is essentially a wrapper for some R code) and is tested with grass 7.0 on Ubuntu 12.10

Added: grass-addons/grass7/raster/r.edm.eval/r.edm.eval
===================================================================
--- grass-addons/grass7/raster/r.edm.eval/r.edm.eval	                        (rev 0)
+++ grass-addons/grass7/raster/r.edm.eval/r.edm.eval	2013-03-16 10:29:21 UTC (rev 55398)
@@ -0,0 +1,397 @@
+#!/bin/sh
+# 
+#set -x
+########################################################################
+# 
+# MODULE:       r.model.eval
+# AUTHOR(S):    Paulo van Breugel <p.vanbreugel AT gmail.com>
+# PURPOSE:      To evaluates how well a modeled distribution predicts an 
+#               observed distribution. 
+#
+# NOTES:        The observed distribution of e.g., a species, land
+#               cover unit or vegetation unit should be a binary map
+#               with 1 (present) and 0 (absent). The values of the
+#               modeled distribution can be any map that represents a 
+#               probability distribution in space. This could be based 
+#               on X, but it doesn't need to be. You can also, for example,
+#               evaluate how well the modeled distribution of a species X
+#               predicts the distribution of species Y or of land cover
+#               type Y.
+#
+# Disclaimer:   Only limited testing has been done. Use at own risk
+#   
+# COPYRIGHT: (C) 2013 Paulo van Breugel
+#            http://ecodiv.org
+#            http://pvanb.wordpress.com/
+# 
+#            This program is free software under the GNU General Public 
+#            License (>=v2). Read the file COPYING that comes with GRASS 
+#            for details. 
+# 
+########################################################################
+#
+#%Module 
+#% description: Computes multivariate environmental similarity surface
+#%End 
+
+#%option
+#% key: obs
+#% type: string
+#% gisprompt: old,cell,raster
+#% description: Observed distribution
+#% key_desc: Raster name
+#% required: yes
+#% multiple: no
+#%end
+
+#%option
+#% key: mod
+#% type: string
+#% gisprompt: old,cell,raster
+#% description: Modeled distribution
+#% key_desc: Raster name
+#% required: yes
+#% multiple: no
+#%end
+
+#%option
+#% key: fstats
+#% type: string
+#% description: Name of output files (without extension)
+#% key_desc: File name
+#% required: yes
+#% multiple: no
+#%end
+
+#%option
+#% key: n_bins
+#% type: integer
+#% description: Number of bins in which to divide the modeled distribution scores
+#% key_desc: integer
+#% answer: 200
+#% required: no
+#%end
+
+#%option
+#% key: buffer_abs
+#% type: integer
+#% description: Restrict absence area to x km buffer
+#% key_desc: buffer zone (km)
+#% answer: 0
+#% required: no
+#% guisection: evaluation area
+#%end
+
+#%option
+#% key: buffer_pres
+#% type: integer
+#% description: Restrict presence area to x km buffer
+#% key_desc: buffer zone (km)
+#% answer: 0
+#% required: no
+#% guisection: evaluation area
+#%end
+
+#%flag
+#% key: p
+#% description: Save png image of ROC plot
+#% guisection: ROC plot
+#%end
+
+#%flag
+#% key: s
+#% description: Save svg file of ROC plot
+#% guisection: ROC plot
+#%end
+
+#=======================================================================
+## GRASS team recommandations
+#=======================================================================
+
+## Check if in GRASS
+if  [ -z "$GISBASE" ] ; then
+    echo "You must be in GRASS GIS to run this program." 1>&2
+    exit 1
+fi
+
+## check for awk
+if [ ! -x "$(which awk)" ] ; then
+    g.message -e "<awk> required, please install <awk> or <gawk> first"
+    exit 1
+fi
+
+## To parse the code into interactive menu
+if [ "$1" != "@ARGS_PARSED@" ] ; then
+    exec g.parser "$0" "$@"
+fi
+
+## set environment so that awk works properly in all languages ##
+unset LC_ALL
+export LC_NUMERIC=C
+
+
+## what to do in case of user break:
+exitprocedure()
+{
+    echo "User break!"
+    cleanup
+    exit 1
+}
+
+## shell check for user break (signal list: trap -l)
+trap "exitprocedure" 2 3 15
+
+#=======================================================================
+## Config and general procedures
+#=======================================================================
+
+##fix this path
+if [ -z "$PROCESSDIR" ] ; then
+	PROCESSDIR="$HOME"
+fi
+
+#fix this path
+if [ -z "$LOGDIR" ] ; then
+	LOGDIR="$HOME"
+fi
+LOGFILE="$LOGDIR/r_model_eval.log"
+
+echo "r.model.eval :" >> "$LOGFILE"
+
+#=======================================================================
+# test for missing input raster maps
+#=======================================================================
+
+oIFS=$IFS
+IFS=,
+arrIN=${GIS_OPT_OBS} `echo $nvar | awk 'BEGIN{FS="@"}{print $1}'`
+g.findfile element=cell file=${arrIN} > /dev/null
+if [ $? -gt 0 ] ; then 
+    g.message -e 'The output map '${arrIN}' does not exists'
+exit 1
+fi
+unset arrIN
+
+arrIN=${GIS_OPT_MOD} `echo $nvar | awk 'BEGIN{FS="@"}{print $1}'`
+g.findfile element=cell file=${arrIN} > /dev/null
+if [ $? -gt 0 ] ; then 
+    g.message -e 'The output map '${arrIN}' does not exists'
+exit 1
+fi
+IFS=$oIFS
+unset arrIN
+
+#=======================================================================
+## Creating the R script
+#=======================================================================
+
+writeScript(){ 
+cat > $1 << "EOF"
+
+options(echo = TRUE)
+library(spgrass6)
+library(reshape)
+library(caTools)
+
+# Define required functions
+get_output.GRASS <- function(x, separator=",", h=FALSE){
+      con <- textConnection(x)
+      MyVar <- read.table(con, header=h, sep=separator, comment.char="")
+      close(con)
+      return(MyVar)
+}
+
+rand.name <- function(base="tmp", l=6){
+    paste(base,paste(sample(c(0:9, letters, LETTERS), l, replace=TRUE), collapse=""), sep="_")
+}
+
+copy_rast.GRASS <- function(rast, remove.orig=FALSE){
+    require(spgrass6)
+    mpst <- gmeta6()$MAPSET
+    fft <- execGRASS("g.findfile", flags="quiet", element="cell", file=rast, mapset=mpst, Sys_ignore.stdout=TRUE)
+    if(fft==0){
+        rname <- rand.name()
+        execGRASS("g.copy", rast=paste(rast, rname, sep=","))
+        if(remove.orig){execGRASS("g.remove", rast=rast)}
+        return(rname)
+    }else{
+        print("the selected file does not exist")
+    }
+}
+
+# Check grass version
+grassversion <- as.numeric(system("g.version | cut -c7", intern=T))
+
+## Get vector with variables
+args <- commandArgs(trailingOnly=TRUE)
+obsmap <- args[1]
+distmod <- args[2]
+n.bins <- as.numeric(args[3])
+bufabs <- as.numeric(args[4])
+bufpres <- as.numeric(args[5])
+plot.png <- as.numeric(args[6])
+plot.svg <- as.numeric(args[7])
+fnames <- args[8]
+
+# Create buffer map with buffer width of bufabs (outside) or bufpres (inside)
+if(bufabs>0){
+    tmp.a <- "tmp_r_model_evaluation_0987654321a"
+    tmp.b <- "tmp_r_model_evaluation_0987654321b"
+	execGRASS("r.mapcalc", flags="overwrite", expression=paste(tmp.a, obsmap, sep=" = "))
+	execGRASS("r.null", map=tmp.a, setnull="0")
+	execGRASS("r.buffer", input=tmp.a, output=tmp.b, distances=bufabs, units="kilometers")
+	sink("reclass.rules"); cat("1:1:1:1"); cat("\n"); cat("2:2:0:0"); cat("\n"); sink()
+    execGRASS("r.recode", flags="overwrite", input=tmp.b, output=tmp.a, rules="reclass.rules")
+	obsmap <- tmp.a
+	execGRASS("g.remove", rast=tmp.b); unlink("reclass.rules")
+}
+
+if(bufpres>0){
+    tmp.b <- "tmp_r_model_evaluation_0987654321b"
+    tmp.c <- "tmp_r_model_evaluation_0987654321c"
+    execGRASS("r.mapcalc", flags="overwrite", expression=paste(tmp.b," = if(", obsmap,"==1,0,null())", sep=""))
+    execGRASS("r.buffer", flags=c("z","overwrite"), input=tmp.b, output=tmp.c, distances=bufpres, units="kilometers")        
+    execGRASS("r.mapcalc", flags="overwrite", expression=paste(tmp.b, " = if(", tmp.c, "==2,1,", obsmap, ")", sep=""))
+    obsmap <- tmp.b
+    execGRASS("g.remove", rast=tmp.c)
+}
+    
+# Create map with probability scores grouped in n.bins bins
+a <- execGRASS("r.univar", flags="g", map=distmod, separator=";", intern=TRUE)
+maxmin <- get_output.GRASS(a, separator="=")
+stps <- (maxmin[5,2] - maxmin[4,2])/n.bins
+roc.st <- seq(maxmin[4,2], maxmin[5,2], by=stps)
+x1 <- c(0,roc.st[-length(roc.st)])
+x2 <- c(roc.st[1],roc.st[-1])
+y1 <- seq(length(roc.st)-1,0,-1)
+xy1 <- format(cbind(x1,x2,y1), scientific=F, trim=T, nsmall=12)
+xy2 <- apply(xy1, 1, function(x){paste(x, collapse=":")})
+tmp.rule <- tempfile(pattern = "file", tmpdir = tempdir(), fileext = ".rules")
+write.table(xy2, file=tmp.rule, quote=F, row.names=F, col.names=F)
+tmp.d <- "tmp_r_model_evaluation_0987654321d"
+execGRASS("r.recode", flags="overwrite", input=distmod, output=tmp.d, rules=tmp.rule)
+unlink(tmp.rule)
+
+# Get zonal stats
+a <- execGRASS("r.stats", flags=c("c","n"), input=paste(obsmap, tmp.d, sep=","), intern=TRUE)
+a <- get_output.GRASS(a, separator=" ")
+execGRASS("g.remove", rast=tmp.d)
+
+# Clean up 
+if(bufabs>0){execGRASS("g.remove", rast=tmp.a)}
+if(bufabs>0){execGRASS("g.remove", rast=tmp.b)}
+
+# Calculate evaluation stats
+a.cast <- cast(a, V2 ~ V1)
+a.cast[] <- sapply(a.cast, function(x) replace(x,is.na(x),0))
+a.sum <- apply(a.cast,2,sum)
+a.stats <- cbind(a.cast,cumsum(a.cast[,2]),cumsum(a.cast[,3]))[,-c(2,3)]
+names(a.stats) <- c("bin", "FP", "TP")
+a.stats$TN <- a.sum[1]-a.stats$FP
+a.stats$FN <- a.sum[2]-a.stats$TP
+a.stats$TPR <- apply(a.stats,1,function(x){x[3]/(x[3]+x[5])})
+a.stats$FPR <- apply(a.stats,1,function(x){x[2]/(x[2]+x[4])})
+a.stats$TNR <- apply(a.stats,1,function(x){x[4]/(x[2]+x[4])})
+a.stats$spec_sens <- apply(a.stats,1,function(x){x[6]+x[8]})
+ 
+# Calculate evaluation stats
+a.auc <- trapz(a.stats$FPR, a.stats$TPR)
+a.spec.sens <- max(a.stats$spec_sens)
+threshold.bin <- n.bins - a.stats[which.max(a.stats$spec_sens),1]
+a.spec.sens_threshold <- x2[threshold.bin]
+
+PRa <- apply(a.stats,1,function(x){(x[3]+x[4])/sum(x)})
+Yobs <- apply(a.stats,1,function(x){(x[3]+x[5])/sum(x)})
+Ymod <- apply(a.stats,1,function(x){(x[3]+x[2])/sum(x)})
+PRe <- Yobs*Ymod + (1-Yobs)*(1-Ymod)
+a.kappa <- (PRa - PRe) / (1 - PRe)
+a.kappa_max <- max(a.kappa)
+threshold.bin <- n.bins - a.stats[which.max(a.kappa),1]
+a.kappa_threshold <- x2[threshold.bin]
+
+# Save plot as svg image
+if(plot.svg==1){
+    svg.name <- paste(fnames, "svg", sep=".")
+    svg(svg.name, width=6, height=4.8, pointsize=11)
+    par(mar=c(4.5,4.5,1,1))
+    plot(a.stats$TPR ~ a.stats$FPR, xlab="False positive rate", ylab="True positive rate", type='n')
+    abline(v=a.kappa_threshold, col=rgb(0.8,0.5,0,0.3), lty=3, lwd=3)
+    abline(v=a.spec.sens_threshold, col=rgb(0,0.5,0.8, 0.3), lty=3, lwd=3)
+    lines(a.stats$TPR ~ a.stats$FPR, lwd=3, col=rgb(0,0.4,0))
+    lines(x=c(0,1), y=c(0,1), lty=5, col=rgb(0.8,0.8,0.8), lwd=3) 
+    leg.text <- c(paste("AUC:", round(a.auc,2)), paste("Max. kappa:", round(a.kappa_max, 2)), paste("Max. specsens:", round(a.spec.sens,2)))
+    legend(x="bottomright", lty=c(1,3,3), col=c(rgb(0,0.4,0), rgb(0.8,0.5,0,0.5), rgb(0,0.5,0.8, 0.5)), legend=leg.text, inset=0.05, bty="n", cex=0.8, seg.len=3, lwd=3) 
+    dev.off()
+ }
+
+# Save plot as png image
+ if(plot.png==1){
+    png.name <- paste(fnames, "png", sep=".")
+    png(png.name, width=960, height=880, pointsize=28)
+    par(mar=c(3.5,3.5,1,1), mgp=c(2,0.5,0))
+    plot(a.stats$TPR ~ a.stats$FPR, xlab="False positive rate", ylab="True positive rate", type='n', cex.axis=0.9)
+    abline(v=a.kappa_threshold, col=rgb(0.8,0.5,0,0.3), lty=3, lwd=3)
+    abline(v=a.spec.sens_threshold, col=rgb(0,0.5,0.8, 0.3), lty=3, lwd=3)
+    lines(a.stats$TPR ~ a.stats$FPR, lwd=3, col=rgb(0,0.4,0))
+    lines(x=c(0,1), y=c(0,1), lty=5, col=rgb(0.8,0.8,0.8), lwd=3) 
+    leg.text <- c(paste("AUC:", round(a.auc,2)), paste("Max. Kappa:", round(a.kappa_max, 2)), paste("Max. SpecSens:", round(a.spec.sens,2)))
+    legend(x="bottomright", lty=c(1,3,3), col=c(rgb(0,0.4,0), rgb(0.8,0.5,0,0.5), rgb(0,0.5,0.8, 0.5)), legend=leg.text, inset=0.05, bty="n", cex=0.8, seg.len=3, lwd=3)
+    dev.off()
+}
+
+sink(paste(fnames, "txt", sep="."))
+cat(paste("AUC", as.character(a.auc), sep="=")); cat("\n") 
+cat(paste("spec.sens", a.spec.sens, sep="=")); cat("\n")
+cat(paste("max spec.sens threshold", a.spec.sens_threshold, sep="=")); cat("\n") 
+cat(paste("max kappa", a.kappa_max, sep="=")); cat("\n") 
+cat(paste("max kappa threshold", a.kappa_threshold, sep="=")); cat("\n") 
+cat(paste("stats txt file=", paste(fnames, "txt", sep="."), sep=" ")); cat("\n")
+if(plot.png==1){
+	cat(paste("png file with ROC curve", png.name, sep="=")); cat("\n")
+}else{cat("png file with ROC curve=no"); cat("\n")}
+if(plot.svg==1){
+	cat(paste("svg file with ROC curve", svg.name, sep="=")); cat("\n")
+}else{cat("svg file with ROC curve=no"); cat("\n")}
+cat("\n")
+cat("-------------------------------------------------------"); cat("\n") 
+cat("\n")
+sink()
+write.table(a.stats, file=paste(fnames, "txt", sep="."), append=TRUE, sep=";", row.names=FALSE)
+
+EOF
+}
+
+# RGrass script generation
+# --------------------------
+RGRASSSCRIPT="`g.tempfile pid=$$`"
+if [ $? -ne 0 ] || [ -z "$RGRASSSCRIPT" ] ; then
+	g.message -e 'ERROR: unable to create temporary file for RGrass script' 1>&2
+    exit 1
+fi
+writeScript "$RGRASSSCRIPT"
+
+
+#=======================================================================
+## RGrass call
+#=======================================================================
+
+# Message
+g.message message='Working on it...'
+g.message message='-----'
+g.message message=''
+
+##using R to create MESS layers
+#--slave 
+R --no-save --no-restore --no-site-file --no-init-file --args "${GIS_OPT_OBS}" "${GIS_OPT_MOD}" "${GIS_OPT_N_BINS}" "${GIS_OPT_BUFFER_ABS}" "${GIS_OPT_BUFFER_PRES}" "${GIS_FLAG_P}" "${GIS_FLAG_S}" "${GIS_OPT_FSTATS}" < "$RGRASSSCRIPT" >> "$LOGFILE" 2>&1
+if [ $? -ne 0 ] ; then
+	echo "ERROR: an error occurred during R script execution" 1>&2
+    exit 1
+fi
+
+head -8 ${GIS_OPT_FSTATS}.txt;
+
+g.message message='-----'
+g.message "Don't forget to check out the output files"
+
+#=======================================================================
+


Property changes on: grass-addons/grass7/raster/r.edm.eval/r.edm.eval
___________________________________________________________________
Added: svn:executable
   + *

Added: grass-addons/grass7/raster/r.edm.eval/r.edm.eval.html
===================================================================
--- grass-addons/grass7/raster/r.edm.eval/r.edm.eval.html	                        (rev 0)
+++ grass-addons/grass7/raster/r.edm.eval/r.edm.eval.html	2013-03-16 10:29:21 UTC (rev 55398)
@@ -0,0 +1,13 @@
+<h2>DESCRIPTION</h2>
+
+<em>r.edm.eval</em> Environmental distribution model evaluation script. This addon is a GRASS GIS wrapper for a R script. The script evaluates how well a modeled distribution map predicts an observed distribution map. The observed distribution map should be a binary map, with 1 for presence and 0 for absence. The modeled distribution normally contains probability scores between 0 and 1, but it will work on other scales too.
+
+<p>The function returns a table with the number of true positives, true negatives (TN), false positive (FP) and false negative (FN) values. It uses this values to compute the receiver operating characteristic (ROC) curve and optionally saves it as svg or png file. It furthermore computes a number of evaluation statistics, including the maximum kappa statistics and the spec.sens (maximum sum of the specificity and sensitivity values). For both the threshold values at which these values were found are provided as well as plotted in the ROC graph. 
+
+
+<h2>NOTES</h2>
+the scripts bins the EDM values. It does this by calculating the 1/200 quantiles. With very large outliers in your EDM values, this may give erreneous values (underestimates) of your statistics.
+
+<h2>AUTHOR</h2>
+Paulo van Breugel <paulo at ecodiv.org>
+



More information about the grass-commit mailing list