[GRASS-user] v.class.mlR error

Moritz Lennert mlennert at club.worldonline.be
Fri Nov 11 03:08:04 PST 2016


On 11/11/16 11:36, James Duffy wrote:
>
>
> On 11 November 2016 at 10:07, Moritz Lennert
> <mlennert at club.worldonline.be> wrote:
>
>
>
>     On 11 November 2016 10:21:11 GMT+01:00, James Duffy
>     <james.philip.duffy at gmail.com> wrote:
>     >On 10 November 2016 at 19:37, Moritz Lennert
>     ><mlennert at club.worldonline.be>
>     >wrote:
>     >
>     >>
>     >>
>     >> On 10 November 2016 15:45:59 GMT+01:00, James Duffy
>     >> <james.philip.duffy at gmail.com> wrote:
>     >> >Hello,
>     >> >
>     >> >I'm trying to run v.class.mlR on a vector map, with a separate vector
>     >> >map containing my training data. Currently I have two classes '1' and
>     >> >'2' stored in the column 'class'. My region is set to that of the
>     >> >segments to be classified. I run the following command:
>     >> >
>     >> >v.class.mlR segments_map=gp_seg_stats_vec@gp1 \
>     >> >training_map=gp_seg_sed_grass@gp1 train_class_column=class \
>     >> >output_class_column=vote output_prob_column=prob folds=5 \
>     >> >partitions=10 tunelength=10 weighting_metric=accuracy
>     >> >
>     >> >And get the following output:
>     >> >
>     >> >Running R now. Following output is R output.
>     >> >Loading required package: caret
>     >> >Loading required package: lattice
>     >> >Loading required package: ggplot2
>     >> >Loading required package: kernlab
>     >> >
>     >> >Attaching package: ‘kernlab’
>     >> >
>     >> >The following object is masked from ‘package:ggplot2’:
>     >> >
>     >> >    alpha
>     >> >
>     >> >Loading required package: randomForest
>     >> >randomForest 4.6-12
>     >> >Type rfNews() to see new features/changes/bug fixes.
>     >> >
>     >> >Attaching package: ‘randomForest’
>     >> >
>     >> >The following object is masked from ‘package:ggplot2’:
>     >> >
>     >> >    margin
>     >> >
>     >> >Loading required package: rpart
>     >> >Error in eval(expr, envir, enclos) : object 'cat_' not found
>     >> >Calls: data.frame ... predict.train -> model.frame -> model.frame.default -> eval -> eval
>     >> >Execution halted
>     >> >ERROR: There was an error in the execution of the R script.
>     >> >       Please check the R output.
>     >> >
>     >> >
>     >> >I'm not entirely sure where it's looking for anything called 'cat_'.
>     >> >
>     >> >Any help much appreciated please.
>     >>
>     >>
>     >> Could you send us the output of v.info -c for both of the input maps?
>     >>
>     >
>     >v.info -c --verbose map=gp_seg_stats_vec@gp1
>     >
>     >INTEGER|cat
>     >DOUBLE PRECISION|area
>     >DOUBLE PRECISION|perimeter
>     >DOUBLE PRECISION|fd
>     >DOUBLE PRECISION|gpo1min
>     >DOUBLE PRECISION|com_circ
>     >DOUBLE PRECISION|gpo1max
>     >DOUBLE PRECISION|gpo1range
>     >DOUBLE PRECISION|gpo1mean
>     >DOUBLE PRECISION|gpo1stdev
>     >DOUBLE PRECISION|gpo1var
>     >DOUBLE PRECISION|gpo1sum
>     >DOUBLE PRECISION|gpo2min
>     >DOUBLE PRECISION|gpo2max
>     >DOUBLE PRECISION|gpo2range
>     >DOUBLE PRECISION|gpo2mean
>     >DOUBLE PRECISION|gpo2stdev
>     >DOUBLE PRECISION|gpo2var
>     >DOUBLE PRECISION|gpo3min
>     >DOUBLE PRECISION|gpo3max
>     >DOUBLE PRECISION|gpo3range
>     >DOUBLE PRECISION|gpo3mean
>     >DOUBLE PRECISION|gpo3stdev
>     >DOUBLE PRECISION|gpo3var
>     >DOUBLE PRECISION|gpo3sum
>     >DOUBLE PRECISION|gpo4min
>     >DOUBLE PRECISION|gpo4max
>     >DOUBLE PRECISION|gpo4range
>     >DOUBLE PRECISION|gpo4mean
>     >DOUBLE PRECISION|gpo4stdev
>     >DOUBLE PRECISION|gpo4var
>     >DOUBLE PRECISION|gpo4sum
>     >Displaying column types/names for database connection of layer <1>:
>     >(Fri Nov 11 09:17:34 2016) Command finished (0 sec)
>     >
>     >v.info -c --verbose map=gp_seg_sed_grass@gp1
>     >
>     >INTEGER|cat
>     >INTEGER|cat_
>
>     This is the problem: cat_ is in the training data, but not in the
>     segments file. Deleting this column from the training data should be
>     enough to solve the problem.
>
>
> OK, I'm pretty sure that the script made that, as the shapefile I read
> into GRASS didn't have 'cat_' as a column in the attribute table.
>
> I have deleted it, and now get the following when trying to run the
> v.class.mlR code I posted above:
>
> Loading required package: rpart
> Error in model.frame.default(Terms, newdata, na.action = na.action, xlev = object$xlevels) :
>   invalid type (closure) for variable 'type'
> Calls: data.frame ... predict.train -> model.frame -> model.frame.default
> Execution halted
> ERROR: There was an error in the execution of the R script.
>        Please check the R output.
>
>
> 'type' is a character column which just has one-word descriptions of my
> cover types in it. Does the code not deal well with non-numeric attributes?


Probably not. I'll have to look into this, but the main issue is that
the module currently assumes that all columns other than cat and class
are independent variables for the model. If I understand correctly, your
'type' variable is just a label for your class, right? If so, it should
not be part of your training data either.

A todo for the module would be to add a parameter that lets you list
the columns to use for modelling, but currently it just uses all columns
in the file.
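
In the meantime, one way to find the offending columns is to compare the
v.info -c output of the two maps. The following is only a rough sketch:
the helper names are hypothetical, the two column lists are shortened,
illustrative samples (the full column list of the training map and the
type of the 'class' column are assumptions), and the numeric check only
covers the INTEGER and DOUBLE PRECISION types seen above.

```python
# Column types treated as usable predictors (assumption: only these two).
NUMERIC_TYPES = {"INTEGER", "DOUBLE PRECISION"}


def parse_columns(v_info_output):
    """Parse `v.info -c` lines of the form TYPE|name into {name: type}."""
    columns = {}
    for line in v_info_output.splitlines():
        line = line.strip()
        if "|" not in line:
            continue  # skip status lines that are not TYPE|name pairs
        col_type, name = line.split("|", 1)
        columns[name] = col_type
    return columns


def columns_to_drop(training, segments, class_column="class", key_column="cat"):
    """List training columns that are missing from the segments map
    or are not numeric, ignoring the key and class columns."""
    keep = {key_column, class_column}
    drop = []
    for name, col_type in training.items():
        if name in keep:
            continue
        if name not in segments or col_type not in NUMERIC_TYPES:
            drop.append(name)
    return drop


# Illustrative samples only, not the real attribute tables.
training_info = """INTEGER|cat
INTEGER|cat_
INTEGER|class
CHARACTER|type
DOUBLE PRECISION|area"""

segments_info = """INTEGER|cat
DOUBLE PRECISION|area
DOUBLE PRECISION|perimeter"""

drop = columns_to_drop(parse_columns(training_info), parse_columns(segments_info))

# Build (but do not run) the command that would remove those columns.
command = "v.db.dropcolumn map=gp_seg_sed_grass columns=" + ",".join(drop)
print(command)
```

The printed v.db.dropcolumn command has to be run in the mapset that
owns the training map, and it is safer to work on a copy of the map,
since dropping columns cannot be undone.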

Moritz

