[GRASS-user] v.class.mlR error
James Duffy
james.philip.duffy at gmail.com
Fri Nov 11 08:53:43 PST 2016
On 11 Nov 2016 11:08 am, "Moritz Lennert" <mlennert at club.worldonline.be>
wrote:
>
> On 11/11/16 11:36, James Duffy wrote:
>>
>>
>>
>> On 11 November 2016 at 10:07, Moritz Lennert
>> <mlennert at club.worldonline.be> wrote:
>>
>>
>>
>> On 11 November 2016 10:21:11 GMT+01:00, James Duffy
>> <james.philip.duffy at gmail.com> wrote:
>> >On 10 November 2016 at 19:37, Moritz Lennert
>> ><mlennert at club.worldonline.be> wrote:
>> >
>> >>
>> >>
>> >> On 10 November 2016 15:45:59 GMT+01:00, James Duffy
>> >> <james.philip.duffy at gmail.com> wrote:
>>
>> >> >Hello,
>> >> >
>> >> >I'm trying to run v.class.mlR on a vector map, with a separate vector
>> >> >map containing my training data. Currently I have two classes, '1' and
>> >> >'2', stored in the column 'class'. My region is set to that of the
>> >> >segments to be classified. I run the following command:
>> >> >
>> >> >v.class.mlR segments_map=gp_seg_stats_vec@gp1 \
>> >> >training_map=gp_seg_sed_grass@gp1 train_class_column=class \
>> >> >output_class_column=vote output_prob_column=prob folds=5 \
>> >> >partitions=10 tunelength=10 weighting_metric=accuracy
>> >> >
>> >> >And get the following output:
>> >> >
>> >> >Running R now. Following output is R output.
>> >> >Loading required package: caret
>> >> >Loading required package: lattice
>> >> >Loading required package: ggplot2
>> >> >Loading required package: kernlab
>> >> >
>> >> >Attaching package: ‘kernlab’
>> >> >
>> >> >The following object is masked from ‘package:ggplot2’:
>> >> >
>> >> > alpha
>> >> >
>> >> >Loading required package: randomForest
>> >> >randomForest 4.6-12
>> >> >Type rfNews() to see new features/changes/bug fixes.
>> >> >
>> >> >Attaching package: ‘randomForest’
>> >> >
>> >> >The following object is masked from ‘package:ggplot2’:
>> >> >
>> >> > margin
>> >> >
>> >> >Loading required package: rpart
>> >> >Error in eval(expr, envir, enclos) : object 'cat_' not found
>> >> >Calls: data.frame ... predict.train -> model.frame -> model.frame.default -> eval -> eval
>> >> >Execution halted
>> >> >ERROR: There was an error in the execution of the R script.
>> >> > Please check the R output.
>> >> >
>> >> >
>> >> >I'm not entirely sure where it's looking for anything called 'cat_'.
>> >> >
>> >> >Any help would be much appreciated.
>> >>
>> >>
>> >> Could you send us the output of v.info -c for both of the input maps?
>> >>
>> >
>> >v.info -c --verbose map=gp_seg_stats_vec@gp1
>>
>> >
>> >INTEGER|cat
>> >DOUBLE PRECISION|area
>> >DOUBLE PRECISION|perimeter
>> >DOUBLE PRECISION|fd
>> >DOUBLE PRECISION|gpo1min
>> >DOUBLE PRECISION|com_circ
>> >DOUBLE PRECISION|gpo1max
>> >DOUBLE PRECISION|gpo1range
>> >DOUBLE PRECISION|gpo1mean
>> >DOUBLE PRECISION|gpo1stdev
>> >DOUBLE PRECISION|gpo1var
>> >DOUBLE PRECISION|gpo1sum
>> >DOUBLE PRECISION|gpo2min
>> >DOUBLE PRECISION|gpo2max
>> >DOUBLE PRECISION|gpo2range
>> >DOUBLE PRECISION|gpo2mean
>> >DOUBLE PRECISION|gpo2stdev
>> >DOUBLE PRECISION|gpo2var
>> >DOUBLE PRECISION|gpo3min
>> >DOUBLE PRECISION|gpo3max
>> >DOUBLE PRECISION|gpo3range
>> >DOUBLE PRECISION|gpo3mean
>> >DOUBLE PRECISION|gpo3stdev
>> >DOUBLE PRECISION|gpo3var
>> >DOUBLE PRECISION|gpo3sum
>> >DOUBLE PRECISION|gpo4min
>> >DOUBLE PRECISION|gpo4max
>> >DOUBLE PRECISION|gpo4range
>> >DOUBLE PRECISION|gpo4mean
>> >DOUBLE PRECISION|gpo4stdev
>> >DOUBLE PRECISION|gpo4var
>> >DOUBLE PRECISION|gpo4sum
>> >Displaying column types/names for database connection of layer <1>:
>> >(Fri Nov 11 09:17:34 2016) Command finished (0 sec)
>> >
>> >v.info -c --verbose map=gp_seg_sed_grass@gp1
>>
>> >
>> >INTEGER|cat
>> >INTEGER|cat_
>>
>> This is the problem: cat_ is in the training data, but not in the
>> segments file. Erasing this column from the training data should be
>> enough to solve the problem.
>>
>>
>> OK, I'm pretty sure the script created it, as the shapefile I read
>> into GRASS didn't have 'cat_' as a column in the attribute table.
>>
>> I have deleted it, and now get the following when trying to run the
>> v.class.mlR command I posted above:
>>
>> Loading required package: rpart
>> Error in model.frame.default(Terms, newdata, na.action = na.action, xlev = object$xlevels) :
>> invalid type (closure) for variable 'type'
>> Calls: data.frame ... predict.train -> model.frame -> model.frame.default
>> Execution halted
>> ERROR: There was an error in the execution of the R script.
>> Please check the R output.
>>
>>
>> 'type' is a character column that just holds one-word descriptions of my
>> cover types. Does the code not deal well with non-numeric attributes?
>
>
>
> Probably not. I'll have to look into this, but the main issue is actually
> that the module currently assumes that all columns other than cat and
> class are used as independent variables for the model. IIUC, your 'type'
> variable is just a label for your class, isn't it? Then it should not be
> part of your training data either.
>
> A todo for the module would be to add a parameter for listing the
> columns to use for modeling, but currently it just uses all columns in
> the file.
>
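The behaviour described above (every training column other than cat and the class column is treated as a predictor) suggests a quick check before running the module: any predictor in the training map that is absent from the segments map will fail at prediction time. Below is a minimal sketch of such a check, working on the text printed by v.info -c. The function names are hypothetical and not part of v.class.mlR, and the column listings are shortened, illustrative samples because the full training table was not posted in this thread.

```python
# Sketch only: helper names are hypothetical, not part of v.class.mlR.

def parse_columns(v_info_output):
    """Turn `v.info -c` text (lines like 'INTEGER|cat') into {name: type}."""
    columns = {}
    for line in v_info_output.splitlines():
        if "|" not in line:
            continue  # skip messages such as "Displaying column types/names ..."
        col_type, name = line.strip().split("|", 1)
        columns[name] = col_type
    return columns


def missing_predictors(training, segments, class_column="class"):
    """Training columns treated as predictors that the segments map lacks."""
    predictors = [c for c in training if c not in ("cat", class_column)]
    return [c for c in predictors if c not in segments]


# Shortened, illustrative listings (assumed layout of the training table).
segments_info = """INTEGER|cat
DOUBLE PRECISION|area
DOUBLE PRECISION|perimeter"""

training_info = """INTEGER|cat
INTEGER|cat_
INTEGER|class
DOUBLE PRECISION|area
DOUBLE PRECISION|perimeter"""

missing = missing_predictors(parse_columns(training_info),
                             parse_columns(segments_info))
print(missing)  # columns to drop from the training map before classifying
```

With these sample listings the only column reported is cat_, which matches the first error in this thread.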
Thanks for the suggestion. Removing that column worked. Could I suggest
adding a check, either before reading the data into R or before the
analysis in R, that drops any character columns from the modelling but
keeps them in the attribute table?
James
> Moritz
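The check suggested above could look roughly like the following. This is a sketch under stated assumptions, not the module's actual code: the function names are hypothetical, the listing is an illustrative sample, and it assumes that v.info -c reports text columns with the type CHARACTER. The attribute table itself is left untouched; only the list of predictors handed to the model is filtered.

```python
# Sketch only: helper names are hypothetical, not part of v.class.mlR.

# Column types accepted as model predictors (assumed type names).
NUMERIC_TYPES = {"INTEGER", "DOUBLE PRECISION"}


def parse_columns(v_info_output):
    """Turn `v.info -c` text (lines like 'INTEGER|cat') into {name: type}."""
    columns = {}
    for line in v_info_output.splitlines():
        if "|" not in line:
            continue  # skip informational messages
        col_type, name = line.strip().split("|", 1)
        columns[name] = col_type
    return columns


def split_predictors(columns, class_column="class"):
    """Return (usable, skipped) predictor names; non-numeric ones are skipped."""
    usable, skipped = [], []
    for name, col_type in columns.items():
        if name in ("cat", class_column):
            continue  # key and class columns are never predictors
        if col_type in NUMERIC_TYPES:
            usable.append(name)
        else:
            skipped.append(name)
    return usable, skipped


# Illustrative training-map listing; 'CHARACTER|type' is an assumed entry.
training_info = """INTEGER|cat
INTEGER|class
CHARACTER|type
DOUBLE PRECISION|area
DOUBLE PRECISION|perimeter"""

usable, skipped = split_predictors(parse_columns(training_info))
print("predictors:", usable)
print("ignored (non-numeric):", skipped)
```

Reporting the skipped columns, rather than dropping them silently, would let the user see why a column such as 'type' was left out of the model.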