[GRASS-user] v.class.mlR Error

Moritz Lennert mlennert at club.worldonline.be
Mon Jun 11 01:47:01 PDT 2018


Hi Jamille,

On Fri, 8 Jun 2018 16:14:45 -0300,
Jamille Haarloo <j.r.haarloo at gmail.com> wrote:

> Hello Moritz,
> 
> This time I asked a vector to be created with the stats and used this
> to extract training polygons in QGIS and imported the training map in
> GRASS. I had to do some interventions regarding the column names to
> make sure they are the same except for the class.
> I still get an error, and the only thing I could trace is that
> values are missing in some rows for both vectors. I am not sure
> if I should correct this and retry it all.

I haven't seen this error before, so yes, please try to eliminate the
rows with missing values. How did you compute the feature variables, and
why are there missing values?

I don't have the time to test this right now, so I prefer not to commit
as is, but you could try to edit your copy of v.class.mlR to add the
four lines marked with a plus:

     r_file.write('features <- read.csv("%s", sep="%s", header=TRUE, row.names=1)' % (feature_vars, separator))
     r_file.write("\n")
+    r_file.write("features <- na.omit(features)")
+    r_file.write("\n")
     r_file.write('training <- read.csv("%s", sep="%s", header=TRUE, row.names=1)' % (training_vars, separator))
     r_file.write("\n")
+    r_file.write("training <- na.omit(training)")
+    r_file.write("\n")

This would drop every row that has at least one missing value, in both
the features and the training table.
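
To see which rows would be dropped before editing the module, a minimal
sketch along these lines might help. It is not part of v.class.mlR: the
function name, the sample table and the list of "missing" markers are all
assumptions for illustration, and R's own NA detection in read.csv may
differ from this list.

```python
import csv
import io

# Markers treated as "missing". This list is an assumption; adjust it to
# whatever your exported attribute table actually contains.
MISSING = {"", "NA", "NaN", "nan", "NULL"}


def rows_with_missing(handle, delimiter=","):
    """Split a delimited table into complete rows and rows with gaps.

    Returns (header, complete_rows, dropped_rows).
    """
    reader = csv.reader(handle, delimiter=delimiter)
    header = next(reader)
    complete, dropped = [], []
    for row in reader:
        # A row is incomplete if it has fewer fields than the header
        # or if any field holds one of the missing markers.
        incomplete = len(row) < len(header) or any(
            value.strip() in MISSING for value in row
        )
        (dropped if incomplete else complete).append(row)
    return header, complete, dropped


# Tiny made-up table standing in for an exported attribute table.
sample = io.StringIO(
    "cat,area,band1_mean\n"
    "1,3832.0,46.4\n"
    "2,,51.3\n"
    "3,1200.0,NA\n"
)
header, complete, dropped = rows_with_missing(sample)
print("kept cats:", [row[0] for row in complete])
print("dropped cats:", [row[0] for row in dropped])
```

For a real table, replace the sample with an open file handle on the
exported CSV and pass the separator you used for the export.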

Another option would be for you to send me the data (segments
and training) privately, so that I can test.

Moritz


> 
> This is the command output:
> 
> (Fri Jun 08 15:48:28 2018)
> 
> v.class.mlR -i --overwrite
> segments_map=Segments_vector_Stats_Ben_test@haarlooj_Ben_Test
> training_map=Training_Ben5@haarlooj_Ben_Test
> raster_segments_map=best5_myregion1_at_haarlooj_Ben_Test_rank1@haarlooj_Ben_Test
> train_class_column=Ecosystem output_class_column=vote
> output_prob_column=prob classifiers=svmRadial,rf,C5.0 folds=5
> partitions=10 tunelength=10 weighting_modes=smv,qbwwv
> weighting_metric=accuracy
> classification_results=C:\Users\haarlooj\Documents\CELOS\v.class.mIRR_optional_output\Ben_test_Classifier-results
> accuracy_file=C:\Users\haarlooj\Documents\CELOS\v.class.mIRR_optional_output\Ben_test_Classifier-accuracy
> model_details=C:\Users\haarlooj\Documents\CELOS\v.class.mIRR_optional_output\Ben_test_Classifier-module-runs
> bw_plot_file=C:\Users\haarlooj\Documents\CELOS\v.class.mIRR_optional_output\Ben_test_Classifier-performance
> r_script_file=C:\Users\haarlooj\Documents\CELOS\v.class.mIRR_optional_output\Ben_test_R_script
> processes=3
> Running R now. Following output is R output.
> During startup - Warning messages:
> 1: Setting LC_CTYPE=en_US.cp1252 failed
> 2: Setting LC_COLLATE=en_US.cp1252 failed
> 3: Setting LC_TIME=en_US.cp1252 failed
> 4: Setting LC_MONETARY=en_US.cp1252 failed
> Loading required package: caret
> Loading required package: lattice
> Loading required package: ggplot2
> Loading required package: foreach
> Loading required package: iterators
> Loading required package: parallel
> During startup - Warning messages:
> 1: Setting LC_CTYPE=en_US.cp1252 failed
> 2: Setting LC_COLLATE=en_US.cp1252 failed
> 3: Setting LC_TIME=en_US.cp1252 failed
> 4: Setting LC_MONETARY=en_US.cp1252 failed
> During startup - Warning messages:
> 1: Setting LC_CTYPE=en_US.cp1252 failed
> 2: Setting LC_COLLATE=en_US.cp1252 failed
> 3: Setting LC_TIME=en_US.cp1252 failed
> 4: Setting LC_MONETARY=en_US.cp1252 failed
> During startup - Warning messages:
> 1: Setting LC_CTYPE=en_US.cp1252 failed
> 2: Setting LC_COLLATE=en_US.cp1252 failed
> 3: Setting LC_TIME=en_US.cp1252 failed
> 4: Setting LC_MONETARY=en_US.cp1252 failed
> Warning message:
> In nominalTrainWorkflow(x = x, y = y, wts = weights, info = trainInfo,  :
>   There were missing values in resampled performance measures.
> Error in `$<-.data.frame`(`*tmp*`, vote_qbwwv, value = numeric(0)) :
>   replacement has 0 rows, data has 1965
> Calls: $<- -> $<-.data.frame
> Execution halted
> ERROR: There was an error in the execution of the R script.
> Please check the R output.
> (Fri Jun 08 15:49:32 2018) Command finished (1 min 4 sec)
> 
> 
> 
> Best,
> Jamille
> 
> 
> 
> 
> On Thu, Jun 7, 2018 at 11:09 AM, Jamille Haarloo
> <j.r.haarloo at gmail.com> wrote:
> 
> > Hello Moritz,
> >
> > No worries. Thankful these modules are made available for newbies
> > in RS like me and also happy these interactions are possible for
> > learning. Hope to get back soon after some adjustments.
> >
> > Best,
> > Jamille
> >
> > On Thu, Jun 7, 2018 at 10:44 AM, Moritz Lennert
> > <mlennert at club.worldonline.be> wrote:
> >
> >> Thanks
> >>
> >> On 07/06/18 15:17, Jamille Haarloo wrote:
> >>  
> >>> The first 20+ lines of Stats_Training_Ben_test:
> >>>
> >>> cat,area,perimeter,compact_circle,compact_square,fd,WV_Benatimofo_1_min,WV_Benatimofo_1_max,WV_Benatimofo_1_range,WV_Benatimofo_1_mean,WV_Benatimofo_1_stddev,WV_Benatimofo_1_variance,WV_Benatimofo_1_coeff_var,WV_Benatimofo_1_sum,WV_Benatimofo_1_first_quart,WV_Benatimofo_1_median,WV_Benatimofo_1_third_quart,WV_Benatimofo_2_min,WV_Benatimofo_2_max,WV_Benatimofo_2_range,WV_Benatimofo_2_mean,WV_Benatimofo_2_stddev,WV_Benatimofo_2_variance,WV_Benatimofo_2_coeff_var,WV_Benatimofo_2_sum,WV_Benatimofo_2_first_quart,WV_Benatimofo_2_median,WV_Benatimofo_2_third_quart,WV_Benatimofo_3_min,WV_Benatimofo_3_max,WV_Benatimofo_3_range,WV_Benatimofo_3_mean,WV_Benatimofo_3_stddev,WV_Benatimofo_3_variance,WV_Benatimofo_3_coeff_var,WV_Benatimofo_3_sum,WV_Benatimofo_3_first_quart,WV_Benatimofo_3_median,WV_Benatimofo_3_third_quart,WV_Benatimofo_4_min,WV_Benatimofo_4_max,WV_Benatimofo_4_range,WV_Benatimofo_4_mean,WV_Benatimofo_4_stddev,WV_Benatimofo_4_variance,WV_Benatimofo_4_coeff_var,WV_Benatimofo_4_sum,WV_Benatimofo_4_first_quart,WV_Benatimofo_4_median,WV_Benatimofo_4_third_quart
> >>> 1144,3832.000000,1256.000000,5.723635,0.197144,1.729624,13,76,63,46.4097077244259,9.98454911351384,99.69122100017,21.5139237092391,177842,40,47,53,40,138,98,90.2687891440501,15.2500825418009,232.565017531741,16.8940812061464,345910,81,92,100,15,61,46,40.8582985386221,7.82663897784868,61.2562776895802,19.1555675536767,156569,36,42,47,28,124,96,68.4253653444676,13.5774536655369,184.347248039801,19.8427200164517,262206,59,68,77
> >>> 1145,12092.000000,2282.000000,5.854120,0.192750,1.645226,13,94,81,51.386288455177,10.5294376761475,110.869057775874,20.4907534532914,621363,45,52,59,21,220,199,114.230731061859,23.3590328249442,545.644414516822,20.4489917973953,1381278,101,114,128,7,76,69,46.4219318557724,8.42747122371732,71.0222712265835,18.1540726264915,561334,42,48,52,17,198,181,97.2732385047966,22.492313569247,505.904169697333,23.1228176577445,1176228,84,97,110
> >>>
> >>> [...]  
> >>
> >> ---------------------  
> >>> All the lines of the output of v.db.select
> >>> Training_Ben2@haarlooj_Ben_Test:
> >>>
> >>> cat|id|Type|code
> >>> 1|4|B29|18
> >>> 2|5|B31|19
> >>> 3|3|B28|17
> >>>  
> >>
> >>
> >> Again a lack of clear documentation on my side: both the training
> >> and the segment info should contain the same attributes, with only
> >> one additional column ('code' in your case) present in the
> >> training data.
> >>
> >> It should be possible to do this differently, i.e. provide the
> >> module with the features of all segments, and only the id/cat of
> >> each training segment with the relevant class and have the module
> >> merge the two, but this is not implemented, yet.
> >>
> >> I also just noticed that you have the word 'Training' in both names.
> >>
> >> The segments_file/segments_map contains the info (cat + all feature
> >> variables) of all segments you wish to classify, either in the
> >> form of a csv file or in the form of a vector map with the info in
> >> the attribute table.
> >>
> >> The training_file/training_map contains the info (cat + all feature
> >> variables + class) of the training data. Often this is an extract
> >> of the former, but not necessarily.
> >>
> >> All columns in the training file have to be present in the segment
> >> file, except for the class column (your 'code').
> >>
> >> Sorry for the lack of docs. This module has mostly been used
> >> internally here and so we are not always aware of the unclear and
> >> missing parts. Your feedback has been very useful!
> >>
> >> Moritz
> >>
> >>  
> >  


