[GRASS-user] v.class.mlR Error in data.frame : arguments imply differing number of rows

Moritz Lennert mlennert at club.worldonline.be
Tue Apr 16 06:04:10 PDT 2019


On 16/04/19 14:37, Jamille Haarloo wrote:
> Hi Moritz,
> 
> Thank you! it worked.

What worked, exactly ? ;-)


> I did not find the line 'features <- na.omit(features)', nor any
> similar line, in the v.class.mlR script / R_script4 file.

Sorry, I am currently working with a heavily modified version on my 
computer, and didn't realize that this line was part of my local 
modifications.

I also see that in the manual I actually wrote "The module makes no 
effort to check the input data for NA values or anything else that might 
perturb the analyses. It is up to the user to proceed to relevant checks 
before launching the module."

I could add an na.omit to the code. What is your opinion on that as a 
user? Isn't it too invasive to just force this on the user? I do 
acknowledge that in my local case it is convenient.
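
For reference, a minimal sketch of the kind of check a user could run 
before launching the module. The data frame below is a made-up stand-in 
for the segment attributes (its column names and values are 
assumptions), not the module's actual data. It reports which rows 
na.omit would drop, so that nothing disappears silently:

```r
# Hypothetical stand-in for the attribute table of the segments map
features <- data.frame(
  mean_red = c(0.12, 0.30, NA, 0.45),
  area     = c(10, 12, 9, NA),
  row.names = c("1", "2", "3", "4")
)

# Flag rows that contain at least one NA
incomplete <- !complete.cases(features)
na_ids <- rownames(features)[incomplete]

# Report what would be dropped before actually dropping it
message(sum(incomplete), " of ", nrow(features),
        " segments contain NA: ", paste(na_ids, collapse = ", "))

# Explicit, user-controlled removal of the incomplete rows
features_clean <- na.omit(features)
```

Running the check separately leaves the decision (drop, impute, or fix 
the source data) with the user, which is the trade-off raised above.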

Moritz

> 
> Best,
> Jamille
> 
> On Mon, Apr 15, 2019 at 11:09 AM Moritz Lennert 
> <mlennert at club.worldonline.be> wrote:
> 
>     Hi Jamille,
> 
>     On 15/04/19 14:49, Jamille Haarloo wrote:
>       > Dear Moritz and other Grass-users and developers,
>       >
>       > I tried dealing with the error myself by changing
>       > predicted <- data.frame(predict(models.cv, features)) into
>       > predicted <- data.frame(predict(models.cv, features,
>       > na.action = na.exclude)), based on discussions online implying some
>       > predictions might be invalid NaN values. I checked the script output
>       > to see if this change was implemented and it was, but I get the
>       > same error.
>       > Any suggestions what to try next?
>       >
>       > ------------------------------
>       > v.class.mlR -i --overwrite segments_map=nvSegW24IDM4DV4@LUP1
>       > training_map=TrainingApril2019@LUP1 train_class_column=class_code
>       > output_class_column=output_class output_prob_column=probability
>       > classifiers=svmLinear,rf,xgbTree folds=5 partitions=10 tunelength=10
>       > weighting_modes=bwwv,qbwwv weighting_metric=accuracy
>       > classification_results=C:\Users\haarlooj\Documents\CELOS\v.class.mlr_outputapril2019\results_all_classifiers
>       > accuracy_file=C:\Users\haarlooj\Documents\CELOS\v.class.mlr_outputapril2019\accuracy_classifiers
>       > model_details=C:\Users\haarlooj\Documents\CELOS\v.class.mlr_outputapril2019\details_classifier_module_runs
>       > bw_plot_file=C:\Users\haarlooj\Documents\CELOS\v.class.mlr_outputapril2019\box-whicker_classifier_performance
>       > r_script_file=C:\Users\haarlooj\Documents\CELOS\v.class.mlr_outputapril2019\R_script4
>       > processes=3
>
>     Normally, there should be no NA in the features as there is a line:
> 
>     features <- na.omit(features)
> 
>     early in the R script. Can you see it in the R_script4 file ?
> 
> 
>       > Running R now. Following output is R output.
>       > During startup - Warning messages:
>       > 1: Setting LC_CTYPE=en_US.cp1252 failed
>       > 2: Setting LC_COLLATE=en_US.cp1252 failed
>       > 3: Setting LC_TIME=en_US.cp1252 failed
>       > 4: Setting LC_MONETARY=en_US.cp1252 failed
>       > Loading required package: caret
>       > Loading required package: lattice
>       > Loading required package: ggplot2
>       > Warning messages:
>       > 1: package 'caret' was built under R version 3.5.3
>       > 2: package 'ggplot2' was built under R version 3.5.3
>       > Loading required package: foreach
>       > Loading required package: iterators
>       > Loading required package: parallel
>       > Warning messages:
>       > 1: package 'doParallel' was built under R version 3.5.3
>       > 2: package 'foreach' was built under R version 3.5.3
>       > 3: package 'iterators' was built under R version 3.5.3
>       > During startup - Warning messages:
>       > 1: Setting LC_CTYPE=en_US.cp1252 failed
>       > 2: Setting LC_COLLATE=en_US.cp1252 failed
>       > 3: Setting LC_TIME=en_US.cp1252 failed
>       > 4: Setting LC_MONETARY=en_US.cp1252 failed
>       > During startup - Warning messages:
>       > 1: Setting LC_CTYPE=en_US.cp1252 failed
>       > 2: Setting LC_COLLATE=en_US.cp1252 failed
>       > 3: Setting LC_TIME=en_US.cp1252 failed
>       > 4: Setting LC_MONETARY=en_US.cp1252 failed
>       > During startup - Warning messages:
>       > 1: Setting LC_CTYPE=en_US.cp1252 failed
>       > 2: Setting LC_COLLATE=en_US.cp1252 failed
>       > 3: Setting LC_TIME=en_US.cp1252 failed
>       > 4: Setting LC_MONETARY=en_US.cp1252 failed
>       > Error in data.frame(id = rownames(features), predicted) :
>       >    arguments imply differing number of rows: 17851, 17849
>       > Execution halted
>     IDs are taken from the features, and for some reason there are two
>     features which do not have a prediction. It might help if you could
>     find out why.
> 
>     I cannot test right now, but you might want to check if you can replace
> 
>     ids <- rownames(features)
> 
>     with something like
> 
>     ids <- rownames(predicted)
> 
>     ?
> 
>     Moritz
> 
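
The row-count mismatch discussed in the quoted message can be reproduced 
on a toy example. This is only a sketch under stated assumptions: a 
plain lm model stands in for the trained classifiers, the data and 
segment names are invented, and the prediction step is assumed to drop 
rows containing NA (here forced with na.action = na.omit). It then 
takes the IDs from the predictions rather than from the features, as 
suggested above:

```r
# Toy model and data; all names and values are hypothetical stand-ins
train <- data.frame(
  x = 1:10,
  y = c(2.1, 3.9, 6.2, 8.1, 9.8, 12.3, 13.9, 16.2, 18.0, 19.9)
)
model <- lm(y ~ x, data = train)

# Five "segments", two of which have a missing feature value
features <- data.frame(x = c(1, NA, 3, 4, NA),
                       row.names = paste0("seg", 1:5))

# Rows with NA are dropped, so only three predictions come back
predicted <- data.frame(
  predicted = predict(model, newdata = features, na.action = na.omit)
)

# Combining IDs from features with the shorter predictions fails
err <- tryCatch(
  data.frame(id = rownames(features), predicted),
  error = function(e) conditionMessage(e)
)

# Taking the IDs from the predictions keeps both columns aligned
ids <- rownames(predicted)
result <- data.frame(id = ids, predicted)

# Segments that received no prediction, worth inspecting by hand
dropped <- setdiff(rownames(features), ids)
```

Listing the dropped IDs is one way to find out which segments are 
missing a prediction and why.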



