[GRASS-user] v.class.mlR Error in data.frame : arguments imply differing number of rows

Tue Apr 16 07:50:52 PDT 2019

Replacing
ids <- rownames(features)
with
ids <- rownames(predicted)
is the only edit I made after the previous try, so this must be what solved
the error.
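
For reference, a minimal sketch of why this edit removes the error. It uses
a toy data frame, not the actual v.class.mlR script: the column name
mean_b1, the segment ids and the threshold rule standing in for predict()
are all assumptions made up for illustration.

```r
# Toy stand-in for the segment features; row names play the role of segment ids.
features <- data.frame(
  mean_b1 = c(0.2, NA, 0.5, 0.7, 0.9),
  row.names = c("101", "102", "103", "104", "105")
)

# Hypothetical stand-in for the prediction step: rows containing NA
# receive no prediction, so 'predicted' ends up one row shorter.
complete <- na.omit(features)
predicted <- data.frame(
  output_class = ifelse(complete$mean_b1 > 0.6, 2L, 1L),
  row.names = rownames(complete)
)

# Taking the ids from 'features' fails because the row counts differ (5 vs 4).
res <- try(data.frame(id = rownames(features), predicted), silent = TRUE)
failed <- inherits(res, "try-error")

# Taking the ids from 'predicted' keeps ids and predictions aligned.
ids <- rownames(predicted)
result <- data.frame(id = ids, predicted)
```

In this sketch, result has four rows and segment 102 is simply absent,
which mirrors the 17849 of 17851 segments in my case.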

If I understood correctly, na.action = na.exclude can help to work around
the NA values without deleting rows, but somehow it did not work. The user
can always compare the original data rows to the results, right? In my
case, comparing the results file to the vector file shows that 17849 of
17851 segments were classified, as expected from the error. I did not lose
any original data and also saved a copy. If you mention in the manual that
intermediate NA values might block the analyses and will therefore be
omitted from the final results, it should be perfectly fine.
I noticed a minor issue: not all results were added to the vector file.
About 686 segments (almost 4% of the data) were somehow missed. Fortunately,
I do have the results as a separate output file.
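
A minimal sketch of how such a comparison could be done in R, assuming the
segment ids and the results are available as plain vectors/data frames. The
names segment_ids, results and output_class, and the values in them, are
hypothetical and not taken from the module output.

```r
# Hypothetical ids of all segments in the vector map.
segment_ids <- c("101", "102", "103", "104", "105")

# Hypothetical classification results; segment 102 has no prediction.
results <- data.frame(
  id = c("101", "103", "104", "105"),
  output_class = c(1L, 1L, 2L, 2L),
  stringsAsFactors = FALSE
)

# Segments that are in the map but not in the results.
missing_ids <- setdiff(segment_ids, results$id)

# Left join: every segment is kept, unclassified ones get NA as class.
all_segments <- data.frame(id = segment_ids, stringsAsFactors = FALSE)
joined <- merge(all_segments, results, by = "id", all.x = TRUE)
```

Here missing_ids lists the segments without a prediction, and joined keeps
one row per original segment, so nothing is dropped silently.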

Best,
Jamille

On Tue, Apr 16, 2019 at 10:04 AM Moritz Lennert <
mlennert at club.worldonline.be> wrote:

> On 16/04/19 14:37, Jamille Haarloo wrote:
> > Hi Moritz,
> >
> > Thank you! it worked.
>
> What worked, exactly ? ;-)
>
>
> > I did not find the line nor similar lines of 'features <-
> > na.omit(features)' in the v.class.mlR script/ R_script4 file.
>
> Sorry, I am working with a heavily modified version here on my computer
> currently, and didn't realize that this was part of my local modifications.
>
> I also see that in the manual I actually wrote "The module makes no
> effort to check the input data for NA values or anything else that might
> perturb the analyses. It is up to the user to proceed to relevant checks
> before launching the module."
>
> I could add an na.omit to the code. What is your opinion on that as a
> user ? Isn't it too invasive to just force this on the user ? I do
> acknowledge that in my local case it is convenient.
>
> Moritz
>
> >
> > Best,
> > Jamille
> >
> > On Mon, Apr 15, 2019 at 11:09 AM Moritz Lennert
> > <mlennert at club.worldonline.be> wrote:
> >
> >     Hi Jamille,
> >
> >     On 15/04/19 14:49, Jamille Haarloo wrote:
> >       > Dear Moritz and other Grass-users and developers,
> >       >
> >       > I tried dealing with the error myself by changing
> >       > predicted <- data.frame(predict(models.cv, features))
> >       > into
> >       > predicted <- data.frame(predict(models.cv, features,
> >       > na.action = na.exclude)), based on discussions online implying
> >       > some predictions might be invalid NaN values. I checked the script
> >       > output to see if this change was implemented and it was, but I get
> >       > the same error.
> >       > Any suggestions what to try next?
> >       > ------------------------------
> >       > v.class.mlR -i --overwrite segments_map=nvSegW24IDM4DV4@LUP1
> >       > training_map=TrainingApril2019@LUP1 train_class_column=class_code
> >       > output_class_column=output_class output_prob_column=probability
> >       > classifiers=svmLinear,rf,xgbTree folds=5 partitions=10 tunelength=10
> >       > weighting_modes=bwwv,qbwwv weighting_metric=accuracy
> >       > classification_results=C:\Users\haarlooj\Documents\CELOS\v.class.mlr_outputapril2019\results_all_classifiers
> >       > accuracy_file=C:\Users\haarlooj\Documents\CELOS\v.class.mlr_outputapril2019\accuracy_classifiers
> >       > model_details=C:\Users\haarlooj\Documents\CELOS\v.class.mlr_outputapril2019\details_classifier_module_runs
> >       > bw_plot_file=C:\Users\haarlooj\Documents\CELOS\v.class.mlr_outputapril2019\box-whicker_classifier_performance
> >       > r_script_file=C:\Users\haarlooj\Documents\CELOS\v.class.mlr_outputapril2019\R_script4
> >       > processes=3
> >
> >     Normally, there should be no NA in the features as there is a line:
> >
> >     features <- na.omit(features)
> >
> >     early in the R script. Can you see it in the R_script4 file ?
> >
> >
> >       > Running R now. Following output is R output.
> >       > During startup - Warning messages:
> >       > 1: Setting LC_CTYPE=en_US.cp1252 failed
> >       > 2: Setting LC_COLLATE=en_US.cp1252 failed
> >       > 3: Setting LC_TIME=en_US.cp1252 failed
> >       > 4: Setting LC_MONETARY=en_US.cp1252 failed
> >       > Loading required package: caret
> >       > Loading required package: lattice
> >       > Loading required package: ggplot2
> >       > Warning messages:
> >       > 1: package 'caret' was built under R version 3.5.3
> >       > 2: package 'ggplot2' was built under R version 3.5.3
> >       > Loading required package: foreach
> >       > Loading required package: iterators
> >       > Loading required package: parallel
> >       > Warning messages:
> >       > 1: package 'doParallel' was built under R version 3.5.3
> >       > 2: package 'foreach' was built under R version 3.5.3
> >       > 3: package 'iterators' was built under R version 3.5.3
> >       > During startup - Warning messages:
> >       > 1: Setting LC_CTYPE=en_US.cp1252 failed
> >       > 2: Setting LC_COLLATE=en_US.cp1252 failed
> >       > 3: Setting LC_TIME=en_US.cp1252 failed
> >       > 4: Setting LC_MONETARY=en_US.cp1252 failed
> >       > During startup - Warning messages:
> >       > 1: Setting LC_CTYPE=en_US.cp1252 failed
> >       > 2: Setting LC_COLLATE=en_US.cp1252 failed
> >       > 3: Setting LC_TIME=en_US.cp1252 failed
> >       > 4: Setting LC_MONETARY=en_US.cp1252 failed
> >       > During startup - Warning messages:
> >       > 1: Setting LC_CTYPE=en_US.cp1252 failed
> >       > 2: Setting LC_COLLATE=en_US.cp1252 failed
> >       > 3: Setting LC_TIME=en_US.cp1252 failed
> >       > 4: Setting LC_MONETARY=en_US.cp1252 failed
> >       > Error in data.frame(id = rownames(features), predicted) :
> >       >    arguments imply differing number of rows: 17851, 17849
> >       > Execution halted
> >     IDs are taken from the features, and for some reason there are two
> >     features which do not have a prediction. It might help if you could
> >     find out why.
> >
> >     I cannot test right now, but you might want to check if you can
> >     replace
> >
> >     ids <- rownames(features)
> >
> >     with something like
> >
> >     ids <- rownames(predicted)
> >
> >     ?
> >
> >     Moritz
> >
>
>
>