[QGIS-Developer] Thoughts on using processing models for ETL tasks

Nyall Dawson nyall.dawson at gmail.com
Mon May 28 16:44:02 PDT 2018


Hi Magnus,

Thanks for raising this discussion! I'm also passionate about seeing
QGIS become a first-class ETL tool, and I think that with 3.2 we have
most of the underlying building blocks in place. What's needed now is a
bunch of GUI improvements and extra model interaction features (such
as the changes you've described).

I've recently filed for a QGIS funding grant to improve the processing
and modeler GUI, but this is mostly focused on stabilising what we have
(as opposed to adding new features), and strengthening it all with
good unit test coverage. (Read more here:
https://docs.google.com/document/d/1puMt5blo1FrIyFRcQDPDAvcOFOPAkxYH17q1OliGs70/edit#heading=h.z5op4p92vp9l
). Unfortunately right now the most fragile parts of processing are
the GUI components (and the external GRASS and SAGA providers -- but
there's work underway to improve that).

Anyway, I'm quite invested in processing and seeing the modeler
improved, so I'd love to be involved in these
discussions/plans/development! Specific comments inline (and links to
existing tickets) below:

On 28 May 2018 at 20:11, johnrobot <johnrobot at gmail.com> wrote:

> - I would like to be able to resize model components, for example to
> be able to see the full name of a component.

https://issues.qgis.org/issues/16279

> - Add support for copy/paste of components. If I have spent a few minutes or
> more on configuring an algorithm and need an additional copy of it in the
> model, copy/paste of the component would save me time.

https://issues.qgis.org/issues/5479

> - Add support for selecting multiple model components by drawing a box.
> Also, Ctrl + A should select all components.

No existing ticket - please file one

> - Improved support for undo/redo.

https://issues.qgis.org/issues/5471

> - Add support for grouping components. Different parts of the model might
> have different focus, such as transforming geometries, fixing
> attributes or doing calculations. I think it could help usability if I could
> group (in a visual box?) or colour the components accordingly.

No existing ticket - please file one

> - In the tool "Fix geometries", I can't specify the type of errors to look
> for and fix. That would help me understand the errors better.

What's the specific use case here? There's already a "check validity"
algorithm which exports results including descriptions of the errors
encountered.

> - I would like to be able to add a new algorithm to the model without having
> to connect it to the other components directly. I would like to be able to
> add an algorithm for later consideration.

Agreed -- I'd also like to see all algorithm parameter validation
deferred, so you could add an algorithm and not enter correct
parameter values up front. This would need to be paired with a "check
model" action - which would scan all the child algorithms and
highlight algorithms which have missing or invalid inputs.
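
To make the "check model" idea concrete, here is a rough sketch in plain
Python. It does not use the QGIS processing API: the ChildAlgorithm class,
the ("output_of", name) convention for linked inputs and the check_model
function are all hypothetical names made up for illustration.

```python
# Hypothetical, simplified model representation -- not the QGIS API.
from dataclasses import dataclass, field


@dataclass
class ChildAlgorithm:
    name: str
    required: list  # names of parameters that must be set
    # parameter name -> literal value, or ("output_of", other_child_name)
    values: dict = field(default_factory=dict)


def check_model(children):
    """Return {child name: [problem descriptions]} for children with issues."""
    known = {child.name for child in children}
    problems = {}
    for child in children:
        issues = []
        for param in child.required:
            value = child.values.get(param)
            if value is None:
                # Parameter entry was deferred and never filled in.
                issues.append(f"missing input: {param}")
            elif (isinstance(value, tuple) and value[0] == "output_of"
                  and value[1] not in known):
                # Linked to a child algorithm that is not in the model.
                issues.append(
                    f"invalid input: {param} refers to unknown algorithm {value[1]}"
                )
        if issues:
            problems[child.name] = issues
    return problems


model = [
    ChildAlgorithm("fix_geometries", ["INPUT"], {"INPUT": "roads.gpkg"}),
    ChildAlgorithm("buffer", ["INPUT", "DISTANCE"],
                   {"INPUT": ("output_of", "fix_geometries")}),
    ChildAlgorithm("dissolve", ["INPUT"], {"INPUT": ("output_of", "clip")}),
]
report = check_model(model)
```

In this example "buffer" would be flagged for its missing DISTANCE and
"dissolve" for pointing at a "clip" algorithm that does not exist, which is
the information a GUI would need in order to highlight those components.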

> - Continuing on the previous suggestion, I would like to be able to connect
> an unconnected algorithm (already added to the model) using drag and drop.

https://issues.qgis.org/issues/13500

> - I would like to see the number of features flowing through each
> connection. That would help me understand the flow and potential errors
> better.

https://issues.qgis.org/issues/5447

> - When creating a new model, I often need to run a model quite a few times
> to test it. Each time, I have to select input layers manually when starting
> the model. Can I make QGIS remember the files I picked the last time?

Good idea. Now that you mention it I've also suffered with this in the
past, but just never realised it! Now that you've pointed it out, it's
going to become a mega-annoyance.... thanks.

> - If possible, I would like to see OpenCL/CUDA support for algorithms. I
> expect some of my tasks to be rather heavy (hours to days) and using the GPU
> could potentially speed up the processing a lot.

There was also a grant application filed for this:
https://docs.google.com/document/d/1puMt5blo1FrIyFRcQDPDAvcOFOPAkxYH17q1OliGs70/edit#heading=h.1do4j282c1w1

I'd be interested to know which algorithms you expect would benefit
from this though.

> - For some jobs, I want to zip the model output directly in the model. In
> these cases, I might also want to add external files, such as PDF
> documentation, to the zip archive. Support for that would be very useful in
> corporate environments such as ours.

Great idea. A "compress files" algorithm would be very useful! File a
ticket please.
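
As a sketch only, the core of such an algorithm could be little more than
Python's standard zipfile module. The compress_files name and its
parameters are assumptions for illustration; wrapping it in an actual
processing algorithm class is left out here.

```python
import zipfile
from pathlib import Path


def compress_files(paths, archive_path):
    """Write the given files into a new zip archive and return its path."""
    archive_path = Path(archive_path)
    with zipfile.ZipFile(archive_path, "w",
                         compression=zipfile.ZIP_DEFLATED) as archive:
        for path in map(Path, paths):
            # Store only the file name so the archive has no absolute paths.
            archive.write(path, arcname=path.name)
    return archive_path
```

Model outputs and extra files such as PDF documentation would simply be
passed in the same list of paths. Note that this sketch does not handle two
inputs sharing the same file name.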

If I can add some more to this wishlist:

- Add comments to a model: https://issues.qgis.org/issues/14518 .
(Actually I already coded this at
https://github.com/nyalldawson/QGIS/commit/f20a71797, just have never
got the chance to finalize it and add some tests and push to master.
Hopefully for 3.4!)

- Allow reorganizing inputs in models: Currently the model inputs
cannot be manually ordered, which results in quasi-random orders. This
obviously is not ideal, especially when you get linked inputs
appearing in the wrong order (such as a field choice appearing before
the layer choice it's linked to!)
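
One possible rule for this, sketched in plain Python: keep the order the
user asked for, but never place an input before the input it is linked to.
The order_inputs function and the "parents" mapping are hypothetical and
not part of the existing model code.

```python
def order_inputs(inputs, parents):
    """Order inputs as listed, but never before the input they depend on.

    inputs  -- input names in the user's preferred order
    parents -- maps an input name to the input it is linked to
               (assumed to also appear in `inputs`)
    """
    ordered = []
    placed = set()

    def place(name, visiting=()):
        if name in placed:
            return
        if name in visiting:
            raise ValueError(f"circular link involving {name}")
        parent = parents.get(name)
        if parent is not None:
            # Make sure the parent (e.g. the layer) comes first.
            place(parent, visiting + (name,))
        ordered.append(name)
        placed.add(name)

    for name in inputs:
        place(name)
    return ordered


result = order_inputs(
    ["join_field", "distance", "input_layer"],
    {"join_field": "input_layer"},
)
```

Here the field choice "join_field" is pulled behind "input_layer", while
"distance" keeps its relative position.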

- There's also https://github.com/qgis/QGIS-Enhancement-Proposals/issues/84,
which is about giving more choices for algorithm parameter inputs in
models (such as expression values evaluated just before that algorithm
is executed). All the backend code for this is hooked up, it's just
missing the GUI to configure.

Nyall

