[QGIS-Developer] Processing 3.0: Possible change to the Singlepart to Multipart algorithm

Mon Jun 26 16:28:53 PDT 2017

Hi all,

As you may be aware, I've been working on rebuilding the backend of
Processing in C++ and refining how it operates.

As part of this I'd like to clear up the list of existing algorithms
and also refine how they behave. This list of "QGIS" algorithms has
grown organically during the 2.x cycle, and there are now numerous
oddities in the selection of available algorithms and their options.

This discussion relates to the "Singlepart to Multipart" algorithm. I'd like to:

1. drop the option for "unique ID field". This option is used to
'collect' the geometries from features with matching ID together into
a single output geometry. I'd like to remove this option and make the
"Singlepart to Multipart" algorithm purely upgrade geometry types from
single part to multipart, without adding any extra parts or collecting
geometries. So basically the algorithm would upgrade single parts to
collections containing just a single part - the equivalent of PostGIS'
ST_Multi function.

2. add a new algorithm "Collect parts" which does what the option in
Singlepart to Multipart used to do. This would collect all features
with matching fields (you could select more than one field - unlike
the current Singlepart to Multipart option) and output collection
features containing these geometries. Just like dissolve, but without
the dissolving of overlapping parts.
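
To make the difference between the two proposed behaviours concrete,
here is a minimal sketch in plain Python. It does not use the QGIS API:
the (type_name, parts) geometry model and the function names
promote_to_multi and collect_parts are assumptions made up for this
illustration only, and it assumes all features in a group share the
same geometry type.

```python
# Toy model (an assumption, not the QGIS API): a feature is an
# (attributes, geometry) pair and a geometry is a (type_name, parts) pair.


def promote_to_multi(geometry):
    """Upgrade a single-part geometry to its multipart type.

    The parts themselves are left untouched, which is roughly what
    PostGIS' ST_Multi does.
    """
    type_name, parts = geometry
    if type_name.startswith("Multi"):
        return geometry
    return ("Multi" + type_name, list(parts))


def collect_parts(features, fields):
    """Merge the parts of all features sharing the same values in `fields`.

    Nothing is dissolved: overlapping parts are kept as separate parts.
    """
    groups = {}
    for attrs, geometry in features:
        key = tuple(attrs[name] for name in fields)
        type_name, parts = promote_to_multi(geometry)
        if key not in groups:
            groups[key] = (type_name, [])
        groups[key][1].extend(parts)
    # One output feature per distinct key, carrying only the key fields.
    return [(dict(zip(fields, key)), geom) for key, geom in groups.items()]


features = [
    ({"id": 1, "kind": "a"}, ("Point", [(0, 0)])),
    ({"id": 1, "kind": "a"}, ("Point", [(1, 1)])),
    ({"id": 2, "kind": "a"}, ("Point", [(2, 2)])),
]
promoted = [promote_to_multi(geom) for _, geom in features]
collected = collect_parts(features, ["id", "kind"])
```

In this sketch promote_to_multi only ever looks at one geometry, while
collect_parts has to see every input feature before it can emit any
output.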

Ideally "collect parts" would be replaced with the more powerful
"aggregate" algorithm implemented in
https://github.com/qgis/QGIS/pull/4210, but that's a separate piece of
work.

The motivations here are:

1. simplicity of algorithms - keeping the Singlepart to Multipart as
just a straightforward 'make this geometry multipart type' algorithm,
without the extra complexity and code inefficiency which comes with
the current 'unique ID field' support. (Complying with the goal of
keeping algorithms modular and focused on one particular task)

2. unlocking future performance gains. Ideally we want algorithms to
operate feature-by-feature whenever this is possible. This will allow
us (in some future piece of work) to implement feature pipes where
features pass through models having operations performed on each
feature in a chain (instead of the current approach of multiple
temporary output layers). This isn't planned work (yet), but making
this algorithm operate feature-by-feature now, while we have the
luxury of an API and model break, will allow it to use these
optimisations in a future 3.x release. The "unique ID field" support
prevents this algorithm from operating feature-by-feature.

Thoughts?

Nyall