[Qgis-developer] Processing NG/V2 - brainstorming

Mon Dec 5 09:41:07 PST 2016

I really like the idea of having basic Processing code in the core library
with Python bindings available.

Regarding the iterator approach: it sounds interesting, but we also need
to think about the possibility of working with layers that are not loaded
in QGIS. In that case, as I understand it, we would need to construct the
layer and then create an iterator from it for Processing.

Another idea (obvious and already discussed a bit) is to adapt Processing
to use the recently added Task Manager, so algs can be executed in the
background and models can be executed using multiple threads where possible.

2016-12-05 11:03 GMT+02:00 Nyall Dawson <nyall.dawson at gmail.com>:
> Hi all,
>
> I've recently been informally chatting about possible enhancements to
> processing with a few QGIS team members, and I thought it'd be worth
> starting a public brainstorm about these ideas.
>
> This is really just "thinking aloud" about what the next logical steps
> are for processing and how we can make it more competitive against
> programs like FME.
>
> So here we go... a bunch of random ideas on future processing enhancements:
>
>
> 1. Rework native algorithms to avoid layer input/outputs
>
> (Full credit goes to Matthias here). One current inefficiency with
> processing models is that every step is exported to a file-based
> format, which is then reread for the next algorithm. This means
> that a simple set of steps like buffer->reproject involves multiple
> conversions from OGR formats to QgsFeature/QgsGeometry and back to the
> OGR output format, when it could be simplified to just two operations
> on a QgsFeature's geometry and then a final write to disk. In addition
> to the inefficiency here we also lose things like long field names and
> full support for z/m/curves (depending on the intermediate file
> format).
>
> So... how to address this... Matthias came up with the idea that
> native processing algs could accept a feature iterator instead of an
> input layer, and themselves be a feature iterator. This effectively
> would make a processing model a chain of iterators through which
> features are "pulled" from a final writer step. I.e., the source
> layer->buffer->save to layer->load layer->reproject->save to layer
> model becomes:
>
> a writer
> -> which reads features from the iterator provided by a transform alg
> -> which reprojects the geometry from features provided by a buffer
> alg's output iterator
> -> which buffers the geometry on features from an iterator from the
> original source layer
>
> (Obviously, anytime a non-native algorithm (eg saga/grass/ogr) is used
> then the features would need to be written to disk first. But this is
> no different to the current behaviour so there shouldn't be any extra
> cost incurred.)
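
The chain described above can be sketched with plain Python generators. This is only an illustration of the "pull" idea, not the proposed API: the feature dicts, the toy square "buffer", the `shift` transform and all function names are hypothetical stand-ins for QgsFeature, a real buffer and a real CRS transform.

```python
from typing import Callable, Dict, Iterable, Iterator, List, Tuple

Point = Tuple[float, float]
Feature = Dict[str, object]  # hypothetical stand-in for QgsFeature


def source_layer(points: Iterable[Point]) -> Iterator[Feature]:
    """Stand-in for the feature iterator of the original source layer."""
    for fid, pt in enumerate(points):
        yield {"id": fid, "geom": [pt]}


def buffer_alg(features: Iterable[Feature], distance: float) -> Iterator[Feature]:
    """Toy 'buffer': replaces each point with a square ring around it."""
    for f in features:
        x, y = f["geom"][0]
        d = distance
        ring = [(x - d, y - d), (x + d, y - d), (x + d, y + d), (x - d, y + d)]
        yield {**f, "geom": ring}


def transform_alg(features: Iterable[Feature],
                  transform: Callable[[Point], Point]) -> Iterator[Feature]:
    """Applies a coordinate transform to every vertex of every feature."""
    for f in features:
        yield {**f, "geom": [transform(p) for p in f["geom"]]}


def writer(features: Iterable[Feature]) -> List[Feature]:
    """Final step: pulling from here drives the whole chain."""
    return list(features)


def shift(p: Point) -> Point:
    """Toy 'reprojection': a constant offset instead of a CRS transform."""
    return (p[0] + 100.0, p[1] + 200.0)


# Nothing is computed until the writer starts pulling features.
pipeline = transform_alg(
    buffer_alg(source_layer([(0.0, 0.0), (10.0, 10.0)]), 1.0), shift)
result = writer(pipeline)
```

Each feature passes through buffer and transform one at a time, and no intermediate layer is ever materialised.
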
>
> This gets a little trickier when we want to multithread something, eg.
> 2 input layers-> each buffered -> intersection of the two. But we
> could handle this by using a form of "pipe" iterator, which sucks in
> features as fast as possible from its input iterator and stores them
> in one thread, and then an algorithm in another thread consumes these
> features as they become available. Ie:
>
> thread a:
> input layer 1 iterator -> buffer 1 alg iterator -> "pipe" iterator a ->
>
>
>                                 thread c: intersection alg
> thread b:
> input layer 2 iterator -> buffer 2 alg iterator  -> "pipe" iterator b ->
>
> where thread c reads the features from "pipe iterator a" and "b" as
> they become available, and then does its processing on them.
>
> (hope that makes sense!)
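
A minimal sketch of such a "pipe" iterator, assuming a background thread that drains the input into a `queue.Queue`. `PipeIterator`, the interval-based toy buffer and the toy intersection are all hypothetical names and stand-ins; error propagation from the producer thread is left out, and the main thread plays the role of "thread c".

```python
import queue
import threading

_DONE = object()  # sentinel marking the end of the input iterator


class PipeIterator:
    """Pulls items from `source` in a background thread and stores them."""

    def __init__(self, source, maxsize=0):
        self._queue = queue.Queue(maxsize)
        self._thread = threading.Thread(
            target=self._fill, args=(source,), daemon=True)
        self._thread.start()

    def _fill(self, source):
        try:
            for item in source:
                self._queue.put(item)
        finally:
            # Always signal the end, even if the source raised.
            self._queue.put(_DONE)

    def __iter__(self):
        return self

    def __next__(self):
        item = self._queue.get()  # blocks until a feature is available
        if item is _DONE:
            self._queue.put(_DONE)  # keep raising on repeated next() calls
            raise StopIteration
        return item


def buffer_alg(values, distance):
    """Toy 1-D 'buffer': turns a number into an interval around it."""
    for v in values:
        yield (v - distance, v + distance)


def intersection_alg(pipe_a, pipe_b):
    """Toy intersection: yields the overlap of every pair of intervals."""
    b_items = list(pipe_b)  # one side has to be held in memory
    for a_lo, a_hi in pipe_a:
        for b_lo, b_hi in b_items:
            lo, hi = max(a_lo, b_lo), min(a_hi, b_hi)
            if lo < hi:
                yield (lo, hi)


# "thread a" and "thread b" start filling their pipes immediately.
pipe_a = PipeIterator(buffer_alg(iter([0.0, 10.0]), 2.0))
pipe_b = PipeIterator(buffer_alg(iter([1.0, 20.0]), 2.0))
overlaps = list(intersection_alg(pipe_a, pipe_b))
```

Passing a `maxsize` would bound how far the producer threads can run ahead of the consumer.
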
>
> 2. Georeferenced geometries
>
> I think for the approach in 1 to work we'd also need to introduce the
> concept of "referenced" geometries. This would basically be
> QgsGeometry + a QgsCoordinateReferenceSystem. It would allow retrieval
> of a geometry's CRS without requiring any knowledge of its source
> layer (or where no layer exists, eg the canvas extent as a geometry).
>
> I've pondered several approaches to this, such as:
> - QgsReferencedFeature (QgsFeature + crs): This doesn't work for
> non-feature based geometries or allow features with multiple
> geometries in different CRSes. (See
> https://github.com/qgis/qgis3.0_api/issues/21).
> - QgsReferencedGeometry: subclass of QgsGeometry with a CRS member.
> This approach would avoid adding any extra overhead to QgsGeometry.
> But given that the main use of QgsGeometry (geometry attached to a
> feature from a layer) will always have a CRS associated, this seems
> like it unnecessarily complicates the API.
>
> So my current preference would be for QgsGeometry to gain a
> QgsCoordinateReferenceSystem member variable, which is an invalid crs
> if the geometry is not referenced. This should still be quite
> lightweight given that QgsCoordinateReferenceSystem is implicitly
> shared.
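
To make the preferred option concrete, here is a rough sketch of a geometry that carries its own CRS, with an invalid CRS meaning "not referenced". `Crs` and `Geometry` are hypothetical stand-ins, not the QGIS classes; sharing one `Crs` object between geometries only loosely mimics implicit sharing.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Crs:
    """Hypothetical stand-in for QgsCoordinateReferenceSystem."""
    authid: str = ""

    def is_valid(self) -> bool:
        return bool(self.authid)


INVALID_CRS = Crs()  # used when a geometry is not referenced


@dataclass
class Geometry:
    """Hypothetical stand-in for QgsGeometry with a CRS member."""
    wkt: str
    crs: Crs = INVALID_CRS

    def is_referenced(self) -> bool:
        return self.crs.is_valid()


wgs84 = Crs("EPSG:4326")
a = Geometry("POINT (1 2)", wgs84)
b = Geometry("POINT (3 4)", wgs84)
# A geometry with no CRS attached, so it falls back to the invalid CRS.
unreferenced = Geometry("POLYGON ((0 0, 1 0, 1 1, 0 1, 0 0))")
```

Here `a` and `b` point at the same `Crs` object, so the extra member costs one reference per geometry.
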
>
>
> 3. Porting components of processing to core
>
> There's demand (from eg QField) to reuse parts of processing outside of PyQGIS.
>
> I think good candidates for porting to core would be:
> - parameters
> - inputs
> - the algorithm base class
>
> In addition to allowing use outside of Python, this would also help
> strengthen these components through the static typing that would
> result from porting to C++.
>
> I'd also like to see the results + history dialogs merged and moved to
> core so that they can also be reused for non-processing tasks (eg
> composer exports).
>
>
>
> So there we go. What are everyone's thoughts? Are these ideas worth
> pursuing? Are there other things we should be investigating for
> future processing enhancements?
>
> Nyall

-- 
Alexander Bruy