[GRASS-dev] [QGIS-Developer] Running grass algorithms in threads

Stefan Blumentrath Stefan.Blumentrath at nina.no
Tue Aug 14 23:16:00 PDT 2018


Dear Rudi, Nyall,

GRASS is used on HPC systems for heavy parallelisation. So, in principle, the answer is yes: you can certainly run GRASS algorithms in parallel.
On Linux, I often run several commands in parallel using xargs, and that works just fine in many cases. GRASS also has some specific Python functions for parallel processing. See also:
https://grasswiki.osgeo.org/wiki/Parallel_GRASS_jobs
https://grasswiki.osgeo.org/wiki/Parallelizing_Scripts
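
As a rough Python counterpart to the xargs approach, independent commands can be dispatched from a small worker pool. This is only a sketch: run_all is a hypothetical helper, and the placeholder commands stand in for real GRASS module calls, which would additionally need a working GRASS session in the environment.

```python
import subprocess
import sys
from concurrent.futures import ThreadPoolExecutor


def run_all(commands, max_workers=4):
    """Run each command as its own process; return exit codes in input order."""
    def run_one(cmd):
        # Each command runs in a separate OS process, so the threads here
        # only wait on child processes and do no heavy work themselves.
        return subprocess.run(cmd, capture_output=True, text=True).returncode

    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(run_one, commands))


# Placeholder commands (assumption): replace with real module invocations.
commands = [[sys.executable, "-c", f"print({i} * {i})"] for i in range(4)]
exit_codes = run_all(commands)
```

A non-zero entry in exit_codes marks the job that failed, which is easier to track than with a plain shell pipeline.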

However, whether GRASS algorithms can be run in parallel in this particular case depends on the algorithm and the setup.

E.g., if the algorithm in question temporarily modifies the computational region, parallel processes can get in each other's way.
Also, with SQLite as the DB backend, writing several vector maps (and attribute tables) in parallel will be a problem (due to SQLite locks).
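
One way to keep region changes from colliding is to give every job its own environment pointing at its own saved region. A minimal sketch, assuming the named regions were saved beforehand (for example with g.region save=...) and relying on the WIND_OVERRIDE environment variable, which makes GRASS modules use a named saved region instead of the mapset's current one. The helper job_environments and the region names are hypothetical.

```python
import os


def job_environments(region_names, base_env=None):
    """Return one environment dict per job, each with its own WIND_OVERRIDE."""
    base = dict(os.environ if base_env is None else base_env)
    envs = []
    for name in region_names:
        env = base.copy()  # never mutate the shared environment
        env["WIND_OVERRIDE"] = name
        envs.append(env)
    return envs


# Hypothetical saved-region names, one per parallel job.
envs = job_environments(["job_region_1", "job_region_2"],
                        base_env={"PATH": "/usr/bin"})
```

Each dict can then be passed as env= to subprocess.run when launching that job's command. This does not address the SQLite locking issue, which still calls for serialised writes or a different DB backend.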

In addition, whether GRASS commands can be executed in parallel within the QGIS Processing framework is probably yet another question, depending on how e.g. QGIS handles data management (locations and mapsets), especially in more complex workflows / models...

CCing also grass-dev list for more qualified answers...

Cheers
Stefan


-----Original Message-----
From: QGIS-Developer <qgis-developer-bounces at lists.osgeo.org> On Behalf Of Nyall Dawson
Sent: onsdag 15. august 2018 01:10
To: Rudi von Staden <rudivs at gmail.com>
Cc: qgis-developer <qgis-developer at lists.osgeo.org>
Subject: Re: [QGIS-Developer] Running grass algorithms in threads

On Tue, 14 Aug 2018 at 21:43, Rudi von Staden <rudivs at gmail.com> wrote:
>
> Hi all,
>
> The bottleneck in my script at the moment is the calculation of zonal stats using 'grass7:r.stats.zonal'. I thought I might speed things up by using QgsTask.fromFunction() or QgsProcessingAlgRunnerTask() to run these calculations in parallel. In my tests of both approaches the tasks seem to complete (task.status() == QgsTask.Complete), but the output file is only generated for 1 of 4 parallel tasks (the task that finishes first).
>
> I'm assuming this is because grass algorithms are not thread safe? Or am I missing something in my implementation that could make this work?

I strongly suspect that grass algorithms cannot be run in parallel.
This is why they cannot run in the background in QGIS like the native/GDAL algorithms can. But I'd love for confirmation about this and whether there's any way to make GRASS multi-thread safe.

Because this is grass related (and not QGIS specific) I'd suggest asking on the grass mailing list, and relaying any responses back here.

Nyall

>
> Thanks,
> Rudi
>
>
>
> My code for the QgsTask approach is as below:
>
> def getZonal(task, habitatModelFile, cover):
>     tempFile = QgsProcessingUtils.generateTempFilename("output.tif")
>     processing.run("grass7:r.stats.zonal", {
>         'base':habitatModelFile,
>         'cover':cover,
>         'method':5,
>         '-c':False,
>         '-r':False,
>         'output':tempFile,
>         'GRASS_REGION_PARAMETER':None,
>         'GRASS_REGION_CELLSIZE_PARAMETER':0,
>         'GRASS_RASTER_FORMAT_OPT':'',
>         'GRASS_RASTER_FORMAT_META':''},context=context,feedback=algFeedback)
>
>     if task.isCanceled():
>         deleteFile(tempFile)
>         return
>
>     return tempFile
>
> ls90Task = QgsTask.fromFunction('LS90', getZonal, habitatModelFile=hm1, cover=ls90Layer)
> QgsApplication.taskManager().addTask(ls90Task)
> feedback.pushInfo("Calculating LS14 mean...")
> ls14Task = QgsTask.fromFunction('LS14 ', getZonal, habitatModelFile=hm2, cover=ls14Layer)
> QgsApplication.taskManager().addTask(ls14Task)
> hs90Task = QgsTask.fromFunction('HS90 ', getZonal, habitatModelFile=hm3, cover=hs90Layer)
> QgsApplication.taskManager().addTask(hs90Task)
> hs14Task = QgsTask.fromFunction('HS14 ', getZonal, habitatModelFile=hm4, cover=hs14Layer)
> QgsApplication.taskManager().addTask(hs14Task)
>
> while (len([t for t in [ls90Task.status(), ls14Task.status(), hs90Task.status(),
>             hs14Task.status()] if t in [QgsTask.Running, QgsTask.Queued]]) > 0
>             and not feedback.isCanceled()):
>     sleep(1)
>
> if feedback.isCanceled():
>     # some cleanup code (send task.cancel() and wait for tasks to terminate)
>     break
>
> ls90Result = ls90Task.returned_values
> ls14Result = ls14Task.returned_values
> hs90Result = hs90Task.returned_values   # only this file exists
> hs14Result = hs14Task.returned_values
>
>
> _______________________________________________
> QGIS-Developer mailing list
> QGIS-Developer at lists.osgeo.org
> List info: https://lists.osgeo.org/mailman/listinfo/qgis-developer
> Unsubscribe: https://lists.osgeo.org/mailman/listinfo/qgis-developer

