[QGIS-Developer] Multiprocessing QGIS

Sebastian M. Ernst ernst at pleiszenburg.de
Tue Mar 30 11:10:37 PDT 2021


Hi all,

On 29.03.21 at 01:17, Nyall Dawson wrote:
> [...] So it's quite straightforward to use
> multiprocessing in PyQGIS via the Qt methods and do things like
> calculating intersections for different objects across multiple
> threads at once without having to worry about the GIL at all...

I have just experimented a little bit with `multiprocessing` inside QGIS
(the GUI app) across multiple platforms. All performed tests are common
text-book examples for the basic use of `multiprocessing` (completely
ignoring QGIS' own internal mechanisms for similar-ish tasks). Tests 1
to 3 are process-based, i.e. they `fork` on Unix-like systems. Tests 4
and 5 are thread-based. All 5 tests work on Unix-like systems *and*
Windows inside a regular Python interpreter (without the involvement of
QGIS). If someone wants to replicate my experiments, here is the code:

https://gist.github.com/s-m-e/65197151fd3ddc74eee319252b4a9d6e
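
For orientation, the two kinds of test can be sketched roughly as follows. This is an illustrative text-book example, not the code from the gist; the names `square`, `run_process_pool` and `run_thread_pool` are hypothetical.

```python
import multiprocessing
from multiprocessing.pool import ThreadPool


def square(x):
    """Toy workload. Defined at module level so worker processes can locate it."""
    return x * x


def run_process_pool(values, workers=2):
    # Process-based: the workers are separate interpreter processes
    # (forked or spawned, depending on platform and start method).
    with multiprocessing.Pool(processes=workers) as pool:
        return pool.map(square, values)


def run_thread_pool(values, workers=2):
    # Thread-based: same API, but the workers are threads in this process.
    with ThreadPool(processes=workers) as pool:
        return pool.map(square, values)


if __name__ == "__main__":
    # The guard matters where workers are spawned (e.g. on Windows),
    # because each worker re-imports the main module.
    print(run_thread_pool(range(5)))
    print(run_process_pool(range(5)))
```

In a regular Python interpreter both functions should return the same list; the difference only shows up in how the workers are started.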

All 5 tests work flawlessly in QGIS 3.14.0 on Linux. I was lacking the
time to go through newer versions, but I'd expect no difference.

While tests 4 and 5 do work, tests 1 to 3, the process-based ones, fail
in QGIS 3.18.1 (202f1bf7e5) on Windows 10 (OSGeo4W). Actually the way
they are failing is interesting. The failure happens even at an earlier
stage than I expected:

`AttributeError: module 'sys' has no attribute 'argv'`

It basically means that QGIS' C++ code does not forward `argv` to the
Python interpreter. On Linux it does not matter, but on Windows it does:
`multiprocessing` prepares to start new worker processes from scratch
(because there is no `fork`) and tries to access the main process'
`argv`, so it can eventually forward it to the workers. The error
therefore occurs before any attempt is made to start worker processes,
which is where I was originally expecting a failure. I think the `argv`
issue is a trivial thing to fix. Right after the Python interpreter gets
initialized ...

https://github.com/qgis/QGIS/blob/70eac1c97255ea31e3e5d57fc7cea721d080c8e8/src/python/qgspythonutilsimpl.cpp#L158

... a call to `PySys_SetArgv` is required:

https://docs.python.org/3/c-api/init.html?highlight=pysys_setargv#c.PySys_SetArgv
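
Until the C++ side is changed, a Python-level stopgap could look like the sketch below. It is an assumption that this is sufficient to get past the `AttributeError`; it has not been verified inside QGIS on Windows, and it does nothing about whatever fails afterwards. `ensure_argv` is a hypothetical helper name.

```python
import sys


def ensure_argv(default=("",)):
    """Give `sys` an `argv` attribute if the embedding application set none.

    Assumption: a one-element list with an empty string mimics what an
    interactive interpreter provides and is enough for `multiprocessing`.
    """
    if not hasattr(sys, "argv"):
        sys.argv = list(default)
    return sys.argv
```

It would have to be called once, for example from the QGIS Python console, before any `multiprocessing` pool is created.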

Once it is fixed, the question then becomes what happens / fails next. I
do not have an operational Windows build environment at the moment.
Would someone who has one like to try this? I'd be really interested in
the result.

Bottom line: Simple process-based parallelism with `multiprocessing`
**on Windows**, which is actually very common in the Python world, has
definitely been broken in QGIS for several versions.

Best regards,
Sebastian

