[Qgis-developer] QGIS Multi-threaded Rendering

Martin Dobias wonder.sk at gmail.com
Thu Dec 12 04:14:40 PST 2013


Hi everyone!

[attention: long text ahead]

In recent weeks I have been working on moving map rendering into
background worker threads and all related infrastructure changes.
There is still quite a lot of work to do, but finally I think it is
time for a preview and a broader discussion about the whole thing. Not
every little QGIS feature is working yet, but things work fine with
most commonly used data sources (GDAL/OGR, PostGIS, SpatiaLite).
Please give it a try! The code is available in my QGIS repository on
GitHub, the branch is called threading-revival:
https://github.com/wonder-sk/QGIS/tree/threading-revival

The plan is to continue working on the project in the following weeks
to reintroduce support for features and data providers currently not
supported (e.g. WMS, WFS). Hopefully by the time of feature freeze in
late January the code will be in a condition to be merged to master, so
multi-threaded rendering can appear in the QGIS 2.2 release.

The project already has quite some history: it started as my GSoC
project in the summer of 2010. Unfortunately it was not merged back to
the master branch because the code never reached production-level
quality. The scope of the project is not just about moving rendering
into background: it is mostly about updating various pieces of QGIS
core library and data providers to behave correctly in the case that
more threads simultaneously try to access the same resource - until
now the assumption always was that there was only one thread. Back in
2010, QGIS code was much less ready to change those assumptions. Now,
after the release of 2.0, the code is much closer to what we need for
multi-threaded rendering: both vector and raster layer code went
through a major overhaul in the preparation for 2.0.

What to expect from the project:
1. better user experience. Browsing the map in canvas gets much
snappier - pan and zoom map smoothly with instant preview, without
having to wait until rendering of the previous view is finished,
without flickers or other annoyances. Even if the map takes longer to
render, you are free to do any actions in the meantime. It is a bit
hard to describe the difference of the overall feel, one needs to try
it out :)

2. faster rendering of projects with more layers. Finally, it is
possible to use the full power of your CPU. The rendering of map
layers can be done in parallel: layers will be rendered separately at
the same time and then composited together to form the final map
image. In theory, rendering of two layers can get twice as fast. The
speedup depends a lot on your data.

3. starting point for more asynchronous operations. With safe access
to map layers from worker threads, more user actions could be
processed in background without blocking GUI, e.g. opening of
attribute table, running analyses, layer identification or change of
selection.
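
To picture the parallel scheme from item 2 above, here is a minimal
sketch in plain Python (standard library only). It is not QGIS code:
render_layer, composite and the dict-based "images" are made-up
stand-ins for illustration. The point is only the structure: each layer
is rendered on its own, and the results are stacked in layer order.

```python
from concurrent.futures import ThreadPoolExecutor


def render_layer(layer):
    """Render one layer to its own 'image' (here: a dict of pixel -> value)."""
    # Hypothetical stand-in for real drawing: each layer paints some pixels.
    return {pixel: layer["name"] for pixel in layer["pixels"]}


def composite(images):
    """Stack per-layer images bottom to top; later layers overwrite earlier ones."""
    final = {}
    for image in images:
        final.update(image)
    return final


def render_map_parallel(layers, max_workers=4):
    # Each layer is rendered independently in a worker thread.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # map() yields results in input order, so the layer order is kept
        # for compositing no matter which layer finishes first.
        images = list(pool.map(render_layer, layers))
    return composite(images)


layers = [
    {"name": "rivers", "pixels": [(0, 0), (1, 0)]},
    {"name": "roads", "pixels": [(1, 0), (2, 0)]},
]
final_map = render_map_parallel(layers)
```

Note that in standard CPython the global interpreter lock keeps
pure-Python threads from really using several cores, so this shows the
shape of the approach rather than the speedup.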

What not to expect from the project:
- faster rendering of one individual layer. A project with one layer
that took five seconds to render will still take five seconds to
render. The parallelization happens at the level of map layers. With
one map layer QGIS will still use just one core. Optimizing the
rendering performance of an individual layer is outside of the scope
of this project.

What to expect from the project *right now*: things should generally
work, except for the following:
- data providers: delimited text, gpx, grass, mssql, sql anywhere, wfs, wms, wcs
- QGIS server
- point displacement renderer

For testing, simply use QGIS as you would usually do and see if you
feel a difference when browsing the map. In the Options dialog, Rendering
tab, there are a few new configuration options for you to play with: 1.
parallel or sequential rendering, 2. map update interval. The parallel
rendering may use all your CPU power, while sequential (currently
being the default) will use just one CPU core. The default map preview
update interval is now set to 250ms - feel free to experiment with
other values. Lower values will bring faster updates, at the expense
of wasting more time doing just updates instead of real work. Parallel
rendering can be switched on/off also directly in the map canvas by
pressing 'P' key - useful when you want to quickly compare the
difference between sequential and parallel rendering. There is another
magical shortcut, 'S' key, that will show very simple stats about the
rendering (currently just total rendering time), so you can again
quickly compare the impact of various factors (antialiasing, parallel
rendering, caching etc). These shortcuts are likely to be removed from
the final version, so make sure to use them while they are still
there!

Now, it is time for some details about the design decisions I took and
their justifications. Non-developers can happily stop reading now,
developers are encouraged to read that thoroughly :-) I would be very
happy to hear what other devs think about the changes. Nothing is set
in stone yet and any critical review will help.

- QgsMapRenderer class got deprecated (but do not worry, it still
works). The problem with the class is that it does two things at once: it
stores configuration of the map and it also acts as a rendering
engine. This is impractical, because very often it is just necessary
to query or change the configuration without actually using the
rendering engine. Another problem is the fact that the rendering is
started by passing a pointer to arbitrary QPainter - it is fine for
sequential rendering, but not for parallel rendering where the
rendering happens to temporary images which are composited at some
later point. My solution was moving the map configuration (extent,
size, DPI, layers, ...) to a new class called QgsMapSettings. The
rendering engine got abstracted into a new class QgsMapRendererJob -
it is a base class with three implementations (sequential and parallel
rendering to QImage, sequential rendering to any QPainter). The class
has asynchronous API: after calling start(), the rendering will start
in the background and emit finished() signal once done. The client can
cancel() the job at any time, or call waitForFinished() to block until
the rendering is done.
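
The asynchronous job API described in this item (start, a finished
notification, cancel, blocking wait) can be sketched roughly like this.
The sketch is plain Python with the threading module, not the QGIS
class: RenderJob, on_finished and wait_for_finished are hypothetical
names standing in for the job, its finished() signal and
waitForFinished().

```python
import threading


class RenderJob:
    """Hypothetical stand-in for a map renderer job; not the QGIS class."""

    def __init__(self, layers, on_finished=None):
        self._layers = list(layers)
        self._on_finished = on_finished   # plays the role of finished()
        self._cancelled = threading.Event()
        self._thread = None
        self.rendered = []

    def start(self):
        # Returns immediately; the work runs in a background thread.
        self._thread = threading.Thread(target=self._run)
        self._thread.start()

    def cancel(self):
        # Ask the worker to stop; it checks the flag between layers.
        self._cancelled.set()

    def wait_for_finished(self):
        # Block the caller until the background work is done.
        if self._thread is not None:
            self._thread.join()

    def _run(self):
        for layer in self._layers:
            if self._cancelled.is_set():
                break
            self.rendered.append(f"image of {layer}")
        # Unlike a queued Qt signal, this callback runs in the worker thread.
        if self._on_finished is not None:
            self._on_finished(self)


finished_jobs = []
job = RenderJob(["roads", "rivers"], on_finished=finished_jobs.append)
job.start()              # does not block
job.wait_for_finished()  # optional: block until rendering is done
```

A client that does not want to block would skip wait_for_finished() and
react to the callback instead.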

- render caching has been modified. Cached images of layers used to be
stored directly in the QgsMapLayer class, however there was no context
about the stored images (what extent etc). Also, the solution does not
scale if there is more than one map renderer. Now there is a new
class, QgsMapRendererCache which keeps all cached images inside and
can be used by map renderer jobs. This encapsulation should also allow
easier modifications to the way caching of rendered layers is
done.
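
One way to picture a cache that knows the context of its images is
sketched below. This is an assumption about the general idea, not the
actual QgsMapRendererCache interface: the class name, method names and
the (extent, scale) key are all invented for illustration.

```python
class RendererCache:
    """Hypothetical cache: layer images are valid only for one map view."""

    def __init__(self):
        self._view = None    # (extent, scale) the cached images belong to
        self._images = {}    # layer id -> rendered image

    def set_view(self, extent, scale):
        # Changing the view invalidates everything rendered for the old one.
        view = (extent, scale)
        if view != self._view:
            self._view = view
            self._images.clear()

    def set_image(self, layer_id, image):
        self._images[layer_id] = image

    def image(self, layer_id):
        # Returns None when the layer has to be rendered again.
        return self._images.get(layer_id)

    def clear_layer(self, layer_id):
        # Called when a single layer changes (style, data, ...).
        self._images.pop(layer_id, None)


cache = RendererCache()
cache.set_view(extent=(0, 0, 10, 10), scale=1000)
cache.set_image("roads", "roads image")
hit = cache.image("roads")
cache.set_view(extent=(5, 5, 15, 15), scale=1000)  # user pans the map
miss = cache.image("roads")
```

Because the cache lives outside the layer, each map renderer could own
its own instance.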

- map canvas uses the new map renderer job API. Anytime the background
rendering is started, the canvas will periodically update the preview of
the new map (previously the updates were achieved by calls to
qApp->processEvents() while rendering, with various ugly side effects
and hacks). The canvas item showing the map has become ordinary canvas
item that just stores the rendered georeferenced map image. The map
configuration is internally kept in QgsMapSettings class, which is
accessible from API. It is still possible to access QgsMapRenderer
from map canvas - there is a compatibility layer that keeps
QgsMapSettings and QgsMapRenderer in sync, so all plugins should still
work.

- rendering of a map layer has changed. Previously, it would be done
by calling QgsMapLayer::draw(...). I have found this insufficient for
safe rendering in a worker thread. The issue is that during the
rendering, the user may do some changes to the internal state of the
layer which would cause fatal problems to the whole application. For
example, by changing vector layer's renderer, the old renderer would
get deleted while the worker thread is still using it. There are
generally two ways of avoiding such situations: 1. protect the shared
resource from simultaneous access by locking or 2. make a copy of the
resource. I have decided to go for the latter because: 1. there are
potentially many small resources to protect, 2. locking/waiting may
severely degrade the performance, 3. it is easy to get the locking
wrong, ending up with deadlocks or crashes, 4. copying of resources
does not need to be memory and time consuming, especially when using
implicit sharing of data (copy-on-write). I have created a new class
called QgsMapLayerRenderer. Its use case is as follows: when the
rendering is starting, QgsMapLayer::createMapRenderer() is called
(still in the main thread) and it will return a new instance of
QgsMapLayerRenderer. The instance has to store any data of the layer
that are required by the rendering routine. Then, in a worker thread,
its render() method is called that will do the actual rendering. Like
this, any intermediate changes to the state of the layer (or its
provider) will not affect the rendering.
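
The "copy instead of lock" idea from this item, reduced to a toy
example in plain Python. Layer, LayerRenderer and create_renderer are
hypothetical names that only mirror the roles described above; the real
classes copy far more state, and may rely on implicit sharing rather
than a deep copy.

```python
import copy
import threading


class Layer:
    """Hypothetical layer whose style the user may change at any time."""

    def __init__(self, name, style):
        self.name = name
        self.style = style

    def create_renderer(self):
        # Called in the main thread: copy everything render() will need.
        return LayerRenderer(self.name, copy.deepcopy(self.style))


class LayerRenderer:
    """Holds a private snapshot, never a reference back to the layer."""

    def __init__(self, name, style):
        self._name = name
        self._style = style
        self.result = None

    def render(self):
        # Called in a worker thread: touches only the snapshot.
        self.result = f"{self._name} drawn in {self._style['color']}"


layer = Layer("roads", {"color": "red"})
renderer = layer.create_renderer()   # main thread, before rendering starts
layer.style["color"] = "blue"        # user edits the layer afterwards
worker = threading.Thread(target=renderer.render)
worker.start()
worker.join()
```

The job in progress still draws with the old style; the new style is
picked up by the next renderer created from the layer.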

- concept of feature sources. For rendering of vectors in a worker
thread, we need to make sure that any data used by feature iterators
stay unchanged. For example, if the user changes the provider's query
or encoding, we are in trouble. Feature sources abstract providers:
they represent anything that can return QgsFeatureIterator (after
being given QgsFeatureRequest). A vector data provider is able to
return an implementation of a feature source which is a snapshot of
information (stored within the provider class) required to iterate
over features. For example, in OGR provider, that is layer's file
name, encoding, subset string etc, in PostGIS it is connection
information, primary key column, geometry column and other stuff.
Feature iterators of vector data providers have been updated to deal
with provider feature source instead of provider itself. Even if the
provider is deleted while we are iterating over its data in a
different thread, everything is still working smoothly because the
iterator's source is independent from the provider. Vector layer map
renderer class therefore creates vector layer's feature source, which
in turn creates a copy of layer's edit buffer and creates a provider
feature source. From that point, QgsVectorLayer class is not used
anywhere during the rendering of the layer. Remember that most of what
is copied is either small bits of data or classes supporting
copy-on-write technique, so there should not be any noticeable
performance hit resulting from the copying.
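
A toy version of the feature source idea, again in plain Python rather
than QGIS code. Provider, FeatureSource and their fields are invented
for the example; a real source would snapshot things like file name,
encoding or connection information instead of in-memory rows.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FeatureSource:
    """Immutable snapshot of what an iterator needs (hypothetical fields)."""

    rows: tuple
    subset: str

    def get_features(self, min_value=0):
        # Iterates over the snapshot only; the provider is never consulted.
        for row in self.rows:
            if row["kind"] == self.subset and row["value"] >= min_value:
                yield row


class Provider:
    """Hypothetical provider whose settings the user may change."""

    def __init__(self, rows, subset):
        self.rows = rows
        self.subset = subset

    def feature_source(self):
        # Copy the current state so later provider changes do not leak in.
        return FeatureSource(tuple(dict(r) for r in self.rows), self.subset)


provider = Provider(
    [
        {"kind": "road", "value": 3},
        {"kind": "river", "value": 7},
        {"kind": "road", "value": 9},
    ],
    subset="road",
)
source = provider.feature_source()
provider.subset = "river"   # user changes the query after the snapshot
del provider                # even dropping the provider is harmless
features = list(source.get_features(min_value=5))
```

The min_value argument plays the role of a feature request here: the
source is asked for features and answers from its own snapshot.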

- rendering of raster layers is handled by first cloning their raster
pipe and then using the cloned raster pipe for rendering. Any changes
to the raster layer state will not affect the rendering in progress.

- update to scale factors. I have always found the "scale" and "raster
scale" factors from QgsRenderContext confusing and did not properly
understand their real meaning because in various contexts (composer vs
canvas) they had different meaning and value. There were also various
rendering bugs due to wrong or no usage of these factors. By scaling
of painter before rendering and setup of correct DPI, these factors
are now always equal to one. In the future, we will be able to remove
them altogether.

- composer has also been updated to use QgsMapSettings + QgsMapRendererJob.

- labeling engine has seen some changes: it is created when starting
rendering and deleted when rendering has finished. The final labeling
is stored in a new QgsLabelingResults class, which is then propagated
(up to map canvas). Also, it is possible to cancel computation and
drawing of labeling.

- the API remained the same with only tiny changes within the
internals of labeling, diagrams, renderers and symbols, mainly due to
the fact that QgsVectorLayer is not used in the vector rendering
pipeline anymore. Callers should not see any difference (unless using
some exotic calls).

Finally, some further thoughts/questions:

- rasters - currently we do not have API to cancel requests for raster
blocks. This means that currently we need to wait until the raster
block is fully read even when we cancel the rendering job. GDAL has
some support for asynchronous requests - does anyone have experience
with it?

- rasters (again) - there are no intermediate updates of the raster
layer when rendering. What that means is that until the raster layer
is fully rendered, the preview is completely blank. There is a way to
constrain the raster block requests to smaller tiles, but what would
be the performance consequences? I am not that familiar with the way
how raster drivers are implemented in GDAL... can anyone share some
wisdom?

- PostGIS - I had to disable reuse of connections to servers because
it is not safe to use one connection from multiple threads. If reusing of
the connections is an important optimization, we will probably need to
implement some kind of connection pool from which connections would be
taken and then returned.
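
For the sake of discussion, a connection pool of the kind mentioned
above could look roughly like this. It is a generic sketch in plain
Python, not a proposal for the actual provider code: ConnectionPool and
the connect factory are hypothetical, and the "connections" are just
strings.

```python
import itertools
import queue
import threading
from contextlib import contextmanager


class ConnectionPool:
    """Minimal thread-safe pool; connect is a hypothetical factory callable."""

    def __init__(self, connect, size):
        self._idle = queue.Queue()
        for _ in range(size):
            self._idle.put(connect())

    @contextmanager
    def connection(self):
        conn = self._idle.get()     # blocks until a connection is free
        try:
            yield conn              # only this thread uses conn meanwhile
        finally:
            self._idle.put(conn)    # always return it to the pool


counter = itertools.count(1)
pool = ConnectionPool(lambda: f"conn-{next(counter)}", size=2)
used = []


def worker():
    with pool.connection() as conn:
        used.append(conn)


threads = [threading.Thread(target=worker) for _ in range(5)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

Five workers share two connections, and no connection is ever handed to
two threads at the same time.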

Okay, I think that's it. Sorry for the long mail to all the people who
read it until the end. Please give it a try - I will welcome any
feedback from the testing.

Regards
Martin

