<div dir="ltr">Hi Martin<div class="gmail_extra"><br><br><div class="gmail_quote">On Thu, Dec 12, 2013 at 2:14 PM, Martin Dobias <span dir="ltr"><<a href="mailto:wonder.sk@gmail.com" target="_blank">wonder.sk@gmail.com</a>></span> wrote:<br>
<blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex">Hi everyone!<br>
<br>
[attention: long text ahead]<br>
<br>
In recent weeks I have been working on moving map rendering into<br>
background worker threads and all related infrastructure changes.<br>
There is still quite a lot of work to do, but finally I think it is<br>
time for a preview and a broader discussion about the whole thing. Not<br>
every little QGIS feature is working yet, but things work fine with<br>
most commonly used data sources (GDAL/OGR, PostGIS, SpatiaLite).<br>
Please give it a try! The code is available in my QGIS repository on<br>
GitHub, the branch is called threading-revival:<br>
<a href="https://github.com/wonder-sk/QGIS/tree/threading-revival" target="_blank">https://github.com/wonder-sk/QGIS/tree/threading-revival</a><br>
<br>
The plan is to continue working on the project in the following weeks<br>
to reintroduce support for features and data providers currently not<br>
supported (e.g. WMS, WFS). Hopefully by the time of feature freeze in<br>
late January the code will be in condition to be merged to master, so the<br>
multi-threaded rendering can appear in the QGIS 2.2 release.<br>
<br>
The project already has quite some history: it started as my GSoC<br>
project in the summer of 2010, but unfortunately it was not merged back to<br>
the master branch because the code never reached production-level<br>
quality. The scope of the project is not just about moving rendering<br>
into background: it is mostly about updating various pieces of QGIS<br>
core library and data providers to behave correctly in the case that<br>
more threads simultaneously try to access the same resource - until<br>
now the assumption always was that there was only one thread. Back in<br>
2010, QGIS code was much less ready to change those assumptions. Now,<br>
after the release of 2.0, the code is much closer to what we need for<br>
multi-threaded rendering: both vector and raster layer code went<br>
through a major overhaul in the preparation for 2.0.<br>
<br>
What to expect from the project:<br>
1. better user experience. Browsing the map in canvas gets much<br>
snappier - pan and zoom map smoothly with instant preview, without<br>
having to wait until rendering of the previous view is finished,<br>
without flickers or other annoyances. Even if the map takes longer to<br>
render, you are free to do any actions in the meanwhile. It is a bit<br>
hard to describe the difference of the overall feel, one needs to try<br>
it out :)<br>
<br>
2. faster rendering of projects with more layers. Finally, it is<br>
possible to use the full power of your CPU. The rendering of map<br>
layers can be done in parallel: layers will be rendered separately at<br>
the same time and then composited together to form the final map<br>
image. In theory, rendering of two layers can get twice as fast. The<br>
speedup depends a lot on your data.<br>
<br>
3. starting point for more asynchronous operations. With safe access<br>
to map layers from worker threads, more user actions could be<br>
processed in background without blocking GUI, e.g. opening of<br>
attribute table, running analyses, layer identification or change of<br>
selection.<br>
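<br>
To make point 2 above concrete, here is a minimal sketch of "render each<br>
layer separately, then composite". It does not use the QGIS API: the names<br>
render_layer, composite and render_map are hypothetical, and an "image" is<br>
modelled as a plain dict from pixel coordinates to a color name.<br>
<pre>
```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical stand-in for a rendered layer image: pixel -> color name.
Image = dict


def render_layer(layer):
    """Pretend to render one layer into its own temporary image."""
    return {pixel: layer["color"] for pixel in layer["pixels"]}


def composite(images):
    """Stack images bottom to top; later layers overwrite earlier ones."""
    result = Image()
    for image in images:
        result.update(image)
    return result


def render_map(layers, parallel=True):
    """Render every layer, then composite in the original layer order."""
    if parallel:
        with ThreadPoolExecutor() as pool:
            # map() yields results in input order, so the stacking order is
            # preserved even when layers finish at different times.
            images = list(pool.map(render_layer, layers))
    else:
        images = [render_layer(layer) for layer in layers]
    return composite(images)


layers = [
    {"color": "green", "pixels": [(0, 0), (0, 1)]},  # bottom layer
    {"color": "blue", "pixels": [(0, 1), (1, 1)]},   # top layer
]
final_image = render_map(layers)
```
</pre>
The sketch only shows the structure. In CPython, threads do not speed up<br>
pure-Python work, so any real gain depends on the rendering happening in<br>
native code.<br>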
<br>
What not to expect from the project:<br>
- faster rendering of one individual layer. A project with one layer<br>
that took five seconds to render will still take five seconds to<br>
render. The parallelization happens at the level of map layers. With<br>
one map layer QGIS will still use just one core. Optimizing the<br>
rendering performance of an individual layer is outside of the scope<br>
of this project.<br>
<br>
What to expect from the project *right now*: things should generally<br>
work, except for the following:<br>
- data providers: delimited text, gpx, grass, mssql, sql anywhere, wfs, wms, wcs<br>
- QGIS server<br>
- point displacement renderer<br>
<br>
For testing, simply use QGIS as you would usually do and see if you<br>
feel a difference when browsing the map. In Options dialog, Rendering<br>
tab, there are a few new configuration options for you to play with: 1.<br>
parallel or sequential rendering, 2. map update interval. The parallel<br>
rendering may use all your CPU power, while sequential (currently<br>
being the default) will use just one CPU core. The default map preview<br>
update interval is now set to 250ms - feel free to experiment with<br>
other values. Lower values will bring faster updates, at the expense<br>
of wasting more time doing just updates instead of real work. Parallel<br>
rendering can be switched on/off also directly in the map canvas by<br>
pressing 'P' key - useful when you want to quickly compare the<br>
difference between sequential and parallel rendering. There is another<br>
magical shortcut, 'S' key, that will show very simple stats about the<br>
rendering (currently just total rendering time), so you can again<br>
quickly compare the impact of various factors (antialiasing, parallel<br>
rendering, caching etc). These shortcuts are likely to be removed from<br>
the final version, so make sure to use them while they are still<br>
there!<br>
<br>
Now, it is time for some details about the design decisions I took and<br>
their justifications. Non-developers can happily stop reading now,<br>
developers are encouraged to read that thoroughly :-) I would be very<br>
happy to hear what other devs think about the changes. Nothing is set<br>
in stone yet and any critical review will help.<br>
<br>
- QgsMapRenderer class got deprecated (but do not worry, it still<br>
works). The problem with the class is that it does two things at once: it<br>
stores configuration of the map and it also acts as a rendering<br>
engine. This is impractical, because very often it is just necessary<br>
to query or change the configuration without actually using the<br>
rendering engine. Another problem is the fact that the rendering is<br>
started by passing a pointer to arbitrary QPainter - it is fine for<br>
sequential rendering, but not for parallel rendering where the<br>
rendering happens to temporary images which are composited at some<br>
later point. My solution was moving the map configuration (extent,<br>
size, DPI, layers, ...) to a new class called QgsMapSettings. The<br>
rendering engine got abstracted into a new class QgsMapRendererJob -<br>
it is a base class with three implementations (sequential and parallel<br>
rendering to QImage, sequential rendering to any QPainter). The class<br>
has an asynchronous API: after calling start(), the rendering will start<br>
in the background and emit finished() signal once done. The client can<br>
cancel() the job at any time, or call waitForFinished() to block until<br>
the rendering is done.<br>
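<br>
The shape of such an asynchronous job can be sketched without Qt. This is<br>
only an illustration of the start / cancel / wait pattern described above:<br>
the class RenderJob, its attributes and the on_finished callback (standing<br>
in for the finished() signal) are hypothetical and are not the<br>
QgsMapRendererJob interface.<br>
<pre>
```python
import threading
import time


class RenderJob:
    """Toy job with the start / cancel / wait-for-finished shape."""

    def __init__(self, steps, on_finished=None):
        self._steps = steps
        self._on_finished = on_finished  # plays the role of finished()
        self._cancelled = threading.Event()
        self._thread = threading.Thread(target=self._run)
        self.steps_done = 0

    def start(self):
        """Return immediately; the work continues in the background."""
        self._thread.start()

    def cancel(self):
        """Ask the worker to stop; it checks the flag between steps."""
        self._cancelled.set()

    def wait_for_finished(self):
        """Block the caller until the worker thread has ended."""
        self._thread.join()

    def _run(self):
        for _ in range(self._steps):
            if self._cancelled.is_set():
                break
            time.sleep(0.001)  # stand-in for drawing one chunk of the map
            self.steps_done += 1
        if self._on_finished is not None:
            self._on_finished(self)


finished_jobs = []
job = RenderJob(steps=5, on_finished=finished_jobs.append)
job.start()              # does not block
job.wait_for_finished()  # blocks until all five steps are done
```
</pre>
Note that in this sketch the callback runs on the worker thread, and<br>
cancellation is cooperative: the job stops only at the next check of the flag.<br>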
<br>
- render caching has been modified. Cached images of layers used to be<br>
stored directly in the QgsMapLayer class, however there was no context<br>
about the stored images (what extent etc). Also, the solution does not<br>
scale if there is more than one map renderer. Now there is a new<br>
class, QgsMapRendererCache which keeps all cached images inside and<br>
can be used by map renderer jobs. This encapsulation should also allow<br>
easier modifications to the way how caching of rendered layers is<br>
done.<br>
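<br>
As a rough illustration of why the cache needs context, the sketch below<br>
keeps cached layer images together with the extent and scale they were<br>
rendered for, and drops them when that context changes. The class<br>
RenderCache and its method names are hypothetical, not the actual<br>
QgsMapRendererCache interface.<br>
<pre>
```python
class RenderCache:
    """Toy cache that is valid for one extent and scale at a time."""

    def __init__(self):
        self._context = None  # (extent, scale) the images belong to
        self._images = {}     # layer id -> cached image

    def init(self, extent, scale):
        """Set the context. Return True if the cached images are reusable."""
        context = (tuple(extent), scale)
        if context == self._context:
            return True
        # The view changed, so every cached image is stale.
        self._context = context
        self._images.clear()
        return False

    def set_image(self, layer_id, image):
        self._images[layer_id] = image

    def image(self, layer_id):
        """Return the cached image, or None if the layer must be rendered."""
        return self._images.get(layer_id)

    def invalidate_layer(self, layer_id):
        """Forget one layer, e.g. after its style or data changed."""
        self._images.pop(layer_id, None)


cache = RenderCache()
cache.init((0, 0, 10, 10), 1000)
cache.set_image("roads", "roads.png")  # the string stands in for an image
```
</pre>
Because the cache is an object of its own rather than state inside each<br>
layer, every map renderer can own a separate instance.<br>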
<br>
- map canvas uses the new map renderer job API. Anytime the background<br>
rendering is started, the canvas will periodically update the preview of<br>
the new map (previously, the updates were achieved by calls to<br>
qApp->processEvents() while rendering, with various ugly side effects<br>
and hacks). The canvas item showing the map has become an ordinary canvas<br>
item that just stores the rendered georeferenced map image. The map<br>
configuration is internally kept in QgsMapSettings class, which is<br>
accessible from API. It is still possible to access QgsMapRenderer<br>
from map canvas - there is a compatibility layer that keeps<br>
QgsMapSettings and QgsMapRenderer in sync, so all plugins should still<br>
work.<br>
<br>
- rendering of a map layer has changed. Previously, it would be done<br>
by calling QgsMapLayer::draw(...). I have found this insufficient for<br>
safe rendering in worker thread. The issue is that during the<br>
rendering, the user may make some changes to the internal state of the<br>
layer which would cause fatal problems to the whole application. For<br>
example, by changing vector layer's renderer, the old renderer would<br>
get deleted while the worker thread is still using it. There are<br>
generally two ways of avoiding such situations: 1. protect the shared<br>
resource from simultaneous access by locking or 2. make a copy of the<br>
resource. I have decided to go for the latter because: 1. there are<br>
potentially many small resources to protect, 2. locking/waiting may<br>
severely degrade the performance, 3. it is easy to get the locking<br>
wrong, ending up with deadlocks or crashes, 4. copying of resources<br>
does not need to be memory and time consuming, especially when using<br>
implicit sharing of data (copy-on-write). I have created a new class<br>
called QgsMapLayerRenderer. Its use case is as follows: when the<br>
rendering is starting, QgsMapLayer::createMapRenderer() is called<br>
(still in the main thread) and it will return a new instance of<br>
QgsMapLayerRenderer. The instance has to store any data of the layer<br>
that are required by the rendering routine. Then, in a worker thread,<br>
its render() method is called that will do the actual rendering. Like<br>
this, any intermediate changes to the state of the layer (or its<br>
provider) will not affect the rendering.<br>
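<br>
The copy-instead-of-lock idea can be shown in a few lines. This is a<br>
simplified sketch, not QGIS code: Layer, LayerRenderer and create_renderer<br>
are hypothetical stand-ins for the layer, its renderer object and the<br>
factory call made in the main thread.<br>
<pre>
```python
import copy
import threading


class Layer:
    """Toy layer whose state the user may change at any time."""

    def __init__(self, features, color):
        self.features = features
        self.style = {"color": color}

    def create_renderer(self):
        """Call in the main thread: copy everything rendering will need."""
        return LayerRenderer(tuple(self.features), copy.deepcopy(self.style))


class LayerRenderer:
    """Holds its own snapshot and never touches the Layer again."""

    def __init__(self, features, style):
        self._features = features
        self._style = style
        self.output = []

    def render(self):
        """Safe to call from a worker thread: only private copies are read."""
        for feature in self._features:
            self.output.append((feature, self._style["color"]))


layer = Layer(["a", "b"], "red")
renderer = layer.create_renderer()  # main thread: take the snapshot

layer.style["color"] = "blue"       # the user edits the layer afterwards
layer.features.append("c")

worker = threading.Thread(target=renderer.render)
worker.start()
worker.join()
```
</pre>
The renderer still draws two red features, because it works on the state<br>
captured when it was created. No lock is needed since nothing is shared.<br>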
<br>
- concept of feature sources. For rendering of vectors in worker<br>
thread, we need to make sure that any data used by feature iterators<br>
stay unchanged. For example, if the user changes the provider's query<br>
or encoding, we are in trouble. Feature sources abstract providers:<br>
they represent anything that can return QgsFeatureIterator (after<br>
being given QgsFeatureRequest). A vector data provider is able to<br>
return an implementation of a feature source which is a snapshot of<br>
information (stored within the provider class) required to iterate<br>
over features. For example, in OGR provider, that is layer's file<br>
name, encoding, subset string etc, in PostGIS it is connection<br>
information, primary key column, geometry column and other stuff.<br>
Feature iterators of vector data providers have been updated to deal<br>
with provider feature source instead of provider itself. Even if the<br>
provider is deleted while we are iterating over its data in a<br>
different thread, everything is still working smoothly because the<br>
iterator's source is independent from the provider. Vector layer map<br>
renderer class therefore creates vector layer's feature source, which<br>
in turn creates a copy of layer's edit buffer and creates a provider<br>
feature source. From that point, QgsVectorLayer class is not used<br>
anywhere during the rendering of the layer. Remember that most of the<br>
copied stuff is either small bits of data or classes supporting<br>
copy-on-write technique, so there should not be any noticeable<br>
performance hit resulting from the copying.<br>
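<br>
A feature source can be pictured as an immutable snapshot that is able to<br>
produce iterators on its own. The sketch below assumes a toy in-memory<br>
provider; FeatureSource, Provider, feature_source and the limit argument<br>
(a stand-in for a feature request) are hypothetical names.<br>
<pre>
```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FeatureSource:
    """Immutable snapshot of what is needed to iterate over features."""
    rows: tuple
    subset: str

    def get_features(self, limit=None):
        """Yield matching rows; 'limit' stands in for a feature request."""
        matching = (row for row in self.rows if self.subset in row)
        for index, row in enumerate(matching):
            if limit is not None and index == limit:
                return
            yield row


class Provider:
    """Toy provider whose settings may change while someone iterates."""

    def __init__(self, rows, subset=""):
        self.rows = list(rows)
        self.subset = subset

    def feature_source(self):
        """Copy the current settings into an independent snapshot."""
        return FeatureSource(tuple(self.rows), self.subset)


provider = Provider(["road 1", "river 1", "road 2"], subset="road")
source = provider.feature_source()
iterator = source.get_features()
first = next(iterator)

provider.subset = "river"  # the user changes the filter mid-iteration
del provider               # even deleting the provider is harmless

rest = list(iterator)      # still iterates with the original filter
```
</pre>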
<br>
- rendering of raster layers is handled by first cloning their raster<br>
pipe and then using the cloned raster pipe for rendering. Any changes<br>
to the raster layer state will not affect the rendering in progress.<br>
<br>
- update to scale factors. I have always found the "scale" and "raster<br>
scale" factors from QgsRenderContext confusing and did not properly<br>
understand their real meaning because in various contexts (composer vs<br>
canvas) they had different meaning and value. There were also various<br>
rendering bugs due to wrong or no usage of these factors. By scaling<br>
of painter before rendering and setup of correct DPI, these factors<br>
are now always equal to one. In the future, we will be able to remove<br>
them altogether.<br>
<br>
- composer has also been updated to use QgsMapSettings + QgsMapRendererJob.<br>
<br>
- labeling engine has seen some changes: it is created when starting<br>
rendering and deleted when rendering has finished. The final labeling<br>
is stored in a new QgsLabelingResults class, which is then propagated<br>
(up to map canvas). Also, it is possible to cancel computation and<br>
drawing of labeling.<br>
<br>
- the API remained the same with only tiny changes within the<br>
internals of labeling, diagrams, renderers and symbols, mainly due to<br>
the fact that QgsVectorLayer is not used in the vector rendering<br>
pipeline anymore. Callers should not see any difference (unless using<br>
some exotic calls).<br>
<br>
Finally, some further thoughts/questions:<br>
<br>
- rasters - currently we do not have API to cancel requests for raster<br>
blocks. This means that currently we need to wait until the raster<br>
block is fully read even when we cancel the rendering job. GDAL has<br>
some support for asynchronous requests - does anyone have experience<br>
with it?<br>
<br>
- rasters (again) - there are no intermediate updates of the raster<br>
layer when rendering. What that means is that until the raster layer<br>
is fully rendered, the preview is completely blank. There is a way to<br>
constrain the raster block requests to smaller tiles, but what would<br>
be the performance consequences? I am not that familiar with the way<br>
how raster drivers are implemented in GDAL... anyone to bring some<br>
wisdom?<br>
<br>
- PostGIS - I had to disable reuse of connections to servers because<br>
it is not safe to use one connection from multiple threads. If reusing of<br>
the connections is an important optimization, we will probably need to<br>
implement some kind of connection pool from which connections would be<br>
taken and then returned.<br>
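<br>
One possible shape for such a pool, sketched with the Python standard<br>
library only: a fixed number of connections sit in a thread-safe queue, and<br>
each is lent to at most one thread at a time. ConnectionPool and<br>
fake_connect are hypothetical; a real pool would open database sessions<br>
and would also have to deal with broken connections.<br>
<pre>
```python
import queue
from contextlib import contextmanager


class ConnectionPool:
    """Lends each connection to at most one borrower at a time."""

    def __init__(self, connect, size):
        self._idle = queue.Queue()
        for _ in range(size):
            self._idle.put(connect())

    @contextmanager
    def connection(self, timeout=None):
        """Borrow a connection; blocks while all of them are in use."""
        conn = self._idle.get(timeout=timeout)
        try:
            yield conn
        finally:
            self._idle.put(conn)  # always hand it back, even on error

    def idle_count(self):
        return self._idle.qsize()


# Hypothetical connect function: a real one would open a database session.
opened = []


def fake_connect():
    opened.append(object())
    return opened[-1]


pool = ConnectionPool(fake_connect, size=2)
```
</pre>
A worker thread would wrap its queries in "with pool.connection() as conn:",<br>
so the connection returns to the pool when the block ends.<br>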
<br>
Okay, I think that's it. Sorry for the long mail to all the people who<br>
read it until the end. Please give it a try - I will welcome any<br>
feedback from the testing.<br>
<br></blockquote><div><br></div><div>That all sounds absolutely brilliant! Thanks for such a nice clear description of how it all fits together. I know you are only considering layer-by-layer rendering but does your design accommodate further future optimisations easily? I'm thinking of things like:</div>
<div><br></div><div>* predictive / off-screen rendering of 3x3 canvas dimensions after the initial render so that any pan is near instantaneous (and would trigger a new off-screen render)</div><div>* on zoom, resample first then over render the resampled image (like OpenLayers and other web toolkits do, so you see a resampled version of the old render which gets overpainted as the new render comes in)</div>
<div>* symbol layer render in threads - I believe even single layer draws can benefit greatly from the render-then-composite approach you are taking - rendering each feature into a buffer when symbol layers are enabled (and then compositing the buffers after rendering them) means that no feature should need to be retrieved more than once when rendering.</div>
<div>* render caching of symbol layers (so that if only one layer gets changed not all others need to be re-rendered)</div><div>* progressive rendering (e.g. rendering in a generalised way à la <span name="A Huarte" class="" style="font-size:12.800000190734863px;font-family:arial,sans-serif">A Huarte<span style="white-space:nowrap">'s patches and then update the display and then start a second pass render with full detail) - that way you get a very fast first render but if you stick around at that AOI the render quality improves as the second pass happens</span></span></div>
<div><span name="A Huarte" class="" style="font-size:12.800000190734863px;font-family:arial,sans-serif"><span style="white-space:nowrap"><br></span></span></div><div><br></div><div>I know there are possible issues with memory consumption with some of the above ideas, and they are definitely not on your current roadmap, but it would be good to at least play a little mental soccer with the above ideas and see if the architecture you have devised can accommodate such further optimisations cleanly in the future.</div>
<div><br></div><div>I just built my copy of your branch while I wrote this email - can't wait to go and try it out now!</div><div><br></div><div>Regards</div><div><br></div><div>Tim</div><div><br></div><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex">
Regards<br>
Martin<br>
_______________________________________________<br>
Qgis-developer mailing list<br>
<a href="mailto:Qgis-developer@lists.osgeo.org">Qgis-developer@lists.osgeo.org</a><br>
<a href="http://lists.osgeo.org/mailman/listinfo/qgis-developer" target="_blank">http://lists.osgeo.org/mailman/listinfo/qgis-developer</a><br>
</blockquote></div><br><br clear="all"><div><br></div>-- <br><div dir="ltr">Tim Sutton - QGIS Project Steering Committee Member<br>==============================================<br>Please do not email me off-list with technical<br>
support questions. Using the lists will gain<br>more exposure for your issues and the knowledge<br>surrounding your issue will be shared with all.<br><br>Irc: timlinux on #qgis at <a href="http://freenode.net" target="_blank">freenode.net</a><br>
==============================================</div>
</div></div>