[Qgis-developer] in-memory layer as cache in DBManager

Tue Sep 2 12:04:33 PDT 2014

Hello devs,

While working on the Oracle implementation for DBManager, I have run
into large layers that take a long time to open (more than 10 seconds,
which is quite a lot for the users in my office). It is even worse with
dynamic queries because QGIS assumes that no spatial index is available
(which is true in most cases). In this situation, showing a limited
subset of the data takes as long as showing the whole set.

Other GIS tools (ArcGIS 10.x and MapInfo 10.x) provide a very basic
cache feature. Once the layer (or query) content has been loaded in
memory, every operation (zoom/pan or geoprocessing) is really fast
(on the order of milliseconds, compared to more than 10 seconds). There
is no cache synchronisation mechanism, but it just works (from the
user's point of view).

I know that there have been discussions about implementing a cache
mechanism in QGIS, but I see nothing in the code for the moment (perhaps
I am wrong?).

I think that for the moment, in-memory layers are the closest thing we
have to a cache system. They have been available for a long time and I
think that we can build on them to offer a very basic but easy-to-use
cache mechanism.

So far I've played with Python code to build an in-memory layer from a
(what I think is becoming a) big Oracle table (about 250000 objects).
Performance is quite good: building the in-memory layer takes about one
second more than retrieving the whole dataset (17 seconds). The data is
transferred only once, during the feature copy, so the total time to
build the in-memory layer is essentially the duration of the query.
Once the in-memory layer is loaded, you can pan and zoom at all levels
in less than one second. The only drawback I've found so far is some
freezing in the attribute table. It takes about ten seconds to open the
attribute table and scrolling in it is very slow. I've noticed that
when you replace float fields with integer ones, scrolling is a little
bit faster.

From the user's point of view, in-memory layers are similar to a basic
caching system:
- you grab all the data from the SRDBMS the first time you open the
layer;
- then you are able to do fast data analysis or rendering;
- knowing that if you want to work on true live data, you have to open
the real layer (no synchronisation).

For the moment, the only practical way for a user to build an in-memory
layer is to select all the features of a layer (which takes time, as
the selection launches a refresh query), copy them to the clipboard and
paste them into an in-memory layer. Furthermore, this technique doesn't
work with Date/Time attributes (the layer cannot be created).

You can build in-memory copies of a layer from the Python console, but
this is not really user-friendly. I believe that a good entry point for
loading an in-memory layer would be DBManager, either in the dbtree or
in the SQL editor widget.
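
To make the Python console approach concrete, here is a minimal sketch
of the kind of copy I mean. It is not the code I actually ran: the
function names (memory_layer_uri, copy_features, load_as_memory_layer)
and the batch size are made up for illustration, and it assumes the
QGIS 3 Python API (class and method names differ in 2.x). Features are
read in a single pass, so the data is fetched from the server only once.

```python
def memory_layer_uri(geometry_name, crs_authid=""):
    """Build the URI string understood by the "memory" provider."""
    if crs_authid:
        return "{}?crs={}".format(geometry_name, crs_authid)
    return geometry_name


def copy_features(source, target, batch_size=10000):
    """Copy fields and features from source into target.

    Features are read in one pass over source.getFeatures() and written
    in batches. batch_size is an arbitrary illustrative value.
    Returns the number of features copied.
    """
    provider = target.dataProvider()
    provider.addAttributes(source.fields().toList())
    target.updateFields()

    copied = 0
    batch = []
    for feature in source.getFeatures():
        batch.append(feature)
        if len(batch) >= batch_size:
            provider.addFeatures(batch)
            copied += len(batch)
            batch = []
    if batch:
        # Write whatever is left after the last full batch.
        provider.addFeatures(batch)
        copied += len(batch)
    return copied


def load_as_memory_layer(source, name=None):
    """Create an in-memory copy of source and add it to the project."""
    # Imported here so the two helpers above can be used outside QGIS.
    from qgis.core import QgsProject, QgsVectorLayer, QgsWkbTypes

    uri = memory_layer_uri(
        QgsWkbTypes.displayString(source.wkbType()),
        source.crs().authid(),
    )
    target = QgsVectorLayer(uri, name or source.name() + " (in memory)", "memory")
    copy_features(source, target)
    target.updateExtents()
    QgsProject.instance().addMapLayer(target)
    return target
```

From the console, something like load_as_memory_layer(iface.activeLayer())
would then copy the currently selected layer. There is no
synchronisation: the copy is a snapshot of the data at load time.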

Why add this new functionality there? First, simply because the Python
implementation is easier to code and requires no modification to the
QGIS core code. Second, it seems that performance will be good.
Furthermore, SQL queries made with the SQL dialog box of the plugin are
good candidates for in-memory layers. Loading them as in-memory layers
can be faster than building them and saving them in a local file (the
data is transferred only once from the SRDBMS server).

I'd like to know what you think of this practical approach to caching.
Do you think it is a good idea to just add a button, a checkbox or a
menu entry to load a layer or an SQL query as an in-memory layer? Do
you think DBManager is a good place for this user interface?

Best regards,

PS: I know that the ideal situation would be a true cache mechanism
directly in the core code, but (I think) it is far more complex to
implement...

-- 
Médéric RIBREUX
http://medspx.homenet.org