[Qgis-developer] have aggregate/window expressions ever been discussed?

Hugo Mercier hugo.mercier at oslandia.com
Wed May 28 01:30:41 PDT 2014


Le 28/05/2014 08:35, Matthias Kuhn a écrit :
> Hi All,
> 
> As the responsible person for QGIS relations, I feel obliged to share my
> thoughts in this discussion.
> 
> First of all a few notes about QGIS expressions and their current state:
> QGIS expressions are a nice and handy functionality to quickly calculate
> values based on a single feature in QGIS. This serves a lot of use-cases
> quick and easy. However, in their current state, they don't allow to do
> exactly what this thread originates from: aggregate and join data from
> other layers or use subqueries.

Yes exactly. We have more or less the WHERE clause with expressions.
Aggregates, joins and subqueries could be built using something else
(relations?)

> 
> I like the idea of using sqlites virtual tables to have immediate access
> to a huge base of functionality offered by sqlite to do complex queries.
> If we introduce this support, we have immediate support for a wide range
> of database functionality with a few lines of code. The only thing I am
> not sure (and I don't know the virtual tables implementation enough to
> answer this question) is if it is able to delegate cross-table queries
> to the original database. In short: can I do a request that requires
> data from different tables of the same database and have it executed
> directly inside the database? My suspicion is no, because the sqlite
> virtual tables will be known to sqlite as QGIS tables and it will still
> query the tables through the QGIS provider, therefore calculating e.g. a
> max functionality by querying the QGIS provider for all features and
> then calculating the "max" locally and not on the server side. This
> would be a major performance impact for customers having a single
> database that could do this calculation for us instead of doing this
> ourselves.
> 
> Instead, if we have QGIS expressions (or a QGIS query implementation on
> top of it) support for this, we are able to tell, if different tables
> are from the same database and therefore if we are able to delegate the
> whole join/aggregate/subquery whatever job to the database and let the
> database do what it's good at.
> 
> Therefore my question to the folks who know the sqlite virtual table
> code: is it possible to have sqlite virtual tables forward cross table
> queries to the database itself? Or is it possible to get access to the
> parsed query tree (or whatever the name of that may be) to determine
> based on QGIS side based on the parsed query if we are able to optimize
> by forwarding to the database.
> 

AFAIK, you do not have direct access to the query sent to a virtual
table. But the SQLite engine will ask your virtual table driver what is
the best way to resolve a constraint on columns.
For instance, if the original query has a WHERE 'a = 2', then the driver
will be asked if it knows how to quickly resolve this, using indexes
(have a look at xBestIndex at http://www.sqlite.org/vtab.html)
So I guess, you could use the remote database indexes.
But, yes that would still be not very efficient if you want to query two
tables of the same postgis database.

In my mind, SQLite virtual tables are interesting for offering a more or
less relational view on data sources that are not originally designed
for that. But it would be suboptimal for already powerful databases.

We then have two kinds of "database backends" : real databases (postgis
and ... what else ?), and pseudo-databases through SQLite virtual tables.

And we would have a conversion from QgsExpression + relations to
different dialects of SQL (SQLite / PostgreSQL / ...) in each provider ?
Does it make sense for you ?

> Concerning joins, merging this code with the relations seems a viable
> option for me in the long run. Currently, relations do no caching and
> do not "hard join" the data on the other table (meaning, joined fields
> are not available as fields) and maybe there are other things missing
> (I remember a thread here on the ML about this before)
> 
Yes in my opinion, we should try to have the same code for what we call
"joins" and more general relations. And on the GUI part, some
redundancies between these two concepts may have to be reduced as well.



More information about the Qgis-developer mailing list