[Qgis-developer] QEP/RFC sqlite virtual tables
Matthias Kuhn
matthias.kuhn at gmx.ch
Wed Oct 22 02:21:33 PDT 2014
Moving this to a new thread (from 1:many relation...
On 22.10.2014 10:38, Hugo Mercier wrote:
> Well, QgsExpression has been designed to mimick the SQL WHERE clause, so
> yes I agree.
> My point is that if you want to extend expression to support joins,
> aggregates, window functions, etc. you end up recreating SQL, but
> probably a less powerful one.
You can call it mimicking, you can also call it implementing it's own
SQL (subset) dialect.
Yes, you end up recreating SQL. How powerful it will be is hard to say
right now, right?
> The difference I can see between expressions and SQL is that functions
> used in expressions may have no direct equivalent SQLite-side. But it
> can be resolved by using user-defined functions in SQLite (that would
> just call code from QgsExpression for instance), so that the SQL dialect
> would be seen as an extension of the current QgsExpression, not
> something different.
>
>>> :)
>>> :
>>>> I don't see anything that prevents virtual layer queries from being
>>>> optimized server-side. It has to be taken into account during the
>>>> implementation of virtual layers, of course, but it is hard to exhibit
>>>> the precise way it will work now without having an almost complete
>>>> implementation.
>>> I was referring to this discussion here.
>>>
>>> https://github.com/qgis/QGIS-Enhancement-Proposals/pull/5#issuecomment-5814
>>> 8788
>>>
>>> Sorry I did not respond again, but I don't think a complete optimization
>>> is possible without having deep access to virtual table functionality in
>>> sqlite (or having to parse the same SQL on our end and replace parts of
>>> it which basically comes close to writing our own engine again).
>>> But let's discuss this on the QEP PR instead of inside this ml thread.
>> Just to give you my feeback based on my implementation of SQLite SQL dialect
>> in OGR (
>> https://github.com/OSGeo/gdal/blob/trunk/gdal/ogr/ogrsf_frmts/sqlite/ogrsqlitevirtualogr.cpp
>> ), I also doubt that you could easily inform the server that a join should be
>> done. AFAIR I could forward simple filtering to the server/backend (things like
>> "column_A >= 5 AND column_A <= 10 AND column_B != 10") by implementing the
>> xFilter virtual method, but Virtual Tables are not informed of joins happening
>> (the sqlite virtual table API hardy exposes SQL at all). So that would indeed
>> require analyzing the SQL on QGIS side.
>>
>> AFAIR, I've just verified that JOIN like "t1.col1 = t2.col2" are seen by the
>> virtual table implementation like successive filters "t1.col1 = val1", "t1.col1
>> = val2", where val1, val2 are values from t2.col2.
>>
>> And regarding xFilter, even filters with OR etc.. are hidden to the virtual
>> table implementation and evaluated only by SQLite.
>>
> Thanks Even for your feedback.
>
> The idea under virtual layers is that SQLite may be exposed to advanced
> users if they want to use a complex query on some data sources.
>
> It is very interesting to use SQLite to replace existing features from a
> code point of view because it simplifies implementation.
> But exposing a SQL query for already existing features such as joins may
> not be desirable for standard users.
>
> So for joins, since the information comes from QGIS ("I want to join
> tables t1 and t2 on this column predicate"), the SQL query can be
> deduced and some other metadata is also known, like the fact that t1 and
> t2 may be from the very same database.
> In that case the virtual layer may be created with some metadata saying
> that this is in fact a join that must be optimized. And then the virtual
> layer provider can ask the underlying provider (say postgis) if it can
> process the whole join query.
I am not only referring to the current relations implementation here. I
don't think aggregate functions need necessarily be based on creating a
relation definition in the project file (Just like in a database you
also don't need a foreign key constraint to create join).
But if I want to create a new layer or an ad-hoc iterator that joins
data or makes use of other advanced relational database features, it
would be nice if QGIS could forward such requests to the database. Say I
want to create a symbology based on the amount of waste within
countries, QGIS should be able to execute this request on the database.
In this scenario you could likely want to define this request inside an
expression (e.g. a virtual column).
So basically, I think we need a functionality that is able to work as a
database abstraction layer. Forward wherever possible to the db, use a
(local) fallback implementation where not.
I am not opposed to use SQLite for this. There is no reason for us to
reinvent the wheel, the NIH syndrome should not be our driver. But I see
some limitations and I would rather prefer to have these properly
addressed. And I will be more than happy if you tell me that you
convinced the sqlite virtual table developers that they should extend
their API to expose the parsed SQL structure and allow to tweak it. Then
we have a fallback implementation which is well-tested but we still have
the possibility to fine-tune requests for big databases.
Regards,
Matthias
--
Help getting QGIS to the next level of quality before November 15!
http://blog.vitu.ch/10102014-1046/crowdfunding-initiative-automated-testing
More information about the Qgis-developer
mailing list