[QGIS-Developer] Tackling Postgis layers connection recover

Timothé Perez timothe at pragma-innovation.fr
Mon Feb 4 11:35:13 PST 2019


Hello qgis-devs,

This is my first message here and first contribution to QGIS, so I thank you in advance for being indulgent and do not hesitate to correct me on any mistake I will make.
I didn't want to jump right into proposing my patch "as is" in a PR as I'm new to this project, so first I wanted to discuss about it to make sure I got it right.

Here's a recap of what I have found so far (sorry if it's a bit long):

I'm using QGIS 3.4 with a PostgreSQL database to store layers and I'm facing the same issue as described in https://issues.qgis.org/issues/20170 : unrecoverable PostgreSQL connections.

I have cloned the repo and started to dig, as it is really annoying because it forces you to abandon your changes and close and reopen your project.

To reproduce the problem, the simplest way is to spin a local PostgreSQL database with postgis and create a table with just a serial and a geometry:
 CREATE TABLE foo (id serial primary key, geometry GEOMETRY(POINT, 4326)); 

Open it in QGIS, create several features, save them then simply restart the PostgreSQL service so that all connections are forced to be closed.
QGIS logs will display that the connections to PostgreSQL were lost but recovered and features will still be accessible.

However if I start editing the layer by adding a feature and then call save, it will fail:

2019-02-04T19:11:30     CRITICAL    Layer foo : PostGIS error while adding features: FATAL: terminating connection due to administrator command
             la connexion au serveur a été coupée de façon inattendue
              Le serveur s'est peut-être arrêté anormalement avant ou durant le
              traitement de la requête.
        
2019-02-04T19:11:30     WARNING    Commit errors : Could not commit changes to layer foo
2019-02-04T19:11:34     CRITICAL    Layer foo : PostGIS error while adding features: no result buffer
2019-02-04T19:11:34     WARNING    Commit errors : Could not commit changes to layer foo

My only option is to cancel my edits and reload the project to regain full access to the db.

So in fact this problem has 2 causes: (time to dig in the C++ code) 

First:

The QgsPostgresProvider::connectionRO() method does not offer a proper way to reset the connection when the connection was lost; therefore every SELECT after a connection was lost will  cause the Q_ASSERT to fail in  src/providers/postgres/qgspostgresconn.cpp, line 86 and will promptly crash my debug build of QGIS.

Let's take for example QgsPostgresProvider::featureCount(): it checks connectionRO in case it is null, but doesn't check if the connection is still valid and will call connectionRO()->PQexec( sql ) in any case. Now PQexec is indeed checking the connection status, but will not try to reset the connection and will simply return a nullptr, which is then passed to QgsPostgresResult::PQgetvalue and will cause the assert to fail. 

Note that PQexecNR is properly handling checks on the status and will call PQreset, resulting in a successful re-connection and request to Postgresql ( that's why other connections are not affected and can recover) but PQexecNR is never called on ConnectionRO, so the connection remains staled.

I see 2 paths to fix this issue:

- Implement a check in  QgsPostgresProvider::connectionRO()  that will call a new QgsPostgresConn::PQreset method  which in turn call libpq PQreset on its mConn attribute, resulting in a proper connection reset. As ConnectionRO is const, we can't simply create a new connection so we can only reset the existing one. 

- Implement the same logic as in PQexecNR to handle reset in case of staled connection, ie add a retry flag and a call to libpq PQreset.

I quickly tried the first approach and it fixed the issue; however this does not provide an automatic retry mechanism on the first failure, so maybe the second one is better.


Now the second cause:

I couldn't understand why QGIS was emitting a SELECT statement before an INSERT, and the answer is simple:

2019-02-04T19:11:34     WARNING    Connection error: SELECT nextval('foo_id_seq'::regclass) returned 1 [FATAL: terminating connection due to administrator command

Right before the INSERT query ( which would otherwise work), QGIS makes a SELECT to get the default value for the primary key field.

The culprit here is a bad condition in the optimization in  QgsPostgresProvider::addFeatures when it checks for a PK field which is using a sequence: it only checks if the value is not null, but as a primary key in not null by definition, QGIS automatically sets the not null constraint and default to the SQL default of nextval('foo_id_seq'::regclass). So in fact the optimization is never used and a SELECT is issued. Adding a check to ensure value does not start with next_val  fixes the issue and the optimization is used.


I am willing to propose a PR if my fixes make sense and are acceptable, this will fix an annoying issue.

Thanks in advance for your feedbacks,
-- 
Timothé




More information about the QGIS-Developer mailing list