[QGIS-Developer] GPKG and FID -- can we fix this mess?

Even Rouault even.rouault at spatialys.com
Tue Oct 13 16:54:35 PDT 2020


> Well -- here's an example file:
> https://github.com/qgis/QGIS/blob/master/python/plugins/processing/tests/testdata/dissolve_polys.gml
> 
> Not sure how that file was created in the first place, but I've seen
> many like it!

Ah ok, this is a GML2 file, which indeed reports a 'fid' field (coming from 
the XML 'id' attribute). I was testing with GML3, which instead reports a 
'gml_id' field (coming from the XML 'gml:id' attribute).

That issue seems independent from the others. Nothing in the GeoPackage spec 
mandates that the unique identifier should be called 'fid'. This is just the 
default value proposed by the driver, and it is tunable. Maybe changing it to 
'ogr_fid' or 'ogc_fid', as in other database-based drivers, would make such 
collisions less likely? There might be some subtle impacts if users of 
GeoPackage or scripts expect the fid column to be called 'fid' and not 
something else.
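
To illustrate the naming collision, here is a minimal sketch using plain 
sqlite3 rather than the GPKG driver. The helper create_layer_table() and the 
table names are made up for the example, and the tables are not valid 
GeoPackage layers (no gpkg_contents, no geometry). With the real driver, the 
name is set through the FID layer creation option, as far as I know.

```python
import sqlite3


def create_layer_table(conn, table, fid_column="fid", attribute_columns=()):
    """Create a table whose integer primary key is named `fid_column`.

    Hypothetical helper: it only mimics the naming aspect of a layer table.
    """
    # SQLite column names are case-insensitive, so compare in lower case.
    if fid_column.lower() in (name.lower() for name in attribute_columns):
        raise ValueError(
            f"attribute column collides with primary key '{fid_column}'"
        )
    columns = [f'"{fid_column}" INTEGER PRIMARY KEY AUTOINCREMENT']
    columns += [f'"{name}" TEXT' for name in attribute_columns]
    conn.execute(f'CREATE TABLE "{table}" ({", ".join(columns)})')


conn = sqlite3.connect(":memory:")

# A GML2 source exposes an attribute called 'fid': it collides with the
# default primary key name.
try:
    create_layer_table(conn, "polys_default", "fid", ["fid", "name"])
    collided = False
except ValueError:
    collided = True

# Naming the primary key 'ogc_fid' leaves room for the source 'fid' attribute.
create_layer_table(conn, "polys_renamed", "ogc_fid", ["fid", "name"])
conn.execute(
    'INSERT INTO polys_renamed ("fid", "name") VALUES (?, ?)',
    ("polys.10", "a"),
)
row = conn.execute('SELECT "ogc_fid", "fid" FROM polys_renamed').fetchone()
```

Here the source 'fid' value is kept as a plain text attribute, while the 
integer key lives in its own column.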

> Yes, ideally. But at this stage we can't completely hide the fid field
> without breaking existing QGIS projects. Breaking scripts is bad, but
> breaking projects is a complete no-go!

Good point...

> QgsFeature::id() isn't intended to be even semi-permanent. Just
> "mostly constant for the duration of a single data provider's
> lifetime" (i.e. a QGIS session).

Interesting. But yes indeed, I can think of the QGIS WFS provider, which 
also returns QgsFeature::id() values that aren't stable (reloading the layer 
is probably sufficient to cause them to change).

> Because, for fid at least, it's just an "internal detail" that we're
> showing. To use the postgres analogy we don't manage internal record
> identifiers with the returned features, just the actual exposed
> columns themselves and leave the rest to the backend. And here I think
> the backend (GDAL, or QGIS' OGR provider) should manage fids
> transparently from the client (the QgsVectorLayer).

Hmm, this is where things get really interesting. The PostgreSQL provider can 
expose an integer primary key as a regular column. So the sole fact of exposing 
an integer primary key as a regular column is not necessarily a recipe for 
disaster. There must be some important difference(s) that we must identify 
before jumping to conclusions.

One of the differences I can see is that the PostgreSQL provider allows 
modifying the value of the primary key column. That isn't supported by 
OGR_L_SetFeature() (since OGR itself doesn't return the GPKG fid column as an 
OGR field, this is a QGIS-only behaviour). So currently 
QgsOGRProvider::changeAttributeValues() will error out if you try to change 
the content of the 'fid' column to a value different from QgsFeature::id().
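
The behaviour described above can be sketched as follows. This is not the 
actual QgsOGRProvider code, only a rough illustration: the function 
filter_attribute_changes() and its arguments are hypothetical.

```python
def filter_attribute_changes(feature_id, field_names, new_values,
                             fid_column="fid"):
    """Return the attribute changes that can be written to the backend.

    `new_values` maps a field index to its new value. A change to the
    exposed fid column is rejected unless it equals the feature id, because
    the backend cannot renumber an existing feature.
    """
    accepted = {}
    for index, value in new_values.items():
        if field_names[index] == fid_column:
            if value != feature_id:
                raise ValueError(
                    f"cannot change '{fid_column}' from {feature_id} to {value}"
                )
            # Same value as the feature id: nothing to write.
            continue
        accepted[index] = value
    return accepted


fields = ["fid", "name"]

# Editing a regular attribute (and leaving fid equal to the id) is fine.
ok_changes = filter_attribute_changes(5, fields, {0: 5, 1: "renamed"})

# Trying to give the feature another fid is refused.
try:
    filter_attribute_changes(5, fields, {0: 6})
    rejected = False
except ValueError:
    rejected = True
```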

I see the Postgres provider has some logic to maintain a feature id map in 
some cases (when the primary key cannot be mapped directly to a QGIS FID, if I 
understand correctly), but for the GPKG case we aren't in that situation.
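
As a rough idea of what such a feature id map does (a sketch under my own 
assumptions, not the Postgres provider's implementation; the class and 
function names are invented), an integer key can be used directly while any 
other key gets a session-local id:

```python
class FeatureIdMap:
    """Assigns session-local integer ids to arbitrary primary key values."""

    def __init__(self):
        self._key_to_fid = {}
        self._fid_to_key = {}
        self._next_fid = 1

    def fid_for_key(self, key):
        """Return the id already assigned to `key`, or assign a new one."""
        key = tuple(key)
        fid = self._key_to_fid.get(key)
        if fid is None:
            fid = self._next_fid
            self._next_fid += 1
            self._key_to_fid[key] = fid
            self._fid_to_key[fid] = key
        return fid

    def key_for_fid(self, fid):
        """Return the primary key values behind a previously assigned id."""
        return self._fid_to_key[fid]


def fid_for_row(pk_value, fid_map):
    """Use an integer key directly; otherwise go through the map.

    A real layer would pick one strategy for all its rows; both are shown
    side by side here only for comparison.
    """
    if isinstance(pk_value, int) and not isinstance(pk_value, bool):
        return pk_value
    return fid_map.fid_for_key((pk_value,))


fid_map = FeatureIdMap()
direct = fid_for_row(42, fid_map)          # integer key, used as is
first = fid_for_row("abc", fid_map)        # text key, gets a new id
again = fid_for_row("abc", fid_map)        # same key, same id
second = fid_for_row("def", fid_map)       # another key, next id
```

Ids handed out by the map only last as long as the map itself, which fits 
the "mostly constant for the duration of a session" description above.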

So, apart from the situation of changing the FID of an existing feature, at 
first sight there doesn't seem to be that much difference between the GPKG 
situation and Postgres with an integer primary key that maps directly to a QGIS 
fid. Am I missing something (that wouldn't be surprising, as I only skimmed 
quickly over the code), or is it just that difference that causes the mess?

-- 
Spatialys - Geospatial professional services
http://www.spatialys.com

