[QGIS-Developer] GPKG and FID -- can we fix this mess?
Even Rouault
even.rouault at spatialys.com
Tue Oct 13 16:54:35 PDT 2020
> Well -- here's an example file:
> https://github.com/qgis/QGIS/blob/master/python/plugins/processing/tests/testdata/dissolve_polys.gml
>
> Not sure how that file was created in the first place, but I've seen
> many like it!
Ah ok, this is a GML2 file, which does indeed report a 'fid' field (coming
from the XML 'fid' attribute). I was testing with GML3, which instead reports
a 'gml_id' field (coming from the XML 'gml:id' attribute).
That issue seems independent from the others. Nothing in the GeoPackage spec
mandates that the unique identifier should be called 'fid'. This is just the
default value proposed by the driver, and it is tunable. Maybe changing it to
'ogr_fid' or 'ogc_fid', as in other database-based drivers, would make such
collisions less likely? There might be some subtle impacts if users of
GeoPackage or scripts expect the fid column to be called 'fid' and not
something else.
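To illustrate the naming point, here is a minimal sketch using plain SQLite
(which GeoPackage is built on) rather than GDAL itself. The table name, the
'ogc_fid' column name and the sample values are assumptions for illustration,
and this is an ordinary SQLite table, not a valid GeoPackage. It only shows
that the integer primary key can carry any name, leaving 'fid' free for an
attribute coming from the source data:

```python
import sqlite3


def create_table_with_fid_column(conn, table, fid_column):
    """Create a table whose integer primary key has a caller-chosen name.

    Hypothetical helper: in SQLite an INTEGER PRIMARY KEY column is an
    alias of the internal rowid, whatever the column is called.
    """
    conn.execute(
        f'CREATE TABLE "{table}" ('
        f'"{fid_column}" INTEGER PRIMARY KEY AUTOINCREMENT, '
        "fid TEXT, "  # attribute carried over from the source file
        "name TEXT)"
    )


conn = sqlite3.connect(":memory:")
create_table_with_fid_column(conn, "dissolve_polys", "ogc_fid")

# The source 'fid' attribute is stored as ordinary data; the key is generated.
conn.execute(
    "INSERT INTO dissolve_polys (fid, name) VALUES (?, ?)",
    ("polys.1", "a"),
)
row = conn.execute(
    "SELECT ogc_fid, rowid, fid FROM dissolve_polys"
).fetchone()
print(row)
```

With a key column named 'ogc_fid', the imported 'fid' attribute no longer
clashes with the identifier column.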
> Yes, ideally. But at this stage we can't completely hide the fid field
> without breaking existing QGIS projects. Breaking scripts is bad, but
> breaking projects is a complete no-go!
Good point...
> QgsFeature::id() isn't intended to be even semi-permanent. Just
> "mostly constant for the duration of a single data provider's
> lifetime" (i.e. a QGIS session).
Interesting. But yes indeed, I can think of the QGIS WFS provider, which also
returns QgsFeature::id() values that aren't stable (reloading the layer is
probably sufficient to cause them to change).
> Because, for fid at least, it's just an "internal detail" that we're
> showing. To use the postgres analogy we don't manage internal record
> identifiers with the returned features, just the actual exposed
> columns themselves and leave the rest to the backend. And here I think
> the backend (GDAL, or QGIS' OGR provider) should manage fids
> transparently from the client (the QgsVectorLayer).
Hmm, this is where things get really interesting. The PostgreSQL provider can
expose an integer primary key as a regular column. So the sole fact of
exposing an integer primary key as a regular column is not necessarily a
recipe for disaster. There must be some important difference(s) that we must
identify before jumping to conclusions.
One of the differences I can see is that the PostgreSQL provider allows
modifying the value of the primary key column, which isn't supported by
OGR_L_SetFeature() (since OGR itself doesn't return the GPKG fid column as an
OGR field, this is a QGIS-only behaviour). So currently
QgsOGRProvider::changeAttributeValues() will error out if you try to change
the content of the 'fid' column to a value different from QgsFeature::id().
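As a rough model of that restriction, the sketch below is plain Python and not
QGIS or OGR code. The function name, the shape of the changes dictionary and
the assumption that the 'fid' column sits at field index 0 are all
hypothetical. It only illustrates the rule that an edit touching the 'fid'
column is acceptable when the new value equals the feature id, and rejected
otherwise:

```python
def find_rejected_fid_changes(changes, fid_field_index=0):
    """Return the feature ids whose edit would change the 'fid' column.

    changes maps a feature id to {field index: new value}.
    Hypothetical model of the check described above, not the QGIS code.
    """
    rejected = []
    for feature_id, attrs in changes.items():
        # Writing the same value back is harmless; a different one is not.
        if fid_field_index in attrs and attrs[fid_field_index] != feature_id:
            rejected.append(feature_id)
    return rejected


changes = {
    1: {0: 1, 2: "renamed"},  # fid rewritten with its own value: accepted
    2: {0: 99},               # fid changed to another value: rejected
    3: {2: "x"},              # fid untouched: accepted
}
rejected = find_rejected_fid_changes(changes)
print(rejected)
```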
I see the Postgres provider has some logic to maintain a feature id map in
some cases (when the primary key cannot be mapped directly to a QGIS FID, if I
understand correctly), but for the GPKG case we aren't in that situation.
So, apart from the situation of changing the FID of an existing feature, at
first sight there doesn't seem to be that much difference between the GPKG
situation and Postgres with an integer primary key that maps directly to a
QGIS fid. Am I missing something (it wouldn't be surprising, I only skimmed
quickly over the code), or is it just that difference that causes the mess?
--
Spatialys - Geospatial professional services
http://www.spatialys.com