[QGIS-Developer] GeoPackage - where are we -where do we go

Matthias Kuhn matthias at opengis.ch
Fri May 8 02:30:24 PDT 2020

Hi list,

I wondered about the state of GeoPackage. Personally, since it was 
introduced to QGIS, and even more since it was selected as the 
default format, I have never grown to trust it fully and completely.

I do not want to trigger an evangelical discussion here. I'd like to see 
where we are and what we can reasonably do to have a default file format 
which can be recommended with no bad feelings.

Here follow a couple of observations over the years, some of them 
properties of the specs I believe:

* The fid requirement

   I sometimes want my features to be identified by UUIDs or other keys. 
fid columns also tend to accumulate when derived datasets are created 
(through processing etc.). If I need a pseudo-stable primary key, there is 
a rowid built into SQLite; we don't need a second one.

   Possible mitigation: alter the OGR implementation, possibly alter the 
standard (required?).
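
   As a rough illustration of the rowid point, here is a minimal sketch 
using Python's sqlite3 module on a plain SQLite table (not a real 
GeoPackage). The table name "features" and its columns are made up for 
the example.

```python
import sqlite3
import uuid

con = sqlite3.connect(":memory:")
# Hypothetical feature table keyed by a UUID instead of an integer fid.
con.execute("CREATE TABLE features (uuid TEXT PRIMARY KEY, name TEXT)")

ids = [str(uuid.uuid4()) for _ in range(3)]
con.executemany(
    "INSERT INTO features (uuid, name) VALUES (?, ?)",
    [(u, f"feature {i}") for i, u in enumerate(ids)],
)

# SQLite still maintains an implicit integer rowid for this table.
rows = con.execute(
    "SELECT rowid, uuid, name FROM features ORDER BY rowid"
).fetchall()
for rowid, feature_uuid, name in rows:
    print(rowid, feature_uuid, name)
```

   Note that an implicit rowid is only "pseudo" stable: unless it is 
aliased by an INTEGER PRIMARY KEY column, SQLite may renumber it, for 
example on VACUUM.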

* The modification on r/o open

   This has caused too much pain with git.

   Possible mitigation: a) switch to journal mode=delete (not an easy 
option because of https://issues.qgis.org/issues/15351), or b) only switch 
to WAL mode when layers are put into edit mode (I have strong doubts 
that this is a safe thing to do).
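
   To make the journal mode discussion concrete, the following sketch 
switches a plain SQLite file between WAL and DELETE mode with Python's 
sqlite3 module and checks for the "-wal" file next to it. The file name 
demo.gpkg and the table are placeholders; this is not what QGIS or OGR 
do internally, only the underlying SQLite behaviour.

```python
import os
import sqlite3
import tempfile


def journal_mode_demo(path):
    """Switch a database to WAL and back, reporting the modes and -wal file."""
    # Autocommit mode, so the PRAGMA statements never run inside a transaction.
    con = sqlite3.connect(path, isolation_level=None)
    try:
        mode_wal = con.execute("PRAGMA journal_mode=WAL").fetchone()[0]
        con.execute("CREATE TABLE t (id INTEGER PRIMARY KEY, name TEXT)")
        con.execute("INSERT INTO t (name) VALUES ('a')")
        # While the connection is open in WAL mode, a -wal file sits next to the db.
        wal_present = os.path.exists(path + "-wal")
        # Switching back needs exclusive access; here we are the only connection.
        mode_delete = con.execute("PRAGMA journal_mode=DELETE").fetchone()[0]
        wal_after = os.path.exists(path + "-wal")
    finally:
        con.close()
    return mode_wal, wal_present, mode_delete, wal_after


with tempfile.TemporaryDirectory() as tmp:
    result = journal_mode_demo(os.path.join(tmp, "demo.gpkg"))
print(result)
```

   The journal mode is stored in the file itself, so a file switched to 
DELETE mode stays that way until something sets it to WAL again.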

* The network share freeze

   Our default file format should play nicely with (Windows) network 
shares. It's clear to everyone that we can't expect concurrent writes. But 
it should "just work" for concurrent reads by many users.

   Possible mitigation: switch to journal mode=delete for network shares 
(we are looking into this).

* The wal file appearing next to the file

   It is confusing to newcomers and looks almost like a sidecar file. I 
would care less if it was put into some system cache folder instead of 
just into my data folder. Or at least if it was a hidden file.

   Possible mitigation: switch to journal mode=delete (not an easy 
option because of https://issues.qgis.org/issues/15351).

* The couple of corrupted files I have received over the years which 
could only be repaired by a command line "dump contents as sql and 
execute into new file"

   I have not found a way to reproduce this. Some of them were produced 
by older QGIS versions, which made it easy to violate foreign key 
constraints and hard to recover. This has been fixed.

   Possible mitigation: offer a "repair" option in QGIS, either through 
processing or "on the fly" upon detection.
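
   A sketch of what detection plus "dump contents as sql and execute 
into new file" could look like with Python's sqlite3 module. The helper 
names check_database and dump_and_reload and the parent/child tables are 
hypothetical; the PRAGMAs and iterdump() are standard SQLite and Python 
features. It runs on plain in-memory tables, not on a real GeoPackage, 
and a badly damaged file may not be dumpable this way at all.

```python
import sqlite3


def check_database(con):
    """Return (integrity_check messages, foreign key violations)."""
    integrity = [row[0] for row in con.execute("PRAGMA integrity_check")]
    fk_violations = con.execute("PRAGMA foreign_key_check").fetchall()
    return integrity, fk_violations


def dump_and_reload(src, dst):
    """Replay the SQL dump of src into the (empty) database dst."""
    dst.executescript("\n".join(src.iterdump()))


# Build a small database that violates a foreign key constraint.
src = sqlite3.connect(":memory:")
src.executescript("""
    PRAGMA foreign_keys=OFF;
    CREATE TABLE parent (id INTEGER PRIMARY KEY);
    CREATE TABLE child (id INTEGER PRIMARY KEY,
                        parent_id INTEGER REFERENCES parent(id));
    INSERT INTO child (id, parent_id) VALUES (1, 42);
""")
src.commit()

integrity, fk_violations = check_database(src)
print(integrity, fk_violations)

# Rebuild the contents in a fresh database.
dst = sqlite3.connect(":memory:")
dst.execute("PRAGMA foreign_keys=OFF")
dump_and_reload(src, dst)
copied = dst.execute("SELECT id, parent_id FROM child").fetchall()
print(copied)
```

   The dump and reload only rebuilds the file structure. Rows that 
violate a foreign key are copied as they are and would still need to be 
fixed separately.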

* Default value magic replaces values on insert (with no possibility to 
pre-evaluate them)

   E.g. a global sequence like in PostgreSQL would be nice. This can be 
worked around through default values in QGIS though.

   Possible mitigation: a) add it as a feature to SQLite. b) use QGIS 
default values. c) live with it.
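
   To illustrate the difference, a small sketch with Python's sqlite3 
module: a database-side default is only known after the insert, while a 
value generated on the client is known beforehand. The table "parcels" 
and its columns are invented for the example, and the client-side part 
merely stands in for what a default value expression in QGIS would do.

```python
import sqlite3
import uuid

con = sqlite3.connect(":memory:")
# Hypothetical table whose "code" column is filled by a database-side default.
con.execute("""
    CREATE TABLE parcels (
        fid  INTEGER PRIMARY KEY,
        code TEXT DEFAULT (lower(hex(randomblob(8)))),
        name TEXT
    )
""")

# Database-side default: the value only exists once the row is inserted,
# so it has to be read back afterwards.
cur = con.execute("INSERT INTO parcels (name) VALUES ('a')")
db_code = con.execute(
    "SELECT code FROM parcels WHERE fid = ?", (cur.lastrowid,)
).fetchone()[0]

# Client-side default: the value is known before the insert and can be
# used right away, e.g. to reference the new row from related rows.
client_code = uuid.uuid4().hex
con.execute("INSERT INTO parcels (code, name) VALUES (?, 'b')", (client_code,))
stored_code = con.execute(
    "SELECT code FROM parcels WHERE name = 'b'"
).fetchone()[0]

print(db_code, client_code, stored_code)
```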

* The requirement for a single geometry column per table

   I just don't see a good reason to forbid multiple geometry columns.

   Possible mitigation: a) alter the standard. b) ignore the standard 
and patch the OGR implementation.

I wonder how others feel about these topics.

- Are there more pain points I forgot to list?

- Do you see more approaches to mitigate these problems?

- Is someone already working on these issues?

It would be great to have a standard file format that we can fully 
trust. Let's do a reality check on whether GeoPackage can be this format.

Best regards

Matthias Kuhn
