[Qgis-developer] Communicating Data-problems/issues to the users

Richard Duivenvoorde rdmailings at duif.net
Tue May 2 00:32:46 PDT 2017


I have been looking into an issue [0] where the GML file was actually
wrong (identical fids for every feature), but QGIS misbehaves as well:
it shows all 15 geometries, but only 1 row in the attribute table.

A lot of problems that pop up on mailing lists or the issue tracker are
actually DATA related. But because the users do not know enough about the
data, its specs or its structure, for them QGIS is always the one to blame...

Given that, I thought it would be nice to be able to run QGIS in a
'stricter' mode, in which QGIS does some more checking/testing of a
dataset before or while opening it.

Some examples I can think of (from recent experiences):

GML (or other OGR data sources)
- check while loading the data if fid's are unique (because if not...
see problem above)
- check with the first 10 geometries loaded if the coordinates are
actually in range of the CRS (else...)
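
A minimal sketch of what these two checks could look like, independent of
QGIS and OGR. The Feature and Extent structs and both function names are
hypothetical and only serve the illustration; a real check would read fids
and coordinates through the provider and take the extent from the CRS
definition.

```cpp
#include <cstddef>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical, simplified feature record; not a QGIS or OGR type.
struct Feature {
    std::string fid;  // feature id as read from the data source
    double x;         // one representative coordinate of the geometry
    double y;
};

// Hypothetical valid extent of a CRS, e.g. lon/lat bounds for EPSG:4326.
struct Extent {
    double xMin, yMin, xMax, yMax;
};

// Returns every fid that occurs more than once, each reported a single time.
std::vector<std::string> duplicateFids(const std::vector<Feature>& features) {
    std::unordered_map<std::string, int> counts;
    std::vector<std::string> duplicates;
    for (const Feature& f : features) {
        // Record a fid when its second occurrence is seen, and only then.
        if (++counts[f.fid] == 2) {
            duplicates.push_back(f.fid);
        }
    }
    return duplicates;
}

// Checks only the first `sampleSize` features, so the cost stays bounded.
bool sampleWithinExtent(const std::vector<Feature>& features,
                        const Extent& extent, std::size_t sampleSize = 10) {
    const std::size_t n =
        features.size() < sampleSize ? features.size() : sampleSize;
    for (std::size_t i = 0; i < n; ++i) {
        const Feature& f = features[i];
        if (f.x < extent.xMin || f.x > extent.xMax ||
            f.y < extent.yMin || f.y > extent.yMax) {
            return false;
        }
    }
    return true;
}
```

For a file like the one in [0], duplicateFids() would return the single
repeated fid, which is enough to warn the user before the attribute table
collapses to one row.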

- make sure there is a primary key on tables (maybe even do a scan on
all tables and check if they have those)

- check if the table has at least one geom column registered in the
- the srs in the geometry is the same as in the metadata (else errors
when doing spatial things IN the db)

- see [1], an issue where people feed MultiPoint (-20 -90, -20 -88 )
while it should be MultiPoint ((-20 -90),(-20 -88))
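
A strict mode could flag the first form with a very shallow syntax check.
The sketch below is not a WKT parser, and the function name is made up for
this example; it only looks at whether the first point of a MULTIPOINT is
wrapped in its own parentheses.

```cpp
#include <cctype>
#include <string>

// Returns true when `wkt` starts with the MULTIPOINT keyword and its first
// point is wrapped in its own parentheses, as in
// "MultiPoint ((-20 -90),(-20 -88))".
// Returns false for the flat form, for other geometry types, and for
// anything without an opening parenthesis (including "MULTIPOINT EMPTY").
bool multiPointUsesNestedParens(const std::string& wkt) {
    const std::string keyword = "MULTIPOINT";
    std::string::size_type i = 0;

    // Skip leading whitespace.
    while (i < wkt.size() && std::isspace(static_cast<unsigned char>(wkt[i]))) {
        ++i;
    }

    // Match the keyword case-insensitively.
    for (char k : keyword) {
        if (i >= wkt.size() ||
            std::toupper(static_cast<unsigned char>(wkt[i])) != k) {
            return false;
        }
        ++i;
    }

    // Find the parenthesis that opens the point list.
    const std::string::size_type open = wkt.find('(', i);
    if (open == std::string::npos) {
        return false;
    }

    // The next non-space character must open the first point.
    std::string::size_type j = open + 1;
    while (j < wkt.size() && std::isspace(static_cast<unsigned char>(wkt[j]))) {
        ++j;
    }
    return j < wkt.size() && wkt[j] == '(';
}
```

With the two strings from [1], the flat form returns false and the nested
form returns true, so the user can be told what is wrong instead of getting
a silently accepted geometry.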

- where I see:
//TODO - add sanity check for shape file layers, to include checking to
//       see if the .shp, .dbf, .shx files are all present and the layer
//       actually has features
bool QgsOgrProvider::isValid() const
{
  return mValid;
}
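
The file-presence part of that TODO could be sketched roughly as below.
This is standalone C++ and not the provider code; missingShapefileParts is
a hypothetical helper, and it only tries lowercase extensions, so an
upper-case .SHX on a case-sensitive file system would be reported as
missing.

```cpp
#include <fstream>
#include <string>
#include <vector>

// Returns the extensions (".shp", ".shx", ".dbf") whose files cannot be
// opened next to `shpPath`. An empty result means all three parts could be
// opened for reading.
std::vector<std::string> missingShapefileParts(const std::string& shpPath) {
    // Strip the extension, if any, to get the common base name. A dot that
    // appears before the last path separator belongs to a directory name.
    std::string base = shpPath;
    const std::string::size_type dot = base.find_last_of('.');
    const std::string::size_type slash = base.find_last_of("/\\");
    if (dot != std::string::npos &&
        (slash == std::string::npos || dot > slash)) {
        base.erase(dot);
    }

    const std::vector<std::string> extensions = {".shp", ".shx", ".dbf"};
    std::vector<std::string> missing;
    for (const std::string& ext : extensions) {
        // Opening for reading is used as a simple "exists and is readable" test.
        std::ifstream file(base + ext, std::ios::binary);
        if (!file.good()) {
            missing.push_back(ext);
        }
    }
    return missing;
}
```

A provider could then report something like "layer.shx is missing" to the
user instead of only setting a validity flag.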

etc etc I think everybody can come up with such tests.

Discussing this with Matthias, we realized that there CAN be a
performance penalty for these 'premature' tests, but I think both for QGIS
and the user it is best when the user is informed about 'data problems'...

Main points I want to make:

- QGIS should either inform the user as clearly as possible about bad
data structures (throw Exceptions?), or be able to do such tests on
demand. Not sure if too much forgiveness helps QGIS (as in [1])

- thinking about the 'one button that loads them all' discussion in
Essen (see [2]), we should maybe think about a 'Data Management Dialog' in
which you can both Load, Create AND TEST Datasources?

Maybe a grant for adding such tests? To make QGIS more 'data-robust'?

Regards and sorry for my long emails but it sometimes bugs me that QGIS
is so full of features but silently chokes or doesn't work because of
data-related problems,

Richard Duivenvoorde

[0] https://issues.qgis.org/issues/16480
[1] https://issues.qgis.org/issues/16483
