[Qgis-developer] Geospackage Slow in QGIS

a.furieri at lqt.it a.furieri at lqt.it
Sun May 18 02:38:35 PDT 2014


Hi all,

just a cumulative reply to this thread:

> On Saturday 17 May 2014 10:46:39, Stefan Keller wrote:
> Looks strange to me why a single file check would be the bottleneck,
> since this should be a single call.
>
> I would have expected that some repeated calls (like reading
> objects/records and checking SRID or data types) would be the
> bottleneck, since SQLite is not performant when used with PRAGMA
> checks. This would be a check one could disable with a parameter.
>

Stefan,

it's indisputably true that this is a single call; however, invoking
this PRAGMA implies an impressively heavy workload.
According to SQLite's own documentation:

"This pragma does an integrity check of the entire database.
  The integrity_check pragma looks for out-of-order records,
  missing pages, malformed records, and corrupt indices."

In short: the SQLite engine is forced to read and check the
whole db-file from start to end; consequently the expected
execution time will be approximately proportional to the
db-file size.
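
For anyone wishing to measure this on their own data, here is a
minimal sketch using Python's standard sqlite3 module. The function
name timed_integrity_check is just an illustrative choice of mine,
and ":memory:" is only a placeholder for the path of a real db-file.

```python
import sqlite3
import time


def timed_integrity_check(db_path):
    """Run PRAGMA integrity_check and return (messages, elapsed_seconds)."""
    conn = sqlite3.connect(db_path)
    try:
        start = time.perf_counter()
        # SQLite returns a single row containing 'ok' when no problem is
        # found, otherwise one row per detected problem.
        rows = conn.execute("PRAGMA integrity_check").fetchall()
        elapsed = time.perf_counter() - start
    finally:
        conn.close()
    return [row[0] for row in rows], elapsed


if __name__ == "__main__":
    # ":memory:" is a placeholder; substitute the path of a real db-file.
    messages, seconds = timed_integrity_check(":memory:")
    print(messages, f"{seconds:.3f} s")
```

Running it against db-files of different sizes should make the
relation between file size and execution time directly visible.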


On Sat, 17 May 2014 12:48:57 +0200, Even Rouault wrote:
> I've just noticed that thread. Actually I found that integrity_check
> was slow when operating on remote databases with /vsicurl/ and didn't
> verify how slow it could be with local big files
>

Even,

I've tested a SpatiaLite DB containing about ten million
polygons (3+ GB), and this PRAGMA actually required 15 to 20
minutes to complete.
This definitely confirms that the execution time increases
proportionally to the db-file size.


> So I've disabled the check by default.
>

That seems to be the only practical option.
Executing this "PRAGMA integrity_check" is really justified
only after noticing some puzzling and abnormal behaviour
leading to the reasonable suspicion that some nasty data
corruption has affected the db-file.
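
A minimal sketch of such an opt-in check, again in plain Python with
the standard sqlite3 module. The helper name open_db and its
check_integrity parameter are hypothetical names of mine; this is
not the actual GDAL or QGIS code.

```python
import sqlite3


def open_db(db_path, check_integrity=False):
    """Open an SQLite db-file; run the full integrity check only on request."""
    conn = sqlite3.connect(db_path)
    if check_integrity:
        # Expensive: SQLite has to scan the whole file. Worth requesting
        # only when some data corruption is suspected.
        problems = [row[0] for row in conn.execute("PRAGMA integrity_check")]
        if problems != ["ok"]:
            conn.close()
            raise sqlite3.DatabaseError(
                "integrity_check failed: " + "; ".join(problems)
            )
    return conn
```

With the default (check disabled) opening the db-file stays cheap
whatever its size; the slow path is taken only when explicitly
requested.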


On Sun, 18 May 2014 08:19:00 +1200, Jeremy Palmer wrote:
> I'm wondering why QGIS needs a native SpatiaLite provider and
> connection dialogue any more. Unlike other database providers such as
> PostGIS, Oracle, MSSQL the SpatiaLite provider doesn't seem to have
> anything special that requires a QGIS provider. From a user's
> perspective it just adds to the confusion and complexity for adding
> data. Why not just use GDAL/OGR for both GeoPackage and SpatiaLite?
>

Jeremy,

I fully agree with you; this seems to be a very nice idea.
And please note: not so much from a user perspective as
from a developer/maintainer perspective.

- QGIS mainly uses SpatiaLite as a generic data repository,
   and makes very little use of the many advanced features
   supported by SpatiaLite (Spatial SQL extensions supporting
   full Spatial Analysis, routing facility, direct access
   to external shapefiles, CSV files, DXF files, XLS spreadsheets,
   WFS servers, direct support for ISO Metadata and SLD/SE styles
   and so on).
   Basically QGIS just requires bare, minimal INSERT/UPDATE/DELETE
   support, plus an optimized access strategy based on the
   Spatial Index when available; no more than this.

- developing and maintaining a specific QGIS/spatialite data
   provider surely was a required step several years ago, when
   spatialite was not yet widely available; and at the time
   the "with-internal-spatialite" option effectively made it
   easy to deploy QGIS and SpatiaLite painlessly on both Linux
   and Windows.

- anyway, many things have changed over the years; spatialite
   is now widely supported by many mainstream distributions,
   and the initial "naive" approach based on internal static
   linkage has meanwhile revealed all its intrinsic
   limitations; the scenario has changed significantly since
   then.

- on the other hand it's indisputably true that GDAL now
   supports a first-class SpatiaLite driver, including full
   Spatial Index support (the most critical feature from
   QGIS's own perspective).
   From a developer/maintainer perspective, updating and
   possibly patching a GDAL/OGR driver is surely simpler
   and easier than updating the corresponding QGIS data
   provider (first because GDAL has no frills connected
   to GUI elements such as dialog boxes and the like, and
   second because GDAL has rock-stable APIs reasonably
   immune from sudden changes).

- all this considered, completely removing the specialized
   QGIS/spatialite data provider so as to perform a full switch
   to the GDAL driver could represent a radical but very
   effective solution.
   I easily foresee the following possible advantages:
     * strong simplification / elimination of any component
       that is not strictly indispensable.
     * easier maintenance / faster version update cycle
     * enhanced stability / more robustness
     * better overall platform consistency (many further
       components such as MapServer, or maybe Python plugins
       as well, would then use the very same GDAL driver).
   And I'm unable to foresee any possible disadvantage, because
   the GDAL driver will effectively support anything required
   by QGIS in order to read/write spatial features from a
   SpatiaLite db-file.
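
To make the first point above more concrete, the access pattern QGIS
needs (bare INSERT/UPDATE/DELETE plus a window query through a
Spatial Index) can be roughly sketched as follows. This is only an
illustration in plain Python: it uses neither SpatiaLite nor GDAL,
it assumes the local SQLite build includes the R*Tree module, it
reduces geometries to their bounding boxes, and the table, column
and function names are hypothetical ones of mine, loosely modelled
on an R*Tree-backed Spatial Index.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical feature table; geometries are reduced to bounding boxes.
conn.execute("CREATE TABLE parcels (id INTEGER PRIMARY KEY, name TEXT)")
# R*Tree virtual table playing the role of the Spatial Index.
conn.execute(
    "CREATE VIRTUAL TABLE idx_parcels USING rtree(pkid, xmin, xmax, ymin, ymax)"
)


def insert_feature(fid, name, xmin, ymin, xmax, ymax):
    """Insert a feature and register its bounding box in the index."""
    conn.execute("INSERT INTO parcels (id, name) VALUES (?, ?)", (fid, name))
    conn.execute(
        "INSERT INTO idx_parcels VALUES (?, ?, ?, ?, ?)",
        (fid, xmin, xmax, ymin, ymax),
    )


def rename_feature(fid, name):
    """Update an attribute; the index is untouched since the bbox is unchanged."""
    conn.execute("UPDATE parcels SET name = ? WHERE id = ?", (name, fid))


def delete_feature(fid):
    """Delete a feature together with its index entry."""
    conn.execute("DELETE FROM parcels WHERE id = ?", (fid,))
    conn.execute("DELETE FROM idx_parcels WHERE pkid = ?", (fid,))


def features_in_window(xmin, ymin, xmax, ymax):
    """Return names of features whose bounding box intersects the window."""
    rows = conn.execute(
        "SELECT p.name FROM parcels AS p "
        "JOIN idx_parcels AS i ON i.pkid = p.id "
        "WHERE i.xmax >= ? AND i.xmin <= ? AND i.ymax >= ? AND i.ymin <= ? "
        "ORDER BY p.id",
        (xmin, xmax, ymin, ymax),
    )
    return [row[0] for row in rows]


insert_feature(1, "a", 0, 0, 10, 10)
insert_feature(2, "b", 100, 100, 110, 110)
insert_feature(3, "c", 5, 5, 15, 15)
rename_feature(2, "b-renamed")
delete_feature(3)
print(features_in_window(0, 0, 20, 20))
```

Nothing in this pattern depends on a GUI or on the more advanced
SpatiaLite features, which is why a generic driver can cover it.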

bye Sandro


