[gdal-dev] GeoPackage Open() Performance

Even Rouault even.rouault at spatialys.com
Thu Mar 22 02:21:06 PDT 2018


Kevin,

> I've been working on some large GeoPackage files that have a lot of layers.
> One in particular has over 5000 layers. It was taking a good 5+ minutes
> just to open, so I did some analysis. Here's what I found:
> 
> By far the biggest hit is in a query to sqlite_master (called
> from OGRGeoPackageTableLayer::ReadTableDefinition()). This is the
> particular query (in my dataset, this query generally takes about 90ms):
> 
> "SELECT type FROM sqlite_master WHERE lower(name) = lower('%q') AND type "
>             "IN ('view', 'table')"
> 
> The calls to lower() mean it has to do a full table scan, but there isn't an
> index on sqlite_master, so just removing it doesn't change anything right
> away (it's still a table scan). I was able to optimize this by creating a
> temp table from sqlite_master in GDALGeoPackageDataset::Open(), updating it
> so name is all lower case, then creating an index on name. The one-time
> cost of this process was about 100ms, but when I query it instead of
> sqlite_master, each query drops to <1ms. With 5000+ layers, this
> cut several minutes off the time to open a dataset.
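
For readers of the archive, the temp-table approach described above could be
sketched roughly as follows. This is only an illustration using Python's
sqlite3 module. It is neither the GDAL code nor the fix that went into trunk,
and the names master_lc, idx_master_lc_name, build_name_index and lookup_type
are made up for the example.

```python
import sqlite3


def build_name_index(conn):
    """Snapshot sqlite_master into a temp table with lower-cased names.

    Hypothetical helper: pays the full scan of sqlite_master once, so that
    later per-layer lookups can use an index instead of scanning again.
    """
    conn.execute(
        "CREATE TEMP TABLE master_lc AS "
        "SELECT lower(name) AS name, type FROM sqlite_master "
        "WHERE type IN ('view', 'table')"
    )
    # The index lands in the temp schema, next to the temp table.
    conn.execute("CREATE INDEX idx_master_lc_name ON master_lc(name)")


def lookup_type(conn, name):
    """Return 'table' or 'view' for name (ASCII case-insensitive), else None."""
    row = conn.execute(
        "SELECT type FROM master_lc WHERE name = lower(?)", (name,)
    ).fetchone()
    return row[0] if row else None


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE Roads (id INTEGER PRIMARY KEY)")
    conn.execute("CREATE VIEW v_Lakes AS SELECT id FROM Roads")
    build_name_index(conn)
    print(lookup_type(conn, "ROADS"), lookup_type(conn, "v_lakes"))
```

The temp table is a snapshot, so tables or views created after it is built
will not show up unless it is rebuilt.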

You should retry with GDAL trunk. I ran into this same issue a few weeks or
months ago, and it should now be fixed by
https://trac.osgeo.org/gdal/changeset/40474
(please use the latest trunk rather than this patch alone, as I recall that
later fixes were needed)

The reason for this lower-case comparison, if I remember correctly, was to add
compatibility with GeoPackage databases created by ArcMap, where case was
used inconsistently.
See https://trac.osgeo.org/gdal/ticket/6916
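
To illustrate why the comparison folds case, here is a small sketch with
Python's sqlite3 module. It assumes the inconsistency is between the name a
table was created with and the name it is looked up by; the table name
Parcels, the lookup name PARCELS and the helper find_type are invented for
the example and are not taken from the ticket.

```python
import sqlite3


def find_type(conn, name, fold_case):
    """Look up name in sqlite_master, with or without case folding."""
    if fold_case:
        sql = ("SELECT type FROM sqlite_master "
               "WHERE lower(name) = lower(?) AND type IN ('view', 'table')")
    else:
        sql = ("SELECT type FROM sqlite_master "
               "WHERE name = ? AND type IN ('view', 'table')")
    row = conn.execute(sql, (name,)).fetchone()
    return row[0] if row else None


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    # Table created with one casing...
    conn.execute("CREATE TABLE Parcels (id INTEGER PRIMARY KEY)")
    # ...but looked up with another (hypothetical mismatch).
    print(find_type(conn, "PARCELS", fold_case=False))
    print(find_type(conn, "PARCELS", fold_case=True))
```

The exact-match query misses the table because the name column of
sqlite_master is compared case-sensitively by default, while the lower()
variant finds it at the cost of a full scan.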

Even

-- 
Spatialys - Geospatial professional services
http://www.spatialys.com

