[gdal-dev] GeoPackage Open() Performance
Kevin Bentley
kbentley at cognitics.net
Tue Mar 20 12:14:39 PDT 2018
I've been working on some large GeoPackage files that have a lot of layers.
One in particular has over 5000 layers. It was taking a good 5+ minutes
just to open, so I did some analysis. Here's what I found:
By far the biggest hit is in a query to sqlite_master (called
from OGRGeoPackageTableLayer::ReadTableDefinition()). This is the
particular query (in my dataset, this query generally takes about 90ms):
"SELECT type FROM sqlite_master WHERE lower(name) = lower('%q') AND type "
"IN ('view', 'table')"
The calls to lower means it has to do a full table scan, but there isn't an
index on sqlite_master, so just removing it doesn't change anything right
away (it's still a table scan). I was able to optimize this by creating a
temp table from sqlite_master in GDALGeoPackageDataset::Open(), updating it
so name is all lower case, then creating an index on name. The one time
cost of this process was about 100ms, but when I query instead of
sqlite_master, it shaved down each query to <1ms. With 5000+ layers, this
cut several minutes off time to open a dataset.
I have a patch for this, but I wasn't sure who to submit it to. There are
also several other places in GDAL where queries to sqlite_master can be
optimized, so I wanted to talk about it before I waste time refactoring
other code. Is this a patch the maintainers would be interested in
integrating?
--
Kevin Bentley
Cognitics, Inc.
947 E Winding Creek Drive #200
Eagle, ID 83616-7231
(208) 904-3780 (Office)
(208) 890-4472 (Cell)
(866) 922-2037 (Fax)
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.osgeo.org/pipermail/gdal-dev/attachments/20180320/31b1b485/attachment.html>
More information about the gdal-dev
mailing list