[gdal-dev] Gdalinfo slow with big Rasterlite tables

Even Rouault even.rouault at mines-paris.org
Sat Aug 18 07:43:34 PDT 2012


Jukka,

> 
> Is gdalinfo perhaps walking through every single tile in the
> rasterlite table for gathering the image layer info? Could
> there be a more effective way to do it on the
> GDAL side?

The Rasterlite driver needs to fetch the extent of the "XXXX_metadata" layer 
to establish the extent of the raster XXXX, which might take a long time when 
there are a lot of tiles.
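
To illustrate: computing that extent essentially boils down to scanning the 
bounding box of every tile geometry in the metadata table. Conceptually (this 
is not the driver's literal code path, and "mylayer" is just a placeholder 
name), it is equivalent to something like:

$ sqlite3 rasters.sqlite "SELECT Min(MbrMinX(geometry)), Min(MbrMinY(geometry)),
          Max(MbrMaxX(geometry)), Max(MbrMaxY(geometry)) FROM mylayer_metadata"

With hundreds of thousands of tile rows, a full scan like that can easily 
dominate the gdalinfo runtime.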

> 
> When it comes to GDAL, would it make sense to cache
> the gdalinfo results for Rasterlite layers? Three minutes is
> rather a long time, and my 153600 x 249600 pixel layer with
> 780270 rows/tiles at 5 meter resolution in the table is
> not exceptionally big. If time increases with tile
> count, it would mean 12 minutes for getting gdalinfo from
> the 2.5 meter resolution layer and 48 minutes from the
> 1.25 meter layer...
> 

Funnily enough, independently of the issue you raise here, I was working on 
improving the performance of GetFeatureCount() and GetExtent() on Spatialite 
DBs. In Spatialite 3.0, there is a SQL function, triggered by "SELECT 
UpdateLayerStatistics()", that creates a "layer_statistics" table caching 
both the row count and the extent.
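
For example (assuming a DB named "the.db"; the exact layer_statistics column 
names may vary slightly between SpatiaLite versions), you can populate and 
inspect that table with:

$ ogrinfo the.db -sql "SELECT UpdateLayerStatistics()"
$ sqlite3 the.db "SELECT table_name, row_count, extent_min_x, extent_min_y,
          extent_max_x, extent_max_y FROM layer_statistics"

Once populated, GetFeatureCount() and GetExtent() can be answered from that 
single small table instead of scanning the whole layer.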

I've just pushed an improvement (r24800) in which the SQLite driver can use 
those cached values, provided they are up-to-date. Freshness is determined 
by checking that the timestamp of the last 'UpdateLayerStatistics' event 
recorded in the 'spatialite_history' table matches the timestamp of the file. 
When creating a new Spatialite DB or updating it with the OGR API, the SQLite 
driver makes sure that the statistics are kept up-to-date automatically. 
However, if a third-party tool edits the DB, it is then necessary to run: 
ogrinfo the.db -sql "SELECT UpdateLayerStatistics()". (The driver errs on the 
safe side and will not use stale statistics, to avoid returning false results.)
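
If you are curious what the driver compares, you can inspect the recorded 
events yourself. The column names below are indicative of the SpatiaLite 3.0 
schema, so adjust them if your version differs:

$ sqlite3 the.db "SELECT table_name, event, timestamp FROM spatialite_history
          ORDER BY timestamp DESC LIMIT 5"

If the latest 'UpdateLayerStatistics' timestamp is older than the file's 
modification time, the driver considers the statistics stale and ignores them.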

I've just made marginal changes (r24801) in the Rasterlite driver so that the 
above caching mechanism works automatically in simple gdal_translate and 
gdaladdo scenarios. I would expect it might solve your performance 
problem, although I have not checked that.
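
For instance, a typical workflow that should now keep the statistics fresh 
automatically (file and table names are just placeholders):

$ gdal_translate -of Rasterlite input.tif RASTERLITE:rasters.sqlite,table=mylayer
$ gdaladdo RASTERLITE:rasters.sqlite,table=mylayer 2 4 8 16

Since both steps write through the OGR API, the statistics should be refreshed 
at the end of each command, and the subsequent gdalinfo should be fast.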

You can check that the statistics are used by issuing:

$ gdalinfo byte.sqlite --debug on

SQLITE: SpatiaLite-style SQLite DB found !
SQLITE: File timestamp matches layer statistics timestamp. Loading statistics 
for byte_metadata
SQLite: Layer byte_metadata feature count : 2
SQLite: Layer byte_metadata extent : 440720.0,3750120.0,441920.0,3751320.0
OGR: OGROpen(byte.sqlite/0x1ad6390) succeeded as SQLite.
[...]

Best regards,

Even

