[gdal-dev] Gdalinfo slow with big Rasterlite tables
Even Rouault
even.rouault at mines-paris.org
Sat Aug 18 07:43:34 PDT 2012
Jukka,
>
> Is gdalinfo perhaps walking through every single tile in the
> rasterlite table to gather the image layer info? Could
> it be done in a more effective way on the GDAL side?
The Rasterlite driver needs to fetch the extent of the "XXXX_metadata" layers
to establish the extent of the raster XXXX, which might take a long time when
there are a lot of tiles.
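For illustration, computing the extent from scratch amounts to an aggregate
query over every tile geometry. Assuming a hypothetical table byte_metadata
whose geometry column is named "geometry" (and a GDAL build with Spatialite
support), it is conceptually similar to:

$ ogrinfo the.db -sql "SELECT Count(*), Extent(geometry) FROM byte_metadata"

With hundreds of thousands of rows, that full scan is where the time goes.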
>
> When it comes to GDAL, could it make any sense to cache
> the gdalinfo results for Rasterlite layers? Three minutes is
> rather a long time, and my 153600 x 249600 pixel layer with
> 780270 rows/tiles at 5 meter resolution in the table is
> not exceptionally big. If the time increases with tile
> count, it would mean 12 minutes for getting gdalinfo at
> 2.5 meter resolution and 48 minutes at 1.25 meter
> resolution...
>
Funny, because independently of the issue you raise here, I was working on
improving the performance of GetFeatureCount() and GetExtent() on Spatialite
DBs. In Spatialite 3.0, there is a SQL function, triggered by "SELECT
UpdateLayerStatistics()", that creates a "layer_statistics" table caching
both the row count and the extent.
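If you are curious about what gets cached, you can query that table directly.
Assuming the Spatialite 3.0 schema for it, something along these lines:

$ ogrinfo the.db -sql "SELECT table_name, row_count, extent_min_x, extent_min_y, extent_max_x, extent_max_y FROM layer_statistics"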
I've just pushed an improvement (r24800) in which the SQLite driver can use
those cached values, provided they are up-to-date. Freshness is determined
by checking that the timestamp of the last 'UpdateLayerStatistics' event
recorded in the 'spatialite_history' table matches the timestamp of the file.
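You can inspect those recorded events yourself. Assuming the standard layout
of that table in Spatialite 3.0, something like:

$ ogrinfo the.db -sql "SELECT table_name, event, timestamp FROM spatialite_history"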
When creating a new Spatialite DB or updating it with the OGR API, the SQLite
driver makes sure that the statistics are kept up-to-date automatically.
However, if a third-party tool edits the DB, it is then necessary to run:
ogrinfo the.db -sql "SELECT UpdateLayerStatistics()". (The driver errs on the
side of caution and will not use stale statistics, to avoid returning false
results.)
I've just made marginal changes (r24801) in the Rasterlite driver so that the
above caching mechanism works automatically in simple gdal_translate and
gdaladdo scenarios. I would expect it to solve your performance problem,
although I have not checked that.
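For instance, a typical sequence (file and table names here are only
examples) would be:

$ gdal_translate -of Rasterlite byte.tif RASTERLITE:byte.sqlite,table=byte
$ gdaladdo RASTERLITE:byte.sqlite,table=byte 2 4 8 16

after which gdalinfo on the result should pick up the cached statistics.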
You can check that the statistics are used by issuing:
$ gdalinfo byte.sqlite --debug on
SQLITE: SpatiaLite-style SQLite DB found !
SQLITE: File timestamp matches layer statistics timestamp. Loading statistics for byte_metadata
SQLite: Layer byte_metadata feature count : 2
SQLite: Layer byte_metadata extent : 440720.0,3750120.0,441920.0,3751320.0
OGR: OGROpen(byte.sqlite/0x1ad6390) succeeded as SQLite.
[...]
Best regards,
Even