[gdal-dev] Improving usage of GDAL in Mapnik
Even Rouault
even.rouault at spatialys.com
Thu Oct 30 17:01:45 PDT 2014
On Thursday 30 October 2014 20:55:11, Robert Coup wrote:
> Hi team,
>
> Fixing a bug in the Mapnik GDAL code (
> https://github.com/mapnik/mapnik/tree/master/plugins/input/gdal) where the
> return value of RasterIO() isn't checked (so eg. if a component TIFF in a
> VRT is unavailable it just trucks right along rendering blank images
> without error) and it appears to me like the usage of GDAL could be
> optimised a bit in
> https://github.com/mapnik/mapnik/blob/master/plugins/input/gdal/gdal_featureset.cpp
>
>
> I'm curious whether any of the following will actually make much
> difference, or whether GDAL's caching nullifies most of it already? Or
> whether there are potentially other obvious performance optimisations?
>
> * Favouring dataset.RasterIO() over band.RasterIO() to prevent up to 4x
> raster reads
Rob,
If the size of the image (or of your requested window) is below the GDAL block
cache maximum size (default is 40 MB) and the storage on disk is pixel
interleaved (i.e. R1,G1,B1,R2,G2,B2, etc.), which is generally the case for
GeoTIFF, then most drivers should cache the data for the other bands when
reading one band (particularly if decompression is involved). So reading in
one call or in 4 calls will probably not make a lot of difference.
But if you know that you have to read the 4 bands, there's no reason not to
use dataset.RasterIO() in all situations. GDAL, either in its generic
implementation or in the driver specific one, will look at the declared
interleaving and will/should select the most appropriate reading strategy, i.e.
potentially operating block by block if the data is pixel interleaved, or band
by band otherwise.
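
As a rough sketch of the "one call for all bands" approach, here it is with the
GDAL Python bindings, where Dataset.ReadRaster() is the counterpart of the C++
GDALDataset::RasterIO(). The helper name read_all_bands is made up for
illustration, and the None check assumes that Python exceptions have not been
enabled with gdal.UseExceptions() (with exceptions enabled, a failed read
raises instead of returning None).

```python
def read_all_bands(ds, xoff, yoff, xsize, ysize):
    """Read every band of the window with a single dataset-level call.

    `ds` is assumed to be an open osgeo.gdal.Dataset. Hypothetical helper,
    not taken from the Mapnik plugin.
    """
    # GDAL band numbers are 1-based.
    band_list = list(range(1, ds.RasterCount + 1))

    # One request for all bands: GDAL can pick the reading strategy that
    # suits the declared interleaving of the file.
    data = ds.ReadRaster(xoff, yoff, xsize, ysize, band_list=band_list)

    # Do not ignore a failed read (e.g. a missing source TIFF behind a VRT).
    if data is None:
        raise RuntimeError(
            "ReadRaster failed for window (%d, %d, %d, %d)"
            % (xoff, yoff, xsize, ysize)
        )
    return data
```

Raising on failure rather than carrying on also covers the unchecked
RasterIO() return value mentioned at the start of the thread.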
> * Use of GetBlockSize() to read on block boundaries, then cropping data
> later.
I've not looked closely at the code you point to, but that depends on what you
do. If you have allocated a big buffer to contain the whole requested window,
then just issue a single RasterIO() and let GDAL do its job of filling it in
the best way.
If you just allocate small chunks of memory and process chunk by chunk, then
it is indeed better to call RasterIO() on block boundaries (or on multiples of
the block size).
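
For the chunk-by-chunk case, snapping a requested window to block boundaries
is plain integer arithmetic. A minimal sketch, assuming the block size comes
from GetBlockSize() and the raster size from the dataset; the function name
and parameter names are made up for illustration:

```python
def align_window_to_blocks(xoff, yoff, xsize, ysize,
                           block_x, block_y, raster_x, raster_y):
    """Expand a pixel window so that it starts and ends on block boundaries.

    Returns (xoff, yoff, xsize, ysize) of the aligned window, clipped to the
    raster extent, since the last block in a row or column may be partial.
    """
    # Round the upper-left corner down to the start of its block.
    x0 = (xoff // block_x) * block_x
    y0 = (yoff // block_y) * block_y

    # Round the lower-right corner up to the end of its block (ceiling
    # division), then clip to the raster size.
    x1 = min(-(-(xoff + xsize) // block_x) * block_x, raster_x)
    y1 = min(-(-(yoff + ysize) // block_y) * block_y, raster_y)

    return x0, y0, x1 - x0, y1 - y0
```

The originally requested window then starts at (xoff - x0, yoff - y0) inside
the buffer read for the aligned window, which is where the cropping happens.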
> * Use GetMaskBand()/GetMaskFlags() to simplify nodata/alpha handling
> (probably won't impact performance but might get rid of a bunch of code)
Yes, that can be a convenient way to handle both nodata and alpha band in a
unified way. Plus some formats can really have a mask band that is neither of
the two. For example you can create a JPEG-compressed YCbCr TIFF and use an
internal mask band to encode transparency (although this is admittedly quite
an exotic TIFF formulation because most readers will ignore the transparency
mask).
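
To illustrate how the value returned by GetMaskFlags() separates those cases,
here is a small sketch. The GMF_* values are the ones defined in gdal.h
(they are also exposed as gdal.GMF_* in the Python bindings); the function
name and the returned labels are made up for illustration.

```python
# Mask flag bits as defined in gdal.h.
GMF_ALL_VALID = 0x01    # no mask: every pixel is valid
GMF_PER_DATASET = 0x02  # the mask is shared by all bands of the dataset
GMF_ALPHA = 0x04        # the mask is really an alpha band (values 0..255)
GMF_NODATA = 0x08       # the mask is derived from the nodata value


def classify_mask(flags):
    """Map GetMaskFlags() bits to a short label (hypothetical labels)."""
    if flags & GMF_ALPHA:
        return "alpha"
    if flags & GMF_NODATA:
        return "nodata"
    if flags & GMF_ALL_VALID:
        return "all_valid"
    # Neither alpha nor nodata: a real mask band, such as an internal TIFF
    # mask or an external .msk file.
    return "mask"
```

With this, the reading code only needs one path that fetches the mask band,
plus a special case for alpha, where the mask values are not just 0 or 255.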
Even
>
> Cheers,
>
> Rob :)
--
Spatialys - Geospatial professional services
http://www.spatialys.com