[gdal-dev] Raster calculation optimalisation

Wed Jun 14 00:29:22 PDT 2017

Damian Dixon wrote
> It is usually better to process the pixels in a block rather than walking
> across each row especially if you are processing a TIFF as these are
> usually stored as tiles (blocks).

Other layouts are common as well. For example, the Landsat TIFFs provided
by the USGS have a row-based layout. If you can choose it yourself, blocks
are preferred in my opinion, since GDAL VRTs have a fixed blocksize
of 128*128. So when writing TIFFs, setting "TILED=YES" is a good default.
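
As a rough sketch of what that looks like with the Python bindings (the helper names gtiff_creation_options and create_tiled_tiff are made up for this example, and the 256 default is simply the GTiff driver's default tile size, not a recommendation from the benchmarks):

```python
def gtiff_creation_options(tiled=True, blocksize=256):
    """Build GTiff creation options; helper name is hypothetical."""
    if not tiled:
        # Without TILED=YES the GTiff driver writes strips (row-based layout).
        return []
    return [
        "TILED=YES",
        f"BLOCKXSIZE={blocksize}",
        f"BLOCKYSIZE={blocksize}",
    ]


def create_tiled_tiff(path, xsize, ysize, bands=1):
    """Create an empty tiled Float32 GeoTIFF and return the dataset."""
    # Imported here so the options helper can be used without GDAL installed.
    from osgeo import gdal

    driver = gdal.GetDriverByName("GTiff")
    return driver.Create(
        path, xsize, ysize, bands, gdal.GDT_Float32,
        options=gtiff_creation_options(),
    )
```

Passing blocksize=128 would match the VRT blocksize mentioned above; whether that is faster for a given workflow is something to measure rather than assume.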

I think you're spot on by mentioning the blocks. Don't assume the layout at
all; look at the blocksize of the file and use it. If the blocks are
relatively small (memory-wise), using a multiple of the size can increase
performance a bit more. So if it's row-based and you have plenty of memory to
spare, why not read blocks of (xsize, 128)? Or if the blocksize is 128x128,
use blocks of 256x256, etc.
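
Something along these lines is what I mean, as a minimal sketch only. The function names iter_windows and process_by_block and the "multiple" parameter are invented for illustration; it assumes a single band and that the per-window results fit in memory:

```python
def iter_windows(xsize, ysize, block_x, block_y, multiple=1):
    """Yield (xoff, yoff, xcount, ycount) windows aligned to the block grid.

    The window size is the native block size times `multiple`, clipped to
    the raster size, so edge windows can be smaller than the rest.
    """
    step_x = min(block_x * multiple, xsize)
    step_y = min(block_y * multiple, ysize)
    for yoff in range(0, ysize, step_y):
        ycount = min(step_y, ysize - yoff)
        for xoff in range(0, xsize, step_x):
            xcount = min(step_x, xsize - xoff)
            yield xoff, yoff, xcount, ycount


def process_by_block(path, func, multiple=1):
    """Apply `func` to each window of band 1 and collect the results."""
    # Imported here so iter_windows can be used without GDAL installed.
    from osgeo import gdal

    ds = gdal.Open(path)
    if ds is None:
        raise IOError("could not open %s" % path)
    band = ds.GetRasterBand(1)
    # Ask the file for its layout instead of assuming one.
    block_x, block_y = band.GetBlockSize()
    results = []
    for xoff, yoff, xcount, ycount in iter_windows(
            ds.RasterXSize, ds.RasterYSize, block_x, block_y, multiple):
        data = band.ReadAsArray(xoff, yoff, xcount, ycount)
        results.append(func(data))
    return results
```

For a row-based file the block size is (xsize, 1), so multiple=128 gives windows of (xsize, 128); for 128x128 tiles, multiple=2 gives 256x256.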

If the volume of data is really large, increasing GDAL's block cache can be
helpful, although it's best to avoid relying on the cache (if possible) by
specifying an appropriate blocksize.
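
For the cache, a small sketch using gdal.SetCacheMax from the Python bindings (the helper names below are hypothetical, and any particular size is an arbitrary example, not a tuned value):

```python
def megabytes_to_bytes(megabytes):
    """Convert a size in megabytes to bytes."""
    return int(megabytes * 1024 * 1024)


def set_block_cache(megabytes):
    """Set GDAL's raster block cache size and return the value GDAL reports."""
    # Imported here so the conversion helper can be used without GDAL installed.
    from osgeo import gdal

    gdal.SetCacheMax(megabytes_to_bytes(megabytes))
    return gdal.GetCacheMax()
```

The same setting is available without code through the GDAL_CACHEMAX configuration option.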

Here are a few mini-benchmarks:
http://nbviewer.jupyter.org/gist/RutgerK/4faa43873ee6389873dd4de85de7c450

https://stackoverflow.com/questions/41742162/gdal-readasarray-for-vrt-extremely-slow/41853759#41853759

Regards,
Rutger
