[gdal-dev] BandReadAsArray speed up

Frank Warmerdam warmerdam at pobox.com
Mon Feb 23 20:19:58 EST 2009


Gong, Shawn (Contractor) wrote:
> hi list,
> 
> 
> I am reading the image data for a bandobj, using BandReadAsArray() which
> works, but it is sometimes slow. Is it actually reading from the file?  Is
> there a way to access the image data that is already in memory, or anything
> else I could do to speed it up?

Shawn,

BandReadAsArray() calls the GDALRasterBand::RasterIO() function, which
normally searches the block cache before actually asking the driver for
data, though some driver implementations override this.  Slow reads
against large datasets are often caused by cache thrashing.  For instance,
if you ask for scanlines from a tile-oriented format, but a row of tiles
adds up to more memory than is set aside for the block cache, then by the
time the last tile for the scanline is read, the first will have been
discarded.  Each subsequent scanline request then causes all the data to
be re-read from the underlying driver.
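
To make the thrashing effect concrete, here is a small pure-Python sketch
that does not use GDAL at all.  It assumes a simple least-recently-used
block cache (a simplification of what GDAL actually does), square tiles,
and a cache measured in whole tiles; the function name and parameters are
made up for illustration.

```python
from collections import OrderedDict


def count_block_reads(width, height, tile, cache_tiles):
    """Read a tiled raster one scanline at a time through an LRU cache.

    Returns how many times a tile had to be fetched from the "driver"
    because it was not (or no longer) in the cache.
    """
    cache = OrderedDict()              # keys are (tile_col, tile_row)
    driver_reads = 0
    tiles_across = -(-width // tile)   # ceiling division

    for row in range(height):
        tile_row = row // tile
        # A full scanline touches every tile in the current row of tiles.
        for tile_col in range(tiles_across):
            key = (tile_col, tile_row)
            if key in cache:
                cache.move_to_end(key)         # mark as most recently used
            else:
                driver_reads += 1              # cache miss: go to the driver
                cache[key] = True
                if len(cache) > cache_tiles:
                    cache.popitem(last=False)  # evict least recently used
    return driver_reads


# 1024x512 raster, 256x256 tiles: 4 tiles across, 8 tiles in total.
fits = count_block_reads(1024, 512, 256, cache_tiles=4)
thrashes = count_block_reads(1024, 512, 256, cache_tiles=3)
```

With room for a whole row of tiles, each tile is fetched once.  With room
for one tile fewer, every tile is evicted before the next scanline needs
it, so every scanline re-reads the entire row of tiles.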

If this is what is happening, you can improve things by increasing the
block cache size (via the GDAL_CACHEMAX config option, or the dedicated set
call in the Python bindings).
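
As a rough sketch of the Python route: the helper below estimates how much
memory one row of blocks needs and raises the cache if it is smaller than
that.  It assumes the osgeo.gdal bindings, where gdal.SetCacheMax() and
gdal.GetCacheMax() work in bytes; the function names and the headroom
factor are my own invention, not part of GDAL.

```python
def bytes_for_tile_row(raster_width, block_width, block_height, bytes_per_pixel):
    """Memory needed to hold one full row of blocks for a single band."""
    blocks_across = -(-raster_width // block_width)  # ceiling division
    return blocks_across * block_width * block_height * bytes_per_pixel


def ensure_cache_holds_tile_row(band, headroom=2):
    """Raise the GDAL block cache if one row of this band's blocks won't fit.

    `band` is an osgeo.gdal.Band; `headroom` is an arbitrary safety factor.
    Returns the cache size in bytes after the check.
    """
    # Imported here so bytes_for_tile_row() can be used without GDAL installed.
    from osgeo import gdal

    block_width, block_height = band.GetBlockSize()
    bytes_per_pixel = gdal.GetDataTypeSize(band.DataType) // 8  # bits -> bytes
    needed = headroom * bytes_for_tile_row(
        band.XSize, block_width, block_height, bytes_per_pixel
    )
    if gdal.GetCacheMax() < needed:
        gdal.SetCacheMax(needed)
    return gdal.GetCacheMax()
```

You would call ensure_cache_holds_tile_row(band) once before the scanline
loop.  The estimate covers a single band only, so reading several bands in
step would need proportionally more.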

There are many other possible reasons for poor performance too.

I will note that the extra array allocation that Howard mentions will only
make a very modest difference in performance.

Best regards,

-- 
---------------------------------------+--------------------------------------
I set the clouds in motion - turn up   | Frank Warmerdam, warmerdam at pobox.com
light and sound - activate the windows | http://pobox.com/~warmerdam
and watch the world go round - Rush    | Geospatial Programmer for Rent


