[gdal-dev] GDALRasterBand::RasterIO c++ vs BandReadAsArray python performance

Sean Gillies sean at mapbox.com
Wed May 11 10:41:39 PDT 2016


Hi Gareth,

How are you initializing your dataTmp array? In my Rasterio project, I've
found that numpy.empty() is the fastest array allocator and use it whenever
possible. I also use the GDAL C API and Cython (in case you're interested:
https://github.com/mapbox/rasterio/blob/c80b568903ef7b902ce6254a42c73af9ddcc8362/rasterio/_io.pyx#L58-L69)
and find the performance to be as good as ReadAsArray.
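
To illustrate the allocator point, here is a minimal sketch, assuming NumPy
is installed. The helper name allocate and the 2048 x 2048 uint8 shape are
made up for illustration; they are not taken from Rasterio or from your code.
Note that numpy.empty() leaves the contents uninitialised, so it is only safe
when the read that follows fills the whole buffer.

```python
import timeit

import numpy as np


def allocate(shape, dtype="uint8", zero_fill=False):
    """Return a read buffer (hypothetical helper, not part of any library).

    With zero_fill=False the contents are undefined until something
    writes into the array.
    """
    if zero_fill:
        return np.zeros(shape, dtype=dtype)
    return np.empty(shape, dtype=dtype)


shape = (2048, 2048)  # assumed size, pick one that matches your rasters
buf = allocate(shape)

# Rough timing of the two allocators; results vary by platform and size.
t_empty = timeit.timeit(lambda: allocate(shape), number=20)
t_zeros = timeit.timeit(lambda: allocate(shape, zero_fill=True), number=20)
print("empty: %.6fs  zeros: %.6fs" % (t_empty, t_zeros))
```

With the GDAL Python bindings, an array allocated this way can be passed to
ReadAsArray through its buf_obj argument, so the read fills the existing
buffer instead of allocating a new one.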

On Wed, May 11, 2016 at 10:53 AM, Gareth James Jones [gjj12] <
gjj12 at aber.ac.uk> wrote:

> I'm currently writing optimisations for a raster viewer program which uses
> GDAL as its base. It's currently written purely in Python, and has some
> major speed issues which cause problems when we are reading many files at a
> time. After making some optimisations in the Python, and getting only a
> minimal speed increase, I profiled the program quite heavily
> and found that our getImage method was our slowest call. I had already
> performed some optimisations on this function, so I decided to write a
> C extension so that we could get some speed increases through a lower-level
> language.
>
> This has worked for the most part, but there is still one issue. We
> have found a speed increase of ~2s for some of our larger files in the bulk
> of the code, but this is negated by the GDALRasterIO call, which is
> actually about 3s slower than the Python ReadAsArray.
>
> This doesn't make any sense to me, as ReadAsArray is a wrapper around a C++
> call to GDALRasterIO, and thus should be slower than calling GDALRasterIO
> directly.
>
> I was hoping someone here might know of a way to read the rasters more
> efficiently. I have tried to implement a method using ReadBlock rather than
> RasterIO, but due to the replication that RasterIO does it didn't work at
> all. (I'm currently trying to figure out a way to do that replication
> without losing too much speed).
>
> The RasterIO call I'm using is:
>
> band->RasterIO(GF_Read, this->ovleft, this->ovtop, this->ovxsize,
>                this->ovysize, dataTmp, this->ovxsize, this->ovysize,
>                band->GetRasterDataType(), 0, 0);
>
> The old python call was:
>
> dataTmp = band.ReadAsArray(ovleft, ovtop,
>     ovxsize, ovysize,
>     dspRastXSize, dspRastYSize)
>
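
One thing that may be worth checking: as written, the two calls above do not
request the same output. The C++ call passes ovxsize and ovysize as the
buffer size, while the Python call passes dspRastXSize and dspRastYSize in
the buf_xsize and buf_ysize positions. If the display size is smaller than
the window, the Python read asks GDAL for a decimated result (and GDAL may be
able to use overviews for that), whereas the C++ read asks for every pixel in
the window. A small sketch of the difference follows; the function name
pixels_requested and all of the numbers are hypothetical.

```python
def pixels_requested(win_xsize, win_ysize, buf_xsize=None, buf_ysize=None):
    """Number of pixels delivered into the caller's buffer.

    Hypothetical helper. When no buffer size is given, the buffer is
    assumed to match the window, i.e. a full-resolution read.
    """
    if buf_xsize is None:
        buf_xsize = win_xsize
    if buf_ysize is None:
        buf_ysize = win_ysize
    return buf_xsize * buf_ysize


# Assumed example values, not taken from the original program.
ovxsize, ovysize = 8000, 6000
dspRastXSize, dspRastYSize = 800, 600

# C++ call as posted: buffer size equals window size.
cpp_pixels = pixels_requested(ovxsize, ovysize, ovxsize, ovysize)
# Python call as posted: buffer size equals display size.
py_pixels = pixels_requested(ovxsize, ovysize, dspRastXSize, dspRastYSize)

print(cpp_pixels // py_pixels)  # 100 with the assumed values above
```

If that is what is happening, passing the display size as the buffer size in
the C++ call as well (and sizing dataTmp accordingly) would make the two
timings comparable.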
>
> Thanks in advance
>
> Gareth Jones



-- 
Sean Gillies