[Gdal-dev] python arrays - one step further

Alessandro Amici alexamici at fastwebnet.it
Fri Feb 6 09:10:12 EST 2004


folks,

the attached patch series is meant to further optimize the gdal python layer. 
so far i have concentrated only on raster read performance.

i had to extend a couple of internal interfaces and (optionally) an external 
interface, so i didn't apply the patches to cvs myself and am waiting for 
Frank's approval. however the optimization applies to the current Numeric 
implementation (no need for numarray) and it is fully backward compatible.

00_gdalreadraster-buffer.diff
py_GDALReadRaster accepts an additional optional argument that can be any 
python object that offers a writable buffer (of the correct size) via the 
buffer interface, e.g. an array object.
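
to make the idea concrete, here is a minimal sketch in plain python of a read function that optionally fills a caller-supplied writable buffer. it is not the patched code: read_raster, source and nbytes are hypothetical names, and only the standard library is used so it runs without gdal or Numeric.

```python
from array import array


def read_raster(source, nbytes, buf_obj=None):
    """Hypothetical stand-in for a raster read; not the GDAL API."""
    if buf_obj is None:
        # no buffer supplied: allocate and return a new bytes object
        return bytes(source[:nbytes])
    view = memoryview(buf_obj)
    if view.readonly:
        raise TypeError("buf_obj must expose a writable buffer")
    if view.nbytes != nbytes:
        raise ValueError("buf_obj has the wrong size")
    # write straight into the caller's memory, viewed as raw bytes
    view.cast("B")[:] = source[:nbytes]
    return buf_obj


pixels = array("H", [0] * 4)              # 4 unsigned 16-bit samples
read_raster(bytes(range(8)), 8, pixels)   # filled in place, nothing returned by copy
```

a read-only object such as a bytes string is rejected, which is why the argument has to be something like an array.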

10_gdalnumeric-buf_obj.diff
BandReadRaster takes advantage of the new GDALReadRaster argument.

with these two patches applied and no change to my test program and sample 
dataset i obtained:

real    0m23.451s
user    0m17.861s
sys     0m4.667s

broken down as:

CPU: Athlon, speed 1000.15 MHz (estimated)
vma      samples  %           app name                 symbol name
001323c0 15428    33.2221     libgdal.so.1.1.9         GDALCopyWords
00077370 13146    28.3081     libc-2.3.2.so            memset
c01b2450 4346      9.3585     vmlinux                  __copy_to_user_ll
00077990 2713      5.8421     libc-2.3.2.so            memcpy
c01b1f30 2557      5.5061     vmlinux                  fast_clear_page
001322e0 1746      3.7598     libgdal.so.1.1.9         GDALSwapWords

remember that we went from 36.5s to 28.5s by removing an unneeded memcpy. this 
further speedup to 23.5s is apparently due to the fact that clearing the 
memory (inside zeros()) is faster than copying it over (inside fromstring()).
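
the two strategies compared above can be sketched with the standard library only. this is an analogy, not the Numeric code: array.frombytes() plays the role of fromstring(), a zero-filled array plays the role of zeros(), and the function names are made up for the illustration.

```python
import io
from array import array


def read_via_copy(stream, count):
    # read() builds an intermediate bytes object and frombytes() copies it
    # into the array, so the data is written twice
    out = array("H")
    out.frombytes(stream.read(count * out.itemsize))
    return out


def read_via_prefilled(stream, count):
    # allocate a zero-filled array once (the memset), then let the stream
    # write directly into it through the buffer interface
    out = array("H", [0]) * count
    stream.readinto(out)
    return out
```

both functions return the same array; the second one trades the extra copy for a clear of the destination memory.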

but it is not finished, fasten your seat belts!

20_gdal-band-buf_obj.diff
Band.ReadAsArray() accepts an additional optional argument where the result is 
stored (much like the output argument of the ufuncs).

in order to take advantage of the output argument you need to adapt your 
applications, but if you read the image in chunks and allocate a 
reusable buffer array, this is what you can get:

real    0m14.506s
user    0m11.597s
sys     0m2.617s

and:

vma      samples  %           app name                 symbol name
001323c0 14990    51.6362     libgdal.so.1.1.9         GDALCopyWords
c01b2450 4419     15.2222     vmlinux                  __copy_to_user_ll
00077990 2796      9.6314     libc-2.3.2.so            memcpy
001322e0 1714      5.9042     libgdal.so.1.1.9         GDALSwapWords
00077370 1198      4.1268     libc-2.3.2.so            memset

note that the memcpy is not due to python but is called inside 
GDALCopyWords, and the only further overhead due to Numeric is the (not 
really useful) memset. with numarray we could avoid that and a couple of code 
contortions, oh well.
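
the reusable-buffer pattern mentioned above (read in chunks into one preallocated array) looks roughly like this. FakeBand, read_rows and sum_in_chunks are hypothetical stand-ins written for this sketch; they are not the gdal Band API, and the real argument name and signature are whatever the patch defines.

```python
from array import array


class FakeBand:
    """Hypothetical in-memory band of 16-bit samples; not the GDAL API."""

    def __init__(self, xsize, ysize):
        self.xsize, self.ysize = xsize, ysize
        self._data = array("H", (i % 65536 for i in range(xsize * ysize)))

    def read_rows(self, yoff, nrows, buf_obj=None):
        start = yoff * self.xsize
        stop = start + nrows * self.xsize
        if buf_obj is None:
            # a fresh array is allocated on every call
            return self._data[start:stop]
        # copy into the front of the caller's array, no new allocation
        count = stop - start
        memoryview(buf_obj)[:count] = memoryview(self._data)[start:stop]
        return buf_obj


def sum_in_chunks(band, chunk_rows):
    # allocate the chunk buffer once and reuse it for every read
    buf = array("H", [0]) * (band.xsize * chunk_rows)
    view = memoryview(buf)
    total = 0
    for yoff in range(0, band.ysize, chunk_rows):
        nrows = min(chunk_rows, band.ysize - yoff)   # last chunk may be short
        band.read_rows(yoff, nrows, buf_obj=buf)
        total += sum(view[: nrows * band.xsize])
    return total
```

the only allocation happens before the loop, which is the point of the output argument: the per-chunk zeros() and its memset disappear.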

anybody interested, please test, but...
a word of warning: this is my first time playing with python module internals!

Frank, something like that needs to go in, do you have any comment?

cheers,
alessandro
-------------- next part --------------
A non-text attachment was scrubbed...
Name: 00_gdalreadraster-buffer.diff
Type: text/x-diff
Size: 2027 bytes
Desc: not available
Url : http://lists.osgeo.org/pipermail/gdal-dev/attachments/20040206/75d31057/00_gdalreadraster-buffer.bin
-------------- next part --------------
A non-text attachment was scrubbed...
Name: 10_gdalnumeric-buf_obj.diff
Type: text/x-diff
Size: 1448 bytes
Desc: not available
Url : http://lists.osgeo.org/pipermail/gdal-dev/attachments/20040206/75d31057/10_gdalnumeric-buf_obj.bin
-------------- next part --------------
A non-text attachment was scrubbed...
Name: 20_gdal-band-buf_obj.diff
Type: text/x-diff
Size: 805 bytes
Desc: not available
Url : http://lists.osgeo.org/pipermail/gdal-dev/attachments/20040206/75d31057/20_gdal-band-buf_obj.bin

