[Gdal-dev] trivial (?) optimization for gdal.i:GDALReadRaster()
Alessandro Amici
alexamici at fastwebnet.it
Wed Feb 4 06:38:01 EST 2004
Frank,
i've started analysing the performance of my gdal-related python code with
oprofile (http://oprofile.sf.net) and found what looks like an easy
optimization for a ~20% speed increase on reads.
i have a simple python script that reads a 500 MB CInt16 dataset in ~40 MB
chunks. the file fits comfortably in the linux kernel cache on my machine
and i get quite repeatably the following timing on an otherwise idle system:
real 0m36.355s
user 0m26.925s
sys 0m8.593s
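for reference, a minimal sketch of the kind of chunked read loop described above. this is not the actual script: the helper names (rows_for_budget, iter_row_windows, read_in_chunks) are hypothetical, the band object is assumed to expose XSize, YSize and ReadRaster(xoff, yoff, xsize, ysize) as the GDAL python bindings do, and 4 bytes per pixel is assumed for CInt16.

```python
def rows_for_budget(xsize, bytes_per_pixel, chunk_bytes):
    """Whole rows that fit in chunk_bytes (never fewer than one)."""
    return max(1, chunk_bytes // (xsize * bytes_per_pixel))


def iter_row_windows(ysize, rows_per_chunk):
    """Yield (yoff, nrows) pairs that cover rows 0..ysize-1 in order."""
    for yoff in range(0, ysize, rows_per_chunk):
        yield yoff, min(rows_per_chunk, ysize - yoff)


def read_in_chunks(band, chunk_bytes=40 * 1024 * 1024, bytes_per_pixel=4):
    """Read a whole band in row blocks of roughly chunk_bytes each.

    `band` is assumed to behave like a GDAL raster band; CInt16 is
    taken to be 4 bytes per pixel (two 16-bit integers).
    Returns the total number of bytes read.
    """
    rows = rows_for_budget(band.XSize, bytes_per_pixel, chunk_bytes)
    total = 0
    for yoff, nrows in iter_row_windows(band.YSize, rows):
        data = band.ReadRaster(0, yoff, band.XSize, nrows)
        total += len(data)
    return total
```

with a real dataset the band would come from something like gdal.Open(path).GetRasterBand(1); timing the loop with `time` then gives numbers comparable to the ones above.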
oprofile breaks down the cpu time as:
alan:$ opreport -t 1 -l
CPU: CPU with timer interrupt, speed 1000.06 MHz (estimated)
Profiling through timer interrupt
vma      samples  %        app name          symbol name
001323b0 8035     22.1380  libgdal.so.1.1.9  GDALCopyWords
00077990 7829     21.5705  libc-2.3.2.so     memcpy
000772e0 6211     17.1125  libc-2.3.2.so     memmove
c01af1b0 4162     11.4671  vmlinux           fast_clear_page
00077370 3487      9.6074  libc-2.3.2.so     memset
c01af6d0 2266      6.2433  vmlinux           __copy_to_user_ll
001322d0 855       2.3557  libgdal.so.1.1.9  GDALSwapWords
c013bd50 412       1.1351  vmlinux           buffered_rmqueue
c0117d30 366       1.0084  vmlinux           do_page_fault
which says that my script is spending most of its time basically copying
around the initial data (fetched from the kernel memory by __copy_to_user_ll,
which accounts for only 6% of the total time). the call to GDALCopyWords is
ok because it also does type conversion to CFloat32.
the attached patch (you need to regenerate the gdal_wrap.c file with swig) cuts
the real time well below 30 seconds (~20% better) by skipping one of the
copies of the original buffer. the call to PyString_AsString() is legitimate
according to the python C-API manual because we just created the object.
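the idea of skipping a copy can be illustrated in pure python. this is only an analogy, not the patch itself (the patch is a C change in gdal.i): the function names are hypothetical, and io.BytesIO stands in for the raster source. the first version reads into a temporary buffer and then copies it into the result; the second allocates the result first and lets the reader fill it directly.

```python
import io


def read_with_extra_copy(stream, nbytes):
    """Read into a temporary object, then copy it into the result."""
    tmp = stream.read(nbytes)        # first buffer, filled by the reader
    result = bytearray(nbytes)       # second buffer, zero-initialised
    result[:len(tmp)] = tmp          # extra copy of the same data
    return result


def read_in_place(stream, nbytes):
    """Allocate the result once and let the reader fill it directly."""
    result = bytearray(nbytes)
    stream.readinto(result)          # no intermediate buffer
    return result


payload = bytes(range(256)) * 4
copied = read_with_extra_copy(io.BytesIO(payload), len(payload))
direct = read_in_place(io.BytesIO(payload), len(payload))
```

both functions return the same bytes; the second simply never builds the temporary object, which is the kind of saving that shows up as the drop in memcpy samples.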
results are:
real 0m28.409s
user 0m20.557s
sys 0m6.678s
CPU: CPU with timer interrupt, speed 1000.06 MHz (estimated)
Profiling through timer interrupt
vma      samples  %        app name          symbol name
001323b0 7959     28.3329  libgdal.so.1.1.9  GDALCopyWords
000772e0 6226     22.1637  libc-2.3.2.so     memmove
00077370 3558     12.6660  libc-2.3.2.so     memset
c01af1b0 2722      9.6899  vmlinux           fast_clear_page
c01af6d0 2236      7.9598  vmlinux           __copy_to_user_ll
00077990 1464      5.2116  libc-2.3.2.so     memcpy
001322d0 820       2.9191  libgdal.so.1.1.9  GDALSwapWords
memcpy usage dropped by 6300+ samples and also fast_clear_page usage goes down
a bit.
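the figures quoted above can be checked with a few lines of python. the numbers are copied from the two timing runs and the two opreport listings; the variable and function names are just for illustration.

```python
# Timings in seconds, from the two `time` outputs quoted above.
before = {"real": 36.355, "user": 26.925, "sys": 8.593}
after = {"real": 28.409, "user": 20.557, "sys": 6.678}


def reduction_pct(old, new):
    """Relative reduction from old to new, in percent."""
    return 100.0 * (old - new) / old


speedup = {key: reduction_pct(before[key], after[key]) for key in before}

# Sample counts from the two opreport listings.
memcpy_drop = 7829 - 1464            # memcpy samples saved
fast_clear_page_drop = 4162 - 2722   # fast_clear_page samples saved

for key in ("real", "user", "sys"):
    print("%s: %.1f%% less time" % (key, speedup[key]))
print("memcpy samples saved:", memcpy_drop)
print("fast_clear_page samples saved:", fast_clear_page_drop)
```

the real time comes out a little under 22% lower, consistent with the ~20% quoted, and the memcpy difference is 6365 samples.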
i think i know where the remaining memmove-memset are sitting (see
pytmod/gdalnumeric.py:154), but i have no work-around for that.
this is my first time at playing with python extensions, so Frank please
double check the patch. anyhow, it has been tested and works here.
cheers,
alessandro
BTW: which version of swig are you using? i cannot produce the gdal_wrap.c
file with version 1.3.19.
-------------- next part --------------
A non-text attachment was scrubbed...
Name: python-read-speed.diff
Type: text/x-diff
Size: 789 bytes
Desc: not available
Url : http://lists.osgeo.org/pipermail/gdal-dev/attachments/20040204/820b72ef/python-read-speed.bin