Even, <div><br></div><div>To make sure I understand, here is an example:</div><div><br></div><div>If I have a GTiff whose block size is one row by all of the columns (a single scanline), I should read either one scanline at a time or several entire scanlines. It is inefficient to read, say, 10 rows and only half of the columns.</div>
<div><br></div><div>What if my application requires that I read one entire column by an arbitrary number of scanlines? Essentially, that means reading at a 90 degree angle to the block size. Other than increasing the cache size and flushing the cache, are there other techniques to reduce thrashing (and therefore processing time)?</div>
<div><br></div><div>J</div><div><br><div class="gmail_quote">On Wed, Aug 3, 2011 at 11:19 AM, Even Rouault <span dir="ltr">&lt;<a href="mailto:even.rouault@mines-paris.org">even.rouault@mines-paris.org</a>&gt;</span> wrote:<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex;">On Wednesday 03 August 2011 at 17:32:53, Antonio Valentino wrote:<br>
<div class="im">> Hi Jay,<br>
><br>
> On 03/08/2011 16:53, Jay L. wrote:<br>
> > I have been working on this problem as well. Initially, the attempt was<br>
> > to ReadAsArray small chunks. Unfortunately this is quite inefficient.<br>
> > Someone more knowledgeable will know why, but I suspect it has to do<br>
> > with either thrashing or the fact that full blocks are not being read in<br>
> > (as is the case when a 5000x5000 pixel block is read from a 12567x12764<br>
> > GTiff).<br>
><br>
> Yes, using chunks that are too small can cause inefficiency, and yes,<br>
> using chunks that are aligned to the I/O blocks (the exact block size or a<br>
> multiple of it) is a good idea whenever possible.<br>
<br>
</div>Yes, I strongly concur with that. Reading 5000x5000 in a 12567x12764 raster is<br>
likely to be inefficient if the raster is scanline oriented, that is to say if<br>
the dimension of a block reported by gdalinfo or GetBlockSize() is 12567 x Y<br>
rows. In such a situation you should try to read chunks of Y (or a multiple<br>
of Y) whole lines.<br>
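<br>
A minimal Python sketch of that advice, assuming a scanline-oriented raster whose block height is Y rows. The helper names block_aligned_strips and read_strips are hypothetical and not part of GDAL; only GetBlockSize() and ReadAsArray() are calls from the bindings.<br>

```python
def block_aligned_strips(raster_x, raster_y, block_y, blocks_per_read=1):
    """Yield (xoff, yoff, xsize, ysize) windows made of whole lines.

    Each window is full width and its height is a multiple of block_y,
    except possibly the last one, which stops at the bottom of the raster.
    """
    if block_y <= 0 or blocks_per_read <= 0:
        raise ValueError("block_y and blocks_per_read must be positive")
    rows = block_y * blocks_per_read
    for yoff in range(0, raster_y, rows):
        yield 0, yoff, raster_x, min(rows, raster_y - yoff)


def read_strips(path, blocks_per_read=64):
    """Hypothetical helper: read band 1 of `path` strip by strip."""
    from osgeo import gdal  # imported here so the helper above needs no GDAL

    ds = gdal.Open(path)
    band = ds.GetRasterBand(1)
    _block_x, block_y = band.GetBlockSize()
    for xoff, yoff, xsize, ysize in block_aligned_strips(
        ds.RasterXSize, ds.RasterYSize, block_y, blocks_per_read
    ):
        yield yoff, band.ReadAsArray(xoff, yoff, xsize, ysize)
```

For the 12567x12764 raster discussed here with one-row blocks, block_aligned_strips(12567, 12764, 1, 5000) gives three full-width strips of 5000, 5000 and 2764 rows.<br>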
<br>
Another point to take into consideration is when you read a multiband dataset.<br>
If the data in the dataset is pixel interleaved, then you should try to read<br>
all the bands at a time with DatasetRasterIO() so that GDAL avoids re-reading<br>
from disk the same blocks for each band. Conversely, if the data is band<br>
interleaved, reading band by band is OK (using DatasetRasterIO() too because<br>
it will detect and adapt itself to the data organization to select the best<br>
algorithm).<br>
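<br>
As a rough illustration in Python, the dataset-level ReadAsArray() fetches all bands of a window in one call, which is the binding-side counterpart of reading everything at once. The function names below are hypothetical, and the INTERLEAVE item of the IMAGE_STRUCTURE metadata domain is only reported by some drivers, so treat this as a sketch under those assumptions.<br>

```python
def dataset_interleave(ds):
    """Return the reported interleaving ('PIXEL', 'BAND', ...) or 'UNKNOWN'."""
    value = ds.GetMetadataItem("INTERLEAVE", "IMAGE_STRUCTURE")
    return value.upper() if value else "UNKNOWN"


def iter_all_band_strips(ds, rows_per_read):
    """Yield (yoff, data) for full-width strips, all bands read together."""
    if rows_per_read <= 0:
        raise ValueError("rows_per_read must be positive")
    for yoff in range(0, ds.RasterYSize, rows_per_read):
        ysize = min(rows_per_read, ds.RasterYSize - yoff)
        # One dataset-level call per strip instead of one call per band,
        # so pixel-interleaved blocks are not fetched again for each band.
        yield yoff, ds.ReadAsArray(0, yoff, ds.RasterXSize, ysize)


# Usage sketch (needs GDAL and a real file; "input.tif" is a placeholder):
#   from osgeo import gdal
#   ds = gdal.Open("input.tif")
#   print(dataset_interleave(ds))
#   for yoff, data in iter_all_band_strips(ds, 256):
#       ...  # data holds every band for rows yoff .. yoff + 255
```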
<br>
There are other possible caveats depending on the file format itself. For<br>
example if you read a JPEG, PNG or GIF image, you must know that you cannot<br>
go back to earlier lines without causing decompression to be restarted from the top<br>
line. But such formats are rarely used for images that big. I seem to remember<br>
that it is also the case for some formulations of HDF4 (<br>
<a href="http://trac.osgeo.org/gdal/ticket/3386" target="_blank">http://trac.osgeo.org/gdal/ticket/3386</a> ).<br>
<br>
You can check whether your way of reading is efficient by defining CPL_DEBUG=ON<br>
and looking at the warnings. If you see something like "Potential thrashing on<br>
band XXX of YYY", it is a hint that you did not use the most efficient reading<br>
scheme.<br>
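<br>
One possible way to automate that check from Python is sketched below. It assumes the debug messages end up on stderr (the usual case when no log file is configured) and simply searches for the "Potential thrashing" text; both function names are hypothetical.<br>

```python
import os
import subprocess
import sys


def find_thrashing_warnings(log_text):
    """Return the log lines that mention potential block cache thrashing."""
    return [
        line.strip()
        for line in log_text.splitlines()
        if "Potential thrashing" in line
    ]


def run_with_gdal_debug(script_path):
    """Run a Python script with CPL_DEBUG=ON and collect thrashing hints.

    Assumption: GDAL writes its debug output to stderr for this process.
    """
    env = dict(os.environ, CPL_DEBUG="ON")
    result = subprocess.run(
        [sys.executable, script_path],
        env=env,
        stderr=subprocess.PIPE,
        text=True,
    )
    return find_thrashing_warnings(result.stderr)
```

An empty list from run_with_gdal_debug() only means no such message was seen on stderr, not that the access pattern is optimal.<br>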
<div class="im"><br>
><br>
> I don't know the internals of the Python binding implementation very well.<br>
> Looking at the release notes, it seems that some important changes in this<br>
> area were made in release 1.8.0<br>
><br>
> <a href="http://trac.osgeo.org/gdal/wiki/Release/1.8.0-News#SWIGLanguageBindings" target="_blank">http://trac.osgeo.org/gdal/wiki/Release/1.8.0-News#SWIGLanguageBindings</a><br>
<br>
</div>Yes, there have been a few optimizations to avoid some useless temporary buffer<br>
copies, and a few fixes as well. One of them allows reading more than 2GB in 64<br>
bit builds of GDAL.<br>
<br>
Regards,<br>
<font color="#888888"><br>
Even<br>
</font></blockquote></div><br></div>