[Gdal-dev] RE: RasterIO tile size issue

Sheykhet, Rostic rsheykhet at sanz.com
Thu Feb 16 18:04:04 EST 2006


Frank,

	Though it is probably a significant departure from the current
algorithm, what came to my mind is a solution that, instead of writing
the passed-in buffer line by line, writes it image block by image block.
So instead of writing line #1 across all the affected image blocks, it
would write all relevant pieces of all relevant lines into block #1,
then flush it to disk and never need to access it again in this
particular RasterIO() operation.

	The advantage, in my opinion, is that the blocks are written
completely, one by one, resulting in a very small memory footprint. The
tile cache is also not used at all, leaving more space in it for other
tiles.

	However, as mentioned above, and as you can probably see, this
would require writing a rather involved piece of code.
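
To make the idea concrete, here is a minimal sketch of a block-by-block
write. It is not GDAL code: TiledImage, ReadBlock(), WriteBlock() and
WriteWindowByBlock() are hypothetical names, the "file" is just an
in-memory array, samples are assumed to be single bytes, and the image
size is assumed to be a multiple of the block size.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical in-memory stand-in for a tiled image file; not a GDAL class.
struct TiledImage {
    int width, height, blockW, blockH;
    std::vector<std::uint8_t> pixels;  // plays the role of the file on disk
    int blockWrites = 0;               // how many times a block was flushed

    TiledImage(int w, int h, int bw, int bh)
        : width(w), height(h), blockW(bw), blockH(bh),
          pixels(static_cast<std::size_t>(w) * h, 0) {
        // Keep the sketch short: no partial blocks at the image edges.
        assert(w % bw == 0 && h % bh == 0);
    }

    std::size_t Index(int x, int y) const {
        return static_cast<std::size_t>(y) * width + x;
    }

    // Copy one block out of the backing store.
    std::vector<std::uint8_t> ReadBlock(int bx, int by) const {
        std::vector<std::uint8_t> block(static_cast<std::size_t>(blockW) * blockH);
        for (int y = 0; y < blockH; ++y)
            std::copy_n(pixels.data() + Index(bx * blockW, by * blockH + y),
                        blockW, block.data() + static_cast<std::size_t>(y) * blockW);
        return block;
    }

    // "Flush" one finished block to the backing store.
    void WriteBlock(int bx, int by, const std::vector<std::uint8_t>& block) {
        for (int y = 0; y < blockH; ++y)
            std::copy_n(block.data() + static_cast<std::size_t>(y) * blockW,
                        blockW, pixels.data() + Index(bx * blockW, by * blockH + y));
        ++blockWrites;
    }
};

// Write an xSize-by-ySize buffer at (xOff, yOff), one image block at a time.
void WriteWindowByBlock(TiledImage& img, int xOff, int yOff, int xSize, int ySize,
                        const std::vector<std::uint8_t>& buf) {
    assert(xOff >= 0 && yOff >= 0 && xSize > 0 && ySize > 0);
    assert(xOff + xSize <= img.width && yOff + ySize <= img.height);
    assert(buf.size() == static_cast<std::size_t>(xSize) * ySize);

    for (int by = yOff / img.blockH; by <= (yOff + ySize - 1) / img.blockH; ++by) {
        for (int bx = xOff / img.blockW; bx <= (xOff + xSize - 1) / img.blockW; ++bx) {
            // Start from the stored content so that pixels outside the
            // window survive when the block is only partially covered.
            std::vector<std::uint8_t> block = img.ReadBlock(bx, by);

            // Intersection of this block with the window, in image coordinates.
            const int x0 = std::max(xOff, bx * img.blockW);
            const int x1 = std::min(xOff + xSize, (bx + 1) * img.blockW);
            const int y0 = std::max(yOff, by * img.blockH);
            const int y1 = std::min(yOff + ySize, (by + 1) * img.blockH);

            // Copy every relevant piece of every relevant line into the block.
            for (int y = y0; y < y1; ++y) {
                const std::uint8_t* src = buf.data() +
                    static_cast<std::size_t>(y - yOff) * xSize + (x0 - xOff);
                std::uint8_t* dst = block.data() +
                    static_cast<std::size_t>(y - by * img.blockH) * img.blockW +
                    (x0 - bx * img.blockW);
                std::copy_n(src, x1 - x0, dst);
            }

            // The block is complete: flush it once and never revisit it.
            img.WriteBlock(bx, by, block);
        }
    }
}
```

With a 128x128 buffer and 64x64 blocks this touches each of the four
blocks exactly once, and only one block buffer is alive at any time.
The sketch always reads the block first for simplicity; a real version
could skip the read for blocks that the buffer covers completely.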

	In general, I would be very happy no matter what solution to the
problem is implemented.

Thanks a lot for your help

Rostic


 
-----Original Message-----
From: Frank Warmerdam [mailto:fwarmerdam at gmail.com] On Behalf Of Frank Warmerdam
Sent: Thursday, February 16, 2006 3:35 PM
To: Sheykhet, Rostic
Cc: gdal-dev at lists.maptools.org
Subject: Re: RasterIO tile size issue

Sheykhet, Rostic wrote:
> Hi Frank, All:
> 
> 	I have found an issue with GDALRasterIO() when dealing with very
> big (long) images and/or a very small tile cache.  The problem seems to
> occur when RasterIO() attempts to write an area of a certain width (say
> AW) to an image with a smaller tile width (say BW), and the tile cache
> is not large enough to keep ceil(AW/BW) blocks cached.
> 	Assuming that the tile cache is completely turned off, here is
> the scenario that I am observing.  Passed-in buffer size = 128x128,
> image tile size = 64x64.  RasterIO() starts writing the first line of
> the passed-in buffer, acquires the first 64x64 block B[0,0] of the
> image and writes the first 64 samples to it.  Then it needs to continue
> writing the remaining 64 samples of the same line and therefore
> acquires the next block B[1,0] (to the right) and flushes block B[0,0]
> to disk.  After writing the last 64 samples of this scanline, it moves
> on to writing the second scanline, at which point block B[0,0] is
> required again.  At this point, the code at line 286 of rasterio.cpp is
> executed:
>                 int bJustInitialize = 
>                     eRWFlag == GF_Write
>                     && nYOff <= nLBlockY * nBlockYSize
>                     && nYOff + nYSize >= (nLBlockY+1) * nBlockYSize
>                     && nXOff <= nLBlockX * nBlockXSize
>                     && nXOff + nXSize >= (nLBlockX+1) * nBlockXSize;
> 
> This code checks whether or not the block is completely contained
> within the area covered by the passed-in buffer.  If so, the code
> simply allocates a new tile without reading the data that has already
> been written to disk.
> 
> I think that a check should be added to determine whether this
> RasterIO() call has already written to this block; if so, the block
> should be read from disk.
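
The proposed check could look roughly like the sketch below.
BlockFullyCovered() restates the quoted bJustInitialize condition for a
write request, with the same parameter names. TouchedBlocks is a
hypothetical helper, not something that exists in rasterio.cpp: it
remembers which blocks the current call has already visited, so that
only the first visit to a fully covered block may skip the read.

```cpp
#include <cassert>
#include <set>
#include <utility>

// Same test as the quoted bJustInitialize expression (write case only):
// true when block (nLBlockX, nLBlockY) lies entirely inside the window.
bool BlockFullyCovered(int nXOff, int nYOff, int nXSize, int nYSize,
                       int nLBlockX, int nLBlockY,
                       int nBlockXSize, int nBlockYSize) {
    assert(nBlockXSize > 0 && nBlockYSize > 0);
    return nYOff <= nLBlockY * nBlockYSize
        && nYOff + nYSize >= (nLBlockY + 1) * nBlockYSize
        && nXOff <= nLBlockX * nBlockXSize
        && nXOff + nXSize >= (nLBlockX + 1) * nBlockXSize;
}

// Hypothetical per-call bookkeeping: one instance per RasterIO() call.
class TouchedBlocks {
public:
    // Returns true only if the block is fully covered AND this is the first
    // time the current call visits it. On later visits the block already
    // holds data written by this call, so it has to be read back instead.
    bool MayJustInitialize(bool fullyCovered, int blockX, int blockY) {
        const bool firstVisit = seen_.insert({blockX, blockY}).second;
        return fullyCovered && firstVisit;
    }

private:
    std::set<std::pair<int, int>> seen_;
};
```

In the 128x128 over 64x64 scenario, B[0,0] is fully covered, so the
first visit may initialize it; when the second scanline comes back to
B[0,0], the tracker reports that it must be read from disk.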

Rostic,

I see your point.

However, even with the fix you suggest, performance will still be just
terrible as blocks are swapped in and out for each scanline processed.
Instead, I think I will try to come up with an approach where I hold all
the blocks "locked" until all the scanlines being written to them are
complete.  This will resolve a pathological performance case and this
bug all at once.
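
As a rough illustration of the "locked" idea, and not the actual change
to rasterio.cpp, the sketch below keeps scanline order but pins every
block in the current row of blocks until the last scanline that touches
them has been written. TiledStore, Load(), Flush() and
WriteWindowPinned() are hypothetical names, the store is an in-memory
array, samples are single bytes, and the image size is assumed to be a
multiple of the block size.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

using Block = std::vector<std::uint8_t>;

// Hypothetical in-memory stand-in for a tiled file; not GDAL code.
struct TiledStore {
    int width, height, blockW, blockH;
    Block pixels;     // plays the role of the file on disk
    int loads = 0;    // block reads from the store
    int flushes = 0;  // block writes to the store

    TiledStore(int w, int h, int bw, int bh)
        : width(w), height(h), blockW(bw), blockH(bh),
          pixels(static_cast<std::size_t>(w) * h, 0) {
        assert(w % bw == 0 && h % bh == 0);  // no partial edge blocks
    }

    std::size_t Index(int x, int y) const {
        return static_cast<std::size_t>(y) * width + x;
    }

    Block Load(int bx, int by) {
        Block b(static_cast<std::size_t>(blockW) * blockH);
        for (int y = 0; y < blockH; ++y)
            std::copy_n(pixels.data() + Index(bx * blockW, by * blockH + y),
                        blockW, b.data() + static_cast<std::size_t>(y) * blockW);
        ++loads;
        return b;
    }

    void Flush(int bx, int by, const Block& b) {
        for (int y = 0; y < blockH; ++y)
            std::copy_n(b.data() + static_cast<std::size_t>(y) * blockW,
                        blockW, pixels.data() + Index(bx * blockW, by * blockH + y));
        ++flushes;
    }
};

// Scanline-order write that holds one row of blocks until it is finished.
void WriteWindowPinned(TiledStore& img, int xOff, int yOff, int xSize, int ySize,
                       const Block& buf) {
    assert(xOff >= 0 && yOff >= 0 && xSize > 0 && ySize > 0);
    assert(xOff + xSize <= img.width && yOff + ySize <= img.height);
    assert(buf.size() == static_cast<std::size_t>(xSize) * ySize);

    const int bx0 = xOff / img.blockW;
    const int bx1 = (xOff + xSize - 1) / img.blockW;
    for (int by = yOff / img.blockH; by <= (yOff + ySize - 1) / img.blockH; ++by) {
        // "Lock" every block in this row of blocks for all of its scanlines.
        std::vector<Block> pinned;
        for (int bx = bx0; bx <= bx1; ++bx) pinned.push_back(img.Load(bx, by));

        const int y0 = std::max(yOff, by * img.blockH);
        const int y1 = std::min(yOff + ySize, (by + 1) * img.blockH);
        for (int y = y0; y < y1; ++y) {  // still one scanline at a time
            for (int x = xOff; x < xOff + xSize; ++x) {
                Block& b = pinned[x / img.blockW - bx0];
                b[static_cast<std::size_t>(y % img.blockH) * img.blockW + x % img.blockW] =
                    buf[static_cast<std::size_t>(y - yOff) * xSize + (x - xOff)];
            }
        }

        // Every scanline touching these blocks is done: release them.
        for (int bx = bx0; bx <= bx1; ++bx) img.Flush(bx, by, pinned[bx - bx0]);
    }
}
```

Each block is loaded and flushed once per call regardless of how many
scanlines cross it. The cost is that a full row of blocks under the
write window has to be held in memory at the same time.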

Does that seem reasonable to you?

Best regards,
-- 
---------------------------------------+--------------------------------------
I set the clouds in motion - turn up   | Frank Warmerdam, warmerdam at pobox.com
light and sound - activate the windows | http://pobox.com/~warmerdam
and watch the world go round - Rush    | Geospatial Programmer for Rent




