[gdal-dev] RE: progressive rendering

Fri Aug 29 09:48:08 EDT 2008

I'll talk about my original proposal 
(http://lists.osgeo.org/pipermail/gdal-dev/2008-August/018088.html) 
instead of the one on trac 
(http://trac.osgeo.org/gdal/wiki/rfc24_progressive_data_support).

Tamas Szekeres wrote:
> Honestly I didn't follow the observations that have turned the things
> into a different approach, however I agree that doing callback on
> different threads might not be the most reliable option in some cases.

> By reviewing the current proposal I'm a bit uncertain about how the
> objective described there is related to to the title of the document.
> It seems like we are tending to swich to a sequential streaming
> approach instead of providing an interface to the possible progressive
> rendering modes. As far as I can see we would like to notify the user
> about the section that have been (or should be) updated in the buffer,
> then the user is responsible to put the image together from the chunks
> in order to visually render it on the screen or do some other
> interesting job with that.

While this seems to be the case for Norman's proposal, it is not in mine. 
An image buffer is passed to the AsyncRasterIO call, just as it would 
be to a normal RasterIO call. This buffer is then updated asynchronously 
as more data is received. An AsyncRasterIO call followed by a 
loop waiting for the GARM_COMPLETE message would be equivalent to a normal 
RasterIO call.
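
To make that equivalence concrete, here is a rough sketch. Nothing in it
is real GDAL code: MockAsyncRequest, NextMessage() and GARM_UPDATE are
names I made up for illustration (only GARM_COMPLETE comes from the
proposal), and the "driver" just fills one scanline per message.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Only GARM_COMPLETE is named in the proposal; GARM_UPDATE is an assumed name.
enum GARMType { GARM_UPDATE, GARM_COMPLETE };

// Hypothetical stand-in for a driver-side request: every NextMessage() call
// "receives" one more scanline and writes it into the caller's buffer.
class MockAsyncRequest
{
public:
    MockAsyncRequest(std::vector<int> &buffer, std::size_t width, std::size_t height)
        : buffer_(buffer), width_(width), height_(height), nextRow_(0)
    {
        assert(buffer_.size() == width_ * height_);
    }

    GARMType NextMessage()
    {
        if (nextRow_ < height_)
        {
            for (std::size_t x = 0; x < width_; ++x)
                buffer_[nextRow_ * width_ + x] = static_cast<int>(nextRow_);
            ++nextRow_;
        }
        return nextRow_ == height_ ? GARM_COMPLETE : GARM_UPDATE;
    }

private:
    std::vector<int> &buffer_;
    std::size_t width_, height_, nextRow_;
};

// A blocking read built from the asynchronous pieces: start the request,
// then drain messages until completion is reported.
// Returns the number of messages consumed.
int BlockingRead(std::vector<int> &buffer, std::size_t width, std::size_t height)
{
    MockAsyncRequest request(buffer, width, height);
    int messages = 1;
    while (request.NextMessage() != GARM_COMPLETE)
        ++messages;
    return messages;
}
```

When BlockingRead() returns, the buffer is fully populated, which is all
a caller of a normal RasterIO would observe.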

> Assuming that we would indeed like to support the progressive
> rendering, in addition to the top-down image streaming mode we might
> also consider that the data may be available in such order that may
> not be described easily in the interface. For example i can imageine a
> one-dimensional interlacing scheme where the every 8th, 4th, 2th.. row
> arrives in the subsequent iterations. Or in a 2D interlacing mode the
> resolution of the image may be enhanced during the iterations. How can
> we describe the modified sections in these more esoteric cases? From
> the user's perspective I would rather like to see a fully prepared
> intermediary image to be created by the driver in every stage of the
> process. User would provide the same buffer to the driver but the
> diver would be responsible to modify the data incrementally according
> to the incoming segments. In some cases the whole buffer may be
> modified in every iteration.

This is exactly what I have in mind. If we receive every 8th row, the 
driver could update a whole 8-row block by copying the row 8 times.
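
A minimal sketch of that row replication, assuming a single-band 8-bit
buffer stored row by row. ReplicateRow and its parameters are
hypothetical, not part of GDAL or of either proposal.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: write one received scanline into the user buffer and
// replicate it over the `step` rows it currently stands in for.
void ReplicateRow(std::vector<std::uint8_t> &buffer, std::size_t width,
                  std::size_t height, std::size_t row, std::size_t step,
                  const std::vector<std::uint8_t> &scanline)
{
    assert(buffer.size() == width * height);
    assert(scanline.size() == width);
    assert(row < height);

    // Clamp so the last block does not run past the bottom of the image.
    const std::size_t last = std::min(row + step, height);
    for (std::size_t y = row; y < last; ++y)
        std::copy(scanline.begin(), scanline.end(), buffer.data() + y * width);
}
```

On the next pass (every 4th row) the same helper would be called with
step = 4, overwriting half of each block with real data.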

> The driver should also handle the required downsampling of the rasters
> according to the requested resolution of the image, how this have been
> addressed with the current proposal? How the driver would prepare a
> 1000x1000 image into a buffer having 300*300 pixels. Would the driver
> store the whole image in an in-memory dataset and rely on the current
> RasterIO to copy the data to the user for example?

This is really internal to the driver. If the format provides reduced 
resolution sets, they should be used instead of the highest resolution image.

> It seems we would like to serialize the incoming data into a message
> queue by using the same chunk structure as it have been received by
> the driver, however from the user's perspective it's not too
> interesting to receive the data in the same fragments as it have been
> received. For example the user might want to set up a timer and render
> the actual snapshot of the image in every 100ms. Therefore the driver
> would be responsible to put the data together by collecting every
> segments that have been arrived in the meantime. From this aspect
> there's no need to collect every segment in a message queue, the
> driver may also use an internal buffer to collect the data between the
> stages and present the whole data together to the user.

In my proposal the user could:
1) LockBuffer()
2) display the current snapshot
3) UnlockBuffer()
4) sleep for 100ms, ignoring the update messages (but 
NextAsyncRasterIOMessage() still has to be called to get the 
GARM_COMPLETE message)
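
The steps above could look roughly like this. MockRequest, Buffer(),
RenderUntilComplete and GARM_UPDATE are invented for the sketch; only
LockBuffer(), UnlockBuffer(), NextAsyncRasterIOMessage() and
GARM_COMPLETE are names from the proposal, and the mock hands out exactly
one message per call.

```cpp
#include <cassert>
#include <chrono>
#include <mutex>
#include <thread>
#include <vector>

// GARM_UPDATE is an assumed name for the "update" message.
enum GARMType { GARM_UPDATE, GARM_COMPLETE };

// Hypothetical stand-in for the proposed request object.
class MockRequest
{
public:
    explicit MockRequest(int updates) : remaining_(updates), pixels_(4, 0)
    {
        assert(updates > 0);
    }

    void LockBuffer() { mutex_.lock(); }
    void UnlockBuffer() { mutex_.unlock(); }
    const std::vector<int> &Buffer() const { return pixels_; }

    // Pretend one more chunk arrived: modify the pixels and report progress.
    GARMType NextAsyncRasterIOMessage()
    {
        std::lock_guard<std::mutex> guard(mutex_);
        for (int &p : pixels_)
            ++p;
        return --remaining_ > 0 ? GARM_UPDATE : GARM_COMPLETE;
    }

private:
    int remaining_;
    std::vector<int> pixels_;
    std::mutex mutex_;
};

// Steps 1-4 in a loop: show a snapshot under the lock, sleep, then poll.
// One last snapshot is shown after completion. Returns the number of frames.
template <typename Display>
int RenderUntilComplete(MockRequest &request, Display display,
                        std::chrono::milliseconds interval)
{
    int frames = 0;
    bool done = false;
    for (;;)
    {
        request.LockBuffer();
        display(request.Buffer());
        request.UnlockBuffer();
        ++frames;
        if (done)
            break;
        std::this_thread::sleep_for(interval);
        // Update messages are ignored; only completion matters here.
        done = request.NextAsyncRasterIOMessage() == GARM_COMPLETE;
    }
    return frames;
}
```

A real loop would probably drain every queued message after each sleep
instead of reading just one.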

> We have switched the previous pattern to another because we afraid of
> the negative impacts of the multiple threads. But do we really need to
> use multiple threads at all? Does the driver need to read the incoming
> buffers of the socket as soon as the data have been arrived and
> provide some 'real-time' action afterwards? I assume the socket
> library will safely buffer the incoming data and the transmission
> control protocol will be able to pause the transfer if this buffer
> will eventually become full temporarily. Therefore I guess it would be
> sufficient for the driver to do all the action (like reading and
> preprocessing the TCP buffers) only inside the RasterIO related
> functions called by the client.

My proposal also supports single-threaded implementations. AsyncRasterIO 
would initialize the request, while subsequent NextAsyncRasterIOMessage() 
calls would communicate with the server and update the image buffer. 
NextAsyncRasterIOMessage() would either return as soon as possible, 
once all the data received since the last call has been processed, or block 
waiting for more data to arrive, with an optional timeout (a slight 
change to my proposed interface).

// timeout < 0 : wait indefinitely for a new message
// timeout = 0 : no wait, return as soon as possible with or without new
//               messages
// timeout > 0 : wait at most timeout milliseconds
GDALAsyncRasterIOMessage *NextAsyncRasterIOMessage(int timeout);
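
For a single-threaded driver reading from a socket, this timeout
convention happens to match POSIX poll(), so the wait could be
sketched as below. This assumes a POSIX system and a plain file
descriptor; WaitForData and ReadAvailable are hypothetical helpers, not
GDAL functions, and error handling (EINTR and so on) is left out.

```cpp
#include <cassert>
#include <cstddef>
#include <poll.h>
#include <unistd.h>

// Hypothetical helper: wait for readable data on fd.
//   timeoutMs < 0 : wait indefinitely
//   timeoutMs = 0 : do not wait
//   timeoutMs > 0 : wait at most timeoutMs milliseconds
// Returns true when data can be read without blocking; errors count as false.
bool WaitForData(int fd, int timeoutMs)
{
    assert(fd >= 0);
    struct pollfd pfd;
    pfd.fd = fd;
    pfd.events = POLLIN;
    pfd.revents = 0;
    // poll() treats any negative timeout as "wait forever".
    const int ready = poll(&pfd, 1, timeoutMs < 0 ? -1 : timeoutMs);
    return ready > 0 && (pfd.revents & POLLIN) != 0;
}

// Hypothetical helper: read whatever arrived within the timeout.
// Returns the number of bytes copied into out, or 0 if nothing arrived.
long ReadAvailable(int fd, int timeoutMs, char *out, std::size_t capacity)
{
    if (!WaitForData(fd, timeoutMs))
        return 0;
    const ssize_t n = read(fd, out, capacity);
    return n > 0 ? static_cast<long>(n) : 0;
}
```

A NextAsyncRasterIOMessage(timeout) implementation could call something
like ReadAvailable(), decode the bytes into the image buffer, and then
return an update or GARM_COMPLETE message.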

> In the current approach we are using a fair amount of new functions at
> the interface level and rename the existing RasterIO to
> ProgressiveRasterIO which is quite annoying. At the moment I don't see
> the real benefit of using those new functions instead of using the
> existing one in this special way.

The RasterIO function is not renamed. A new function (AsyncRasterIO) is 
added, with different behavior from the current RasterIO function. The 
RasterIO function doesn't change, so full backwards compatibility is kept.

> Related to the statements above - in my opinion - only the behaviour
> of the existing RasterIO should be altered, which would provide an
> alternative option (in a Win32 overlapped IO fashion), according to
> the following example:
> 
> 1. During the dataset creation the asynchronous behaviour of the
> driver for this dataset could be specified as a new dataset creation
> option (like RASTERIO_MODE=ASYNC)

With my proposal you can mix normal blocking RasterIO calls with 
AsyncRasterIO calls.

> 2. The user would use the existing RasterIO method to initiate the
> operation as well as to fetch the next available data. The RasterIO
> would return immediately if the buffer contains a proper set of the
> intermediary data. In this case RasterIO would return a special error
> code (IO_PENDING) to denote that more data will be available and the
> user will have to initiate a subsequent call to RasterIO. (If we don't
> like to introduce new error codes we could also use a separate
> function like GetAsyncResult for this purpose).
>
> 3. The RasterIO  (or GetAsyncResult) would return no error if the
> operation have been finished and no more new update will be available
> in the buffer.
> 
> 4. We could introduce a separate function like CancelIO to allow the
> user to cancel the pending IO operation at any time.

While I like the simplicity, there are a few 'problems'. The first is 
identifying subsequent calls to RasterIO that belong to the same 
operation. Would the key be the buffer pointer, together with the offset 
and size maybe? That is possibly a lot of variables that have to be 
remembered on both the driver and the user side. Would the driver search 
a list of all open async raster IO operations? Would the buffer be 
updated with the data received since the last RasterIO call, or with a 
current snapshot? Each open async raster IO operation would also require 
its own data buffer, later copied in whole (or only the updated regions) 
into the user buffer.
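
To illustrate the bookkeeping I mean, here is a rough sketch of what a
driver might have to keep if repeated RasterIO calls were matched by
their arguments. RequestKey, MakeKey and PendingRegistry are all
hypothetical, and a real key would probably need more fields (band list,
data type, buffer size).

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <tuple>

// Hypothetical key: the arguments that would have to match between calls.
struct RequestKey
{
    std::uintptr_t buffer; // address of the user buffer
    int xOff, yOff, xSize, ySize;

    bool operator<(const RequestKey &other) const
    {
        return std::tie(buffer, xOff, yOff, xSize, ySize) <
               std::tie(other.buffer, other.xOff, other.yOff, other.xSize,
                        other.ySize);
    }
};

RequestKey MakeKey(const void *buffer, int xOff, int yOff, int xSize, int ySize)
{
    assert(xSize > 0 && ySize > 0);
    return RequestKey{reinterpret_cast<std::uintptr_t>(buffer), xOff, yOff,
                      xSize, ySize};
}

// Hypothetical registry of open operations, searched on every call.
class PendingRegistry
{
public:
    // Returns how many calls were seen for this key; 1 means a new operation.
    int Touch(const RequestKey &key) { return ++calls_[key]; }
    void Finish(const RequestKey &key) { calls_.erase(key); }
    std::size_t Open() const { return calls_.size(); }

private:
    std::map<RequestKey, int> calls_;
};
```

The user has to repeat exactly the same arguments on every call for the
lookup to succeed, which is the part I find fragile.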