[gdal-dev] RE: progressive rendering

Fri Aug 29 08:11:54 EDT 2008

Norman,

Honestly, I didn't follow the observations that led to the change of
approach; however, I agree that invoking callbacks on different threads
might not be the most reliable option in some cases.

Reviewing the current proposal, I'm a bit uncertain how the objective
described there relates to the title of the document. It seems we are
tending to switch to a sequential streaming approach instead of
providing an interface to the possible progressive rendering modes. As
far as I can see, we would like to notify the user about the sections
that have been (or should be) updated in the buffer; the user is then
responsible for putting the image together from the chunks in order to
render it on the screen or do some other interesting job with it.
Assuming that we would indeed like to support progressive rendering,
in addition to the top-down image streaming mode we should also
consider that the data may arrive in an order that cannot be described
easily in the interface. For example, I can imagine a one-dimensional
interlacing scheme where every 8th, 4th, 2nd, ... row arrives in
subsequent iterations. Or, in a 2D interlacing mode, the resolution of
the image may be enhanced during the iterations. How can we describe
the modified sections in these more esoteric cases? From the user's
perspective, I would rather see a fully prepared intermediary image
created by the driver at every stage of the process. The user would
provide the same buffer to the driver, but the driver would be
responsible for modifying the data incrementally according to the
incoming segments. In some cases the whole buffer may be modified in
every iteration.
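
To make the interlacing case concrete, here is a rough Python sketch
(not GDAL code; interlaced_passes and apply_pass are names made up for
this illustration, and the halving row order is only an assumption
about how such a stream might behave). Rows arrive in passes, and the
"driver" keeps the user's buffer fully populated after each pass by
repeating the nearest received row, so most of the buffer changes in
every iteration:

```python
# Hypothetical sketch: rows arrive in passes (every 8th row, then every
# 4th, every 2nd, and finally the rest). Nothing here is a GDAL API.

def interlaced_passes(height, first_step=8):
    """Yield the list of row indices delivered in each pass."""
    seen = set()
    step = first_step
    while step >= 1:
        rows = [r for r in range(0, height, step) if r not in seen]
        seen.update(rows)
        yield rows
        step //= 2


def apply_pass(buffer, source, received, rows):
    """Copy the newly arrived rows, then fill every missing row from the
    nearest received row above it, so the buffer is always displayable."""
    for r in rows:
        buffer[r] = list(source[r])
        received.add(r)
    last = None
    for r in range(len(buffer)):
        if r in received:
            last = r
        elif last is not None:
            buffer[r] = list(buffer[last])


height, width = 16, 4
source = [[r] * width for r in range(height)]   # each row holds its own index
buffer = [[0] * width for _ in range(height)]   # user-supplied buffer
received = set()

passes = list(interlaced_passes(height))
for rows in passes:
    apply_pass(buffer, source, received, rows)
    # a client could render `buffer` here after every pass
```

With this kind of scheme the user never needs to know which rows were
touched; the buffer is simply a coarser or finer version of the final
image at each stage.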

The driver should also handle the required downsampling of the rasters
according to the requested resolution of the image. How has this been
addressed in the current proposal? How would the driver fit a
1000x1000 image into a buffer of 300x300 pixels? Would the driver, for
example, store the whole image in an in-memory dataset and rely on the
current RasterIO to copy the data to the user?
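
As an illustration of the 1000x1000 to 300x300 case, the following
Python sketch shows a plain nearest-neighbour mapping from buffer
pixels back to source pixels. The function name downsample_nearest is
hypothetical, and this is only one possible resampling choice, not a
description of what any driver actually does:

```python
def downsample_nearest(src, src_w, src_h, dst_w, dst_h):
    """Nearest-neighbour resampling of a row-major raster (list of rows)."""
    dst = []
    for y in range(dst_h):
        # Map the destination row back to a source row.
        sy = min(src_h - 1, (y * src_h) // dst_h)
        row = src[sy]
        dst.append(
            [row[min(src_w - 1, (x * src_w) // dst_w)] for x in range(dst_w)]
        )
    return dst


src_w = src_h = 1000
# Synthetic test image; a real driver would hold decoded pixel data here.
src = [[(x + y) % 256 for x in range(src_w)] for y in range(src_h)]
dst = downsample_nearest(src, src_w, src_h, 300, 300)
```

Whichever component does this step, it has to be repeated (at least
for the affected area) each time new segments arrive, which is another
reason to let the driver own the intermediate image.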

It seems we would like to serialize the incoming data into a message
queue using the same chunk structure in which it was received by the
driver; however, from the user's perspective it's not very interesting
to receive the data in the same fragments in which it arrived. For
example, the user might want to set up a timer and render the current
snapshot of the image every 100 ms. The driver would therefore be
responsible for putting the data together by collecting every segment
that has arrived in the meantime. From this aspect there's no need to
collect every segment in a message queue; the driver may instead use
an internal buffer to collect the data between the stages and present
the whole data together to the user.
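
A minimal sketch of that idea in Python, assuming the driver only
needs to remember the bounding box of everything that changed since
the user last asked (the DirtyRegion class and its methods are
invented for this example and are not part of GDAL):

```python
class DirtyRegion:
    """Accumulates the bounding box of all windows updated since the
    last snapshot. Purely illustrative; not part of GDAL."""

    def __init__(self):
        self.box = None  # (xoff, yoff, xsize, ysize) or None

    def add(self, xoff, yoff, xsize, ysize):
        """Merge one received window into the accumulated box."""
        if self.box is None:
            self.box = (xoff, yoff, xsize, ysize)
            return
        bx, by, bw, bh = self.box
        x0, y0 = min(bx, xoff), min(by, yoff)
        x1 = max(bx + bw, xoff + xsize)
        y1 = max(by + bh, yoff + ysize)
        self.box = (x0, y0, x1 - x0, y1 - y0)

    def take(self):
        """Return the accumulated box and reset it."""
        box, self.box = self.box, None
        return box


region = DirtyRegion()
region.add(0, 0, 256, 64)       # first segment received
region.add(128, 64, 256, 64)    # second segment received
snapshot = region.take()        # what a 100 ms timer tick would see
```

The timer-driven client then gets one combined region per tick,
regardless of how many fragments arrived in between.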

We have switched from the previous pattern to another because we are
afraid of the negative impact of multiple threads. But do we really
need multiple threads at all? Does the driver need to read the
incoming socket buffers as soon as the data has arrived and take some
'real-time' action afterwards? I assume the socket library will safely
buffer the incoming data, and TCP flow control will pause the transfer
if this buffer temporarily becomes full. Therefore I guess it would be
sufficient for the driver to do all the work (like reading and
preprocessing the TCP buffers) only inside the RasterIO related
functions called by the client.
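
The single-threaded variant could look roughly like the Python sketch
below. It uses socket.socketpair() as a local stand-in for the
connection to the server (an assumption made only to keep the example
self-contained), and drain() is a hypothetical helper that the driver
would call from inside its RasterIO implementation:

```python
import socket

# A connected local socket pair stands in for the server connection.
server, client = socket.socketpair()
client.setblocking(False)

# The "server" sends two chunks while the client is busy elsewhere.
# Nothing reads them yet; the OS keeps them in the socket buffer.
server.sendall(b"chunk-1")
server.sendall(b"chunk-2")


def drain(sock, bufsize=4096):
    """Read whatever has been buffered so far without blocking."""
    data = bytearray()
    while True:
        try:
            part = sock.recv(bufsize)
        except BlockingIOError:
            break          # nothing more buffered right now
        if not part:
            break          # peer closed the connection
        data.extend(part)
    return bytes(data)


# Called later, from the client's own thread, e.g. inside RasterIO.
pending = drain(client)

server.close()
client.close()
```

No worker thread is involved: the data simply waits in the socket
buffer until the client's next call picks it up.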

The current approach adds a fair number of new functions at the
interface level and renames the existing RasterIO to
ProgressiveRasterIO, which is quite annoying. At the moment I don't
see the real benefit of using those new functions instead of using the
existing one in this special way.
Related to the statements above, in my opinion only the behaviour of
the existing RasterIO should be altered, providing an alternative mode
(in a Win32 overlapped IO fashion), according to the following
example:

1. During the dataset creation the asynchronous behaviour of the
driver for this dataset could be specified as a new dataset creation
option (like RASTERIO_MODE=ASYNC)

2. The user would use the existing RasterIO method to initiate the
operation as well as to fetch the next available data. RasterIO would
return immediately if the buffer contains a consistent set of
intermediary data. In this case RasterIO would return a special error
code (IO_PENDING) to denote that more data will be available and that
the user will have to make a subsequent call to RasterIO. (If we don't
want to introduce new error codes, we could also use a separate
function like GetAsyncResult for this purpose.)

3. RasterIO (or GetAsyncResult) would return no error once the
operation has finished and no further updates to the buffer are
expected.

4. We could introduce a separate function like CancelIO to allow the
user to cancel the pending IO operation at any time.
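
From the caller's side, steps 1 to 4 might be used as in the Python
sketch below. Everything in it is hypothetical: AsyncDataset,
raster_io, cancel_io and the IOStatus values only mimic the behaviour
proposed above and do not exist in GDAL; the "stages" list simulates
the intermediate images a real driver would build from incoming data.

```python
from enum import Enum


class IOStatus(Enum):
    NONE = 0        # operation finished, buffer is final
    IO_PENDING = 1  # buffer holds an intermediate image, call again


class AsyncDataset:
    """Stand-in for a dataset created with RASTERIO_MODE=ASYNC."""

    def __init__(self, stages):
        self._stages = list(stages)   # simulated intermediate images
        self._cancelled = False

    def raster_io(self, buffer):
        """Update the caller's buffer in place and report the status."""
        if self._cancelled or not self._stages:
            return IOStatus.NONE
        buffer[:] = self._stages.pop(0)
        return IOStatus.IO_PENDING if self._stages else IOStatus.NONE

    def cancel_io(self):
        """Abandon the pending operation; later calls change nothing."""
        self._cancelled = True
        self._stages.clear()


ds = AsyncDataset(stages=[[1, 0, 0, 0], [1, 1, 0, 0], [1, 1, 1, 1]])
buf = [0, 0, 0, 0]       # the same buffer is passed on every call
renders = []
while True:
    status = ds.raster_io(buf)
    renders.append(list(buf))        # e.g. draw the current snapshot
    if status is IOStatus.NONE:
        break
```

The loop is the same one a client would write today around RasterIO;
only the return code tells it whether another call is worthwhile.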

I admit I might have missed something about the aims and the
functionality that the proposal provides, but I would prefer a simple
and consistent interface over a brand new API that the user has to
handle in a different way. Moreover, it would also be beneficial if
not only one function could be 'asynchronized', but the framework
offered the option to bring more API functions like GetHistogram or
ReprojectImage into this scope in the future.

Best regards,

Tamas

2008/8/28 Norman Barker <nbarker at ittvis.com>:
> Hi Adam, Tamas, Even, all
>
> I have updated the RFC
>
> http://trac.osgeo.org/gdal/wiki/rfc24_progressive_data_support
>
> And completely changed the pattern used to reflect the general consensus
> to use an asynchronous queue for communication between threads.
>
> Can you comment on this, and let me know if it is acceptable?
>
> Can we iterate on this a few times, and then how is this RFC approved
> (or rejected!)?
>
> Many thanks,
>
> Norman
>
> -----Original Message-----
> From: Adam Nowacki [mailto:nowak at xpam.de]
> Sent: Thursday, August 28, 2008 10:50 AM
> To: Even Rouault
> Cc: Norman Barker; gdal-dev at lists.osgeo.org
> Subject: Re: [gdal-dev] RE: progressive rendering
>
> Even Rouault wrote:
>> I don't know JPIP, but I can imagine that the driver would start a
>> thread when AsyncRasterIO() is called. It communicates with the
>> server and receives the updates with a polling loop. When it has
>> received an update, it puts the received data as well as the
>> parameters describing the window, etc... in a structure (let's call
>> it a ticket), pushes that ticket in a stack and goes on pushing
>> tickets, or waits for the ticket to be consumed by the reader (both
>> are possible, even if you can't push continuously new tickets as
>> memory will increase, so the working thread would have to go in idle
>> mode until the queue decreases a bit)
>>
>> The NextAsyncRasterIOMessage() call will check that some message is
>> available and unstack the first ticket. In fact, the LockBuffer() /
>> UnlockBuffer() could probably be avoided at the API level. Of course
>> the implementation of
>> NextAsyncRasterIOMessage() needs an internal mutex to protect the
>> accesses to the queue.
>
> My idea was to update the data buffer given to AsyncRasterIO immediately
> after receiving data and write only window coordinates into the queued
> messages. That way the queue will remain small, a few KB's at most. This
> is also why LockBuffer() / UnlockBuffer() is there, to protect the
> buffer from async updates while we read from it. LockBuffer(xoff, yoff,
> xsize, ysize) allows almost no wait operation if used with coords from
> queue.