[Gdal-dev] Implementing GDALRasterBand::IRasterIO

Wed Jan 14 14:01:50 EST 2004

James Gallagher wrote:
> Hi,
> 
> I would like to add sub-sampling capabilities to the OPeNDAP/GDAL
> driver. To do this I plan on specializing GDALRasterBand::IRasterIO(). 
> 
> First question: is that method the correct place to implement
> sub-sampling?

James,

It depends a bit on what you mean by sub-sampling. If you want a client
application to be able to request a sub-area at full resolution, with
the driver fetching just that sub-area, then overriding
IRasterIO() would be one approach.  Another would be to pretend that
the data is tiled, and just implement the IReadBlock() method.

If you want to offer an efficient way to access reduced-resolution
images, then you could produce pseudo-overview layers.

However, given that for the OPeNDAP driver you want a lot of control
over how many individual requests actually go to the remote server,
I would say overriding IRasterIO() makes sense.
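
To illustrate the reduced-resolution case: a RasterIO() request carries
both a source window and a buffer size, and when the buffer is smaller
than the window the driver has to decide which source pixels to fetch.
The following is only a sketch of one way to do that mapping along one
axis (nearest neighbour, sampled at the centre of each buffer pixel).
SourceIndices() is a hypothetical helper, not a GDAL function, and the
sampling rule is an assumption rather than what any particular driver
does.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper (not part of GDAL): for a request that reads
// `srcSize` source pixels starting at `srcOff` into `bufSize` buffer
// pixels, return the source index sampled for each buffer pixel.
std::vector<int> SourceIndices(int srcOff, int srcSize, int bufSize)
{
    assert(srcSize > 0 && bufSize > 0);
    std::vector<int> indices;
    indices.reserve(static_cast<std::size_t>(bufSize));
    for (int i = 0; i < bufSize; ++i) {
        // Integer form of floor((i + 0.5) * srcSize / bufSize).
        const long long rel = (2LL * i + 1) * srcSize / (2LL * bufSize);
        indices.push_back(srcOff + static_cast<int>(rel));
    }
    return indices;
}
```

When the buffer size equals the window size this selects every source
pixel; when the indices come out evenly spaced, the same numbers could
be turned into an offset and stride for a single remote request.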

Be aware that:
  o applications normally call RasterIO(), so you will now very rarely
    get block accesses - however, it can still happen.  For instance, I
    think the min/max computation goes through the block API.

  o It is very common for applications to make many small RasterIO()
    calls, often for one scanline at a time.  I often try to recognise
    the scanline request case, and pre-read a bunch of scanlines at once
    and cache them.

> Second question: I noticed that the OGDI driver implemented only the
> GDALRasterBand::IRasterIO method while the ECW driver implemented its
> own version of both that and GDALDataset::IRasterIO. What are the
> implications of specializing the second method?

You would implement a custom GDALDataset::IRasterIO() if you want a more
efficient access to multi-band datasets as a single request.  For instance,
ECW implements GDALDataset::IRasterIO() because it is much cheaper to pass
one request on to the ECW API for all the bands of an image at once,
instead of requesting them one at a time.

By the way, one of the applications that we want the GDAL/OPeNDAP driver to
be good with is MapServer.  MapServer currently always makes one big
RasterIO() request for whatever it needs for each band being read.  Eventually
I hope to change this to utilize the GDALDataset::RasterIO() entry point for
greater efficiency where that is specialized.  So, overriding
GDALRasterBand::IRasterIO() or GDALDataset::IRasterIO() will give a big win
for MapServer.

In fact, you might consider just overriding GDALDataset::IRasterIO(), and
doing an implementation of GDALRasterBand::IRasterIO() that calls the dataset
level one with a single band requested.  That would set you up optimally for
future improvements in MapServer.
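
The delegation pattern might look roughly like the following.  This is
a stand-alone sketch: SketchDataset, SketchBand, Window and ReadWindow()
are hypothetical stand-ins, not the GDAL classes, and the real
IRasterIO() methods take more parameters (buffer size, data type,
pixel and line spacing) than shown here.  The pixel values are
synthetic so that the example runs without a server.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-ins for illustration; these are NOT the GDAL classes.
struct Window { int xOff, yOff, xSize, ySize; };

class SketchDataset {
public:
    explicit SketchDataset(int bandCount) : bandCount_(bandCount) {}

    // Dataset-level read: one (simulated) remote request for all listed
    // bands.  Output is band sequential: all of bands[0], then bands[1], ...
    std::vector<float> ReadWindow(const Window& w, const std::vector<int>& bands)
    {
        ++requestCount_;
        std::vector<float> out;
        out.reserve(bands.size() * static_cast<std::size_t>(w.xSize) * w.ySize);
        for (int band : bands) {
            assert(band >= 1 && band <= bandCount_);  // 1-based band numbers
            for (int y = w.yOff; y < w.yOff + w.ySize; ++y)
                for (int x = w.xOff; x < w.xOff + w.xSize; ++x)
                    out.push_back(FakePixel(band, x, y));
        }
        return out;
    }

    int RequestCount() const { return requestCount_; }

private:
    // Synthetic pixel value so the sketch runs without a remote server.
    static float FakePixel(int band, int x, int y)
    {
        return static_cast<float>(band * 10000 + y * 100 + x);
    }

    int bandCount_;
    int requestCount_ = 0;
};

class SketchBand {
public:
    SketchBand(SketchDataset& ds, int band) : ds_(ds), band_(band) {}

    // Band-level read: just the dataset-level read with a single band.
    std::vector<float> ReadWindow(const Window& w)
    {
        return ds_.ReadWindow(w, {band_});
    }

private:
    SketchDataset& ds_;
    int band_;
};
```

All of the request logic then lives in one place, and a caller that
asks for several bands at once costs one request instead of one per
band.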

Applications like OpenEV are going to make lots of "tile by tile" RasterIO()
requests to GDAL.  Each of these would presumably turn into a remote request,
which is sensible, but will add a real latency drag to OpenEV renders.

Batch applications are usually scanline based, via RasterIO(), and as mentioned
before it would be a wise idea to recognise this case, and force some sort of
chunking.

One final note: when you implement IRasterIO() you avoid data going through
the GDAL cache.  That is good if you are effectively caching things yourself
somehow, but can completely hammer you in some situations if you are not.

Best regards,

-- 
---------------------------------------+--------------------------------------
I set the clouds in motion - turn up   | Frank Warmerdam, warmerdam at pobox.com
light and sound - activate the windows | http://pobox.com/~warmerdam
and watch the world go round - Rush    | Geospatial Programmer for Rent