[Gdal-dev] Implementing GDALRasterBand::IRasterIO
James Gallagher
jgallagher at gso.uri.edu
Wed Jan 14 16:53:52 EST 2004
On Wed, 2004-01-14 at 12:01, Frank Warmerdam wrote:
> James Gallagher wrote:
> > Hi,
> >
> > I would like to add sub-sampling capabilities to the OPeNDAP/GDAL
> > driver. To do this I plan on specializing GDALRasterBand::IRasterIO().
> >
> > First question: is that method the correct place to implement
> > sub-sampling?
>
> James,
>
> It depends a bit what you mean by sub-sampling. If you want a client
> application to be able to request a sub-area at full resolution and
> the driver would be able to fetch just the subarea
That's part of what I mean (the most important part). I can read reduced
resolution rasters over the net, but if a caller asks for the same
raster at higher resolution I'll have to read the whole thing unless I
write more code.
> then overriding
> IRasterIO() would be one approach. Another would be to pretend that
> the data is tiled, and just implement the IReadBlock() method.
Right now my implementation of IReadBlock() can only read the whole
raster. That is, in Open() I set the block size to the raster size.
There might be a better size like 1024^2 blocks, but it's so dependent
on the connection's bandwidth that you'd really want to calculate on a
per connection basis. I think that's a bit much for the first version...
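
To make the block-size trade-off concrete, here is a minimal sketch of how a pixel window maps onto blocks. BlockRange and BlocksForWindow are hypothetical names used only for illustration; they are not part of GDAL. With the block size set to the raster size every window touches exactly one (whole-raster) block, while 1024 x 1024 blocks let a small window touch only the tiles it overlaps.

```cpp
#include <cassert>

// Hypothetical helper type (not a GDAL type): the inclusive range of
// block indices touched by a pixel window.
struct BlockRange {
    int firstX, firstY;  // first block column / row touched
    int lastX, lastY;    // last block column / row touched (inclusive)
    int count() const { return (lastX - firstX + 1) * (lastY - firstY + 1); }
};

// Return the blocks that the window (xOff, yOff, xSize, ySize) touches
// when the raster is divided into blockXSize x blockYSize tiles.
inline BlockRange BlocksForWindow(int xOff, int yOff, int xSize, int ySize,
                                  int blockXSize, int blockYSize)
{
    assert(xOff >= 0 && yOff >= 0);
    assert(xSize > 0 && ySize > 0 && blockXSize > 0 && blockYSize > 0);
    BlockRange r;
    r.firstX = xOff / blockXSize;
    r.firstY = yOff / blockYSize;
    r.lastX = (xOff + xSize - 1) / blockXSize;  // block holding the last column
    r.lastY = (yOff + ySize - 1) / blockYSize;  // block holding the last row
    return r;
}
```

For a 4096 x 4096 raster, a 50 x 50 window costs one whole-raster transfer when the block is the full raster, but only one 1024 x 1024 tile when the data is presented as tiled.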
> If you want to offer an efficient way to access reduced-resolution
> images, then you could produce pseudo-overview layers.
>
> However, given that for the OPeNDAP driver you want a lot of control
> over how many individual requests actually go to the remote server,
> I would say overriding IRasterIO() makes sense.
OK. That's what I'll do first.
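
As a rough sketch of what that could look like, the function below turns a RasterIO-style request (a source window plus a smaller destination buffer size) into a DAP2-style hyperslab of the form [start:stride:stop]. BuildConstraint is a hypothetical helper, not part of GDAL or the OPeNDAP client library. The variable name, the row-then-column dimension order, and the integer stride are all assumptions; when the window size is not an exact multiple of the buffer size, the number of samples returned will differ from the buffer size and the caller would still have to resample.

```cpp
#include <algorithm>
#include <cassert>
#include <string>

// Hypothetical helper: build "var[yStart:yStride:yStop][xStart:xStride:xStop]".
// Assumes the remote array is indexed row (y) first, then column (x).
inline std::string BuildConstraint(const std::string &var,
                                   int xOff, int yOff, int xSize, int ySize,
                                   int bufXSize, int bufYSize)
{
    assert(xSize > 0 && ySize > 0 && bufXSize > 0 && bufYSize > 0);
    // Integer decimation factor; never less than 1 (no up-sampling here).
    const int xStride = std::max(1, xSize / bufXSize);
    const int yStride = std::max(1, ySize / bufYSize);
    auto slab = [](int start, int stride, int size) {
        return "[" + std::to_string(start) + ":" + std::to_string(stride) +
               ":" + std::to_string(start + size - 1) + "]";
    };
    return var + slab(yOff, yStride, ySize) + slab(xOff, xStride, xSize);
}
```

A request for a 100 x 100 window into a 50 x 50 buffer would then ask the server for every second row and column instead of the full window.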
> Be aware that:
> o applications normally call RasterIO(), so you will now very rarely
> get block accesses - however, it can still happen. For instance, I
> think the min/max computation goes through the block API.
OK. I also noticed that GDAL's caching uses it too.
> o It is very common for applications to make many small RasterIO()
> calls, often for one scanline at a time. I often try to recognise
> the scanline request case, and pre-read a bunch of scanlines at once
> and cache them.
Where would I look to find out more about GDAL's data caching system?
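
The scanline pre-read idea can be sketched independently of GDAL. ScanlineChunkCache below is a hypothetical class, and the fetch callback stands in for whatever actually issues the remote request; neither is a GDAL API. The first request for a line outside the cached chunk pulls a whole chunk of lines in one go, and the following scanline requests are served from memory.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <functional>
#include <utility>
#include <vector>

// Hypothetical scanline cache (not part of GDAL).
class ScanlineChunkCache {
public:
    // fetch(firstLine, lineCount) must return lineCount * width samples.
    using Fetch = std::function<std::vector<float>(int, int)>;

    ScanlineChunkCache(int width, int height, int chunkLines, Fetch fetch)
        : width_(width), height_(height), chunkLines_(chunkLines),
          fetch_(std::move(fetch)) {}

    // Copy one scanline (width samples) into out, fetching a new chunk
    // only when the line is not in the currently cached chunk.
    void ReadLine(int line, float *out) {
        assert(line >= 0 && line < height_);
        if (line < first_ || line >= first_ + count_) {
            first_ = (line / chunkLines_) * chunkLines_;
            count_ = std::min(chunkLines_, height_ - first_);
            data_ = fetch_(first_, count_);
            ++fetches_;
        }
        const float *src =
            data_.data() + static_cast<std::size_t>(line - first_) * width_;
        std::copy(src, src + width_, out);
    }

    int fetches() const { return fetches_; }  // number of chunk requests made

private:
    int width_, height_, chunkLines_;
    Fetch fetch_;
    std::vector<float> data_;
    int first_ = 0;   // first cached line
    int count_ = 0;   // number of cached lines (0 = nothing cached yet)
    int fetches_ = 0;
};
```

Reading a 10-line raster top to bottom with a 4-line chunk makes three fetches rather than ten. A sensible chunk size would depend on the connection, as with the block size.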
> > Second question: I noticed that the OGDI driver implemented only the
> > GDALRasterBand::IRasterIO method while the ECW driver implemented its
> > own version of both that and GDALDataset::IRasterIO. What are the
> > implications of specializing the second method?
>
> You would implement a custom GDALDataset::IRasterIO() if you want a more
> efficient access to multi-band datasets as a single request. For instance,
> ECW implements GDALDataset::IRasterIO() because it is much cheaper to pass
> one request on to the ECW API for all the bands of an image at once,
> instead of requesting them one at a time.
OK. Sounds like I should look into this a bit once I get the basic
access (IRasterIO) working.
> By the way, one of the applications that we want the GDAL/OPeNDAP driver to
> be good with is MapServer. MapServer currently always makes one big
> RasterIO() request for whatever it needs for each band being read. Eventually
> I hope to change this to utilize the GDALDataset::RasterIO() entry point for
> greater efficiency where that is specialized. So, overriding
> GDALRasterBand::IRasterIO() or GDALDataset::IRasterIO() will give a big win
> for MapServer.
Sounds great.
> In fact, you might consider just overriding the GDALDataset::IRasterIO(), and
> doing an implementation of GDALRasterBand::IRasterIO() that calls the dataset
> level one with a single band requested. That would set you up optimally for
> future improvements in MapServer.
OK. I think I see how this would be done. DODSDataset::IRasterIO() would
be in charge of actually reading the data (maybe using GDAL's caching
sub-system) and DODSRasterBand::IRasterIO() would make calls to it. In
the default implementation it's the other way around (the Dataset calls
the RasterBand).
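
The shape of that delegation, as a self-contained sketch: SketchDataset and SketchBand are hypothetical stand-ins, not the real DODSDataset/DODSRasterBand classes, and they do not use the actual GDAL IRasterIO() signatures. The dataset-level read does the work for any list of bands in one request, and the band-level read forwards to it with a single band. The pixel values are synthetic so the example runs without a server.

```cpp
#include <cassert>
#include <vector>

// Hypothetical pixel window; not a GDAL type.
struct Window { int xOff, yOff, xSize, ySize; };

// Stand-in for the dataset: owns the (simulated) remote access.
class SketchDataset {
public:
    SketchDataset(int width, int height, int bands)
        : width_(width), height_(height), bands_(bands) {}

    // One call = one simulated remote request, whatever the band count.
    // Output is band-sequential: all of the first band, then the next, ...
    void ReadBands(const Window &w, const std::vector<int> &bands,
                   std::vector<float> &out) {
        assert(w.xOff >= 0 && w.yOff >= 0);
        assert(w.xOff + w.xSize <= width_ && w.yOff + w.ySize <= height_);
        ++requests_;
        out.clear();
        for (int b : bands) {
            assert(b >= 1 && b <= bands_);  // bands are numbered from 1
            for (int y = w.yOff; y < w.yOff + w.ySize; ++y)
                for (int x = w.xOff; x < w.xOff + w.xSize; ++x)
                    out.push_back(Value(b, x, y));
        }
    }

    int requests() const { return requests_; }

private:
    // Synthetic pixel value standing in for data read over the network.
    float Value(int band, int x, int y) const {
        return static_cast<float>(band * 10000 + y * width_ + x);
    }
    int width_, height_, bands_;
    int requests_ = 0;
};

// Stand-in for the band: forwards to the dataset with a one-band list.
class SketchBand {
public:
    SketchBand(SketchDataset &ds, int band) : ds_(ds), band_(band) {}
    void Read(const Window &w, std::vector<float> &out) {
        ds_.ReadBands(w, {band_}, out);
    }
private:
    SketchDataset &ds_;
    int band_;
};
```

With this layout a three-band read through the dataset costs one request, where three separate band reads would cost three.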
> Applications like OpenEV are going to make lots of "tile by tile" RasterIO()
> requests to GDAL. This would presumably turn into a remote request for each
> tile which is sensible, but will add a real latency drag into OpenEV renders.
>
> Batch applications are usually scanline based, via RasterIO(), and as mentioned
> before it would be a wise idea to recognise this case, and force some sort of
> chunking.
>
> One final note, when you implement IRasterIO() you avoid data going through
> the GDAL cache. That is good if you are effectively caching things yourself
> somehow, but can completely hammer you in some situations if you are not.
Well the cache I implemented is a basic HTTP/1.1 cache. It doesn't yet
know about the stuff inside a data object. So it's not a very good cache
for data. I think the GDAL cache is going to be important.
Thanks for the info,
James
>
> Best regards,
--
__________________________________________________________________________
James Gallagher The Distributed Oceanographic Data System
jgallagher at gso.uri.edu http://unidata.ucar.edu/packages/dods
Voice/Fax: 406.723.8663