[gdal-dev] Re: SoC Report: GDAL TMS driver

Mon Aug 4 14:54:26 EDT 2008

Václav Klusák wrote:
> They weren't of course. Mainly because gdal_translate (which I have been 
> using for testing) doesn't even call Dataset::RasterIO. It loops through 
> the raster bands and adds them one after another to the output 
> dataset. Exactly what it must not do in order to achieve optimality. And 
> this is not an exception. The notion of raster band seems to be a pillar, 
> and the various parts of GDAL code use them frequently.
> 
> So, what do I do?
> 
> One theoretically possible way to optimise writing is to just store IO 
> requests in some data structure and only in the FlushCache method 
> optimise and actually do them. But nothing like this is possible for 
> reading since the caller expects the data to be transferred at the end 
> of the call. I don't see a way out, the TMS driver will be slow and will 
> thrash the hard drive.
> 
> To sum up what I have in my hands right now:
> 
> Reading TMS datasets works. The infrastructure for writing blocks is 
> mostly in place (ten or so lines missing) but I don't have the code that 
> creates new datasets in the filesystem yet. The cache works but should 
> be improved a little. All this could be finished in a day or two. After 
> that comes the rest of GDALDataset boilerplate: transformations, GCPs 
> etc. These I haven't studied much yet.

Keo,

A few suggestions:

1) Per http://trac.osgeo.org/gdal/wiki/rfc14_imagestructure try setting
    the IMAGE_STRUCTURE INTERLEAVE metadata item to PIXEL if the data
    is pixel interleaved.  Some applications and CreateCopy() implementations
    will take advantage of this clue.

2) In your IReadBlock() method try pushing the bands other than the one
    requested into the cache.  This will of course result in extra cache
    churn, but with a big enough cache, and depending on the application's
    I/O strategy, this can mean the other bands are already in the cache
    when they are needed.

3) We might consider possible changes to GDALDatasetCopyWholeRaster() for
    tiled data so that whole rows of tiles do not need to be cached.  It
    is in gdal/gcore/rasterio.cpp.  It is used by some of the CreateCopy()
    implementations.

4) To some extent we may just need to educate users to use large cache
    size settings for wide TMS datasets.
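
To make suggestion 2 concrete, here is a toy sketch in plain Python.  It is
not GDAL code: BlockCache, TileSource, read_block and count_decodes are
hypothetical stand-ins for the block cache, the tile decoder, IReadBlock()
and a band-by-band reader.  It only counts how many times a pixel
interleaved tile has to be decoded when the bands are read one after
another, with and without pushing the sibling bands into the cache.

```python
from collections import OrderedDict


class BlockCache:
    """Tiny LRU cache keyed by (band, tile_x, tile_y); a stand-in, not GDAL's cache."""

    def __init__(self, max_blocks):
        self.max_blocks = max_blocks
        self.blocks = OrderedDict()

    def get(self, key):
        if key in self.blocks:
            self.blocks.move_to_end(key)  # mark as most recently used
            return self.blocks[key]
        return None

    def put(self, key, data):
        self.blocks[key] = data
        self.blocks.move_to_end(key)
        while len(self.blocks) > self.max_blocks:
            self.blocks.popitem(last=False)  # evict the least recently used block


class TileSource:
    """Hypothetical pixel interleaved tile store; counts how often a tile is decoded."""

    def __init__(self, n_bands):
        self.n_bands = n_bands
        self.decodes = 0

    def decode(self, tx, ty):
        self.decodes += 1
        # One decode yields the data of every band in the tile.
        return [f"band{b}@({tx},{ty})" for b in range(self.n_bands)]


def read_block(source, cache, band, tx, ty, push_siblings):
    """Return one band's block; optionally cache the other bands of the same tile."""
    hit = cache.get((band, tx, ty))
    if hit is not None:
        return hit
    planes = source.decode(tx, ty)
    if push_siblings:
        for b, plane in enumerate(planes):
            if b != band:
                cache.put((b, tx, ty), plane)
    # Insert the requested band last so it is the most recently used entry.
    cache.put((band, tx, ty), planes[band])
    return planes[band]


def count_decodes(push_siblings, n_bands=3, tiles_x=4, tiles_y=2, max_blocks=64):
    """Read every block band by band and report how many tile decodes were needed."""
    source = TileSource(n_bands)
    cache = BlockCache(max_blocks)
    for band in range(n_bands):
        for ty in range(tiles_y):
            for tx in range(tiles_x):
                read_block(source, cache, band, tx, ty, push_siblings)
    return source.decodes
```

With the default toy sizes (3 bands, 8 tiles, room for 64 blocks),
count_decodes(False) gives 24 decodes and count_decodes(True) gives 8.
Shrinking the cache to 4 blocks brings count_decodes(True, max_blocks=4)
back to 24, which is the "big enough cache" caveat in miniature.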

The problems you are encountering aren't that different from those faced by
any other pixel interleaved tiled format.  They are just a bit more severe with
TMS because we expect large images, the tiles are unusually large, and we
need two rows instead of one.
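
As a rough guide to what "large cache size" means here, the following
back-of-the-envelope sketch estimates the memory needed to keep two full
rows of tiles, all bands, in the block cache.  The function name and the
numbers used below (a 100000 pixel wide raster, 256 pixel tiles, 3 byte
bands) are purely illustrative assumptions, not figures for any actual
TMS dataset.

```python
import math


def tile_rows_cache_bytes(raster_width, tile_size, n_bands,
                          bytes_per_sample=1, tile_rows=2):
    """Bytes needed to hold `tile_rows` full rows of square tiles, all bands."""
    tiles_across = math.ceil(raster_width / tile_size)
    bytes_per_tile = tile_size * tile_size * n_bands * bytes_per_sample
    return tiles_across * bytes_per_tile * tile_rows


# Illustrative numbers only (assumed, not measured on a real dataset).
needed = tile_rows_cache_bytes(raster_width=100000, tile_size=256, n_bands=3)
needed_mib = needed / (1024 * 1024)
```

For these assumed numbers the estimate is 153747456 bytes, about 146.6 MiB,
so a user would want the block cache limit (normally set through the
GDAL_CACHEMAX configuration option) to be comfortably above that.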

Best regards,
-- 
---------------------------------------+--------------------------------------
I set the clouds in motion - turn up   | Frank Warmerdam, warmerdam at pobox.com
light and sound - activate the windows | http://pobox.com/~warmerdam
and watch the world go round - Rush    | President OSGeo, http://osgeo.org