[gdal-dev] GDAL-Caching philosophy?

Thu Sep 27 14:00:50 EDT 2001

Petri J. Riipinen wrote:

> Let's say that I have a TIFF-file with one rasterband of size 4000x4000
> with 256-color palette.
> 
> First I use RasterIO to read a block from (1000,1000)-(2000,2000), that is
> 1000x1000 pixels.
> 
> What is left into the RasterIO cache after the operation, assuming that it
> fits inside the maximum cache size?

Petri,

It depends on the block size of the source image, since caching is done
on a block-by-block basis. If the block size is one scanline (a fairly
common case), then after your read scanlines 1000 through 1999 (1000 blocks
of 4000 bytes each) would be cached by GDAL.
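
As a rough illustration of the block-by-block bookkeeping, the sketch below
works out which blocks a read window touches for a given block size. The
helper name blocks_for_window is made up for this example and is not part of
GDAL; it assumes one byte per pixel (a paletted band) and a 4000-pixel-wide
scanline block.

```python
def blocks_for_window(xoff, yoff, xsize, ysize, block_w, block_h):
    """Return (first_col, first_row, n_cols, n_rows) of the blocks a window touches.

    Hypothetical helper for illustration only; not a GDAL function.
    """
    first_col = xoff // block_w
    first_row = yoff // block_h
    last_col = (xoff + xsize - 1) // block_w
    last_row = (yoff + ysize - 1) // block_h
    return first_col, first_row, last_col - first_col + 1, last_row - first_row + 1


# 4000x4000 band stored as scanline blocks (4000 wide, 1 high),
# read window of 1000x1000 pixels starting at (1000, 1000).
BLOCK_W, BLOCK_H, BYTES_PER_PIXEL = 4000, 1, 1  # assumed paletted, 8 bits

col, row, n_cols, n_rows = blocks_for_window(1000, 1000, 1000, 1000, BLOCK_W, BLOCK_H)
n_blocks = n_cols * n_rows
cached_bytes = n_blocks * BLOCK_W * BLOCK_H * BYTES_PER_PIXEL
print(n_blocks, "blocks,", cached_bytes, "bytes")
```

Note that whole blocks are cached, so with scanline blocks the full 4000-pixel
width of each line is held even though only 1000 pixels per line were requested.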

> What if the display now shifts one pixel and I read a block
> (1001,1001)-(2001,2001) with rasterIO from the same rasterband, does GDAL
> use anything from the cache or does it read everything from file? I guess I
> mean that is this always equally heavy operation or is there some
> improvement, if the consecutive blocks overlap.

Still assuming scanline-oriented blocks, you would only have to read one
more scanline from disk, but of course you would still have some overhead
copying the data from the 999 cached blocks into the output buffer.

> What is actually left in the cache after those two rasterIO-operations?

After the second read you would have 1001 scanline blocks cached, assuming
your cache limit is at least a bit over 4MB (1001 x 4000 bytes).
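
The two overlapping reads can be mimicked with a toy block cache. This is
only a sketch of the idea: the BlockCache class, its least-recently-used
eviction and the 5,000,000 byte limit are assumptions made for the example,
not a description of GDAL's internal cache.

```python
from collections import OrderedDict


class BlockCache:
    """Toy LRU cache of scanline blocks, keyed by row number (illustrative only)."""

    def __init__(self, max_bytes, block_bytes):
        self.max_bytes = max_bytes
        self.block_bytes = block_bytes
        self.blocks = OrderedDict()  # row -> placeholder for block data
        self.disk_reads = 0

    def read_rows(self, first_row, n_rows):
        for row in range(first_row, first_row + n_rows):
            if row in self.blocks:
                # Cache hit: no disk access, just mark as recently used.
                self.blocks.move_to_end(row)
            else:
                # Cache miss: "read" the block from disk and store it.
                self.disk_reads += 1
                self.blocks[row] = True
                # Evict oldest blocks while over the byte limit.
                while len(self.blocks) * self.block_bytes > self.max_bytes:
                    self.blocks.popitem(last=False)


cache = BlockCache(max_bytes=5_000_000, block_bytes=4000)

cache.read_rows(1000, 1000)                      # first window: rows 1000..1999
first_read_misses = cache.disk_reads

cache.read_rows(1001, 1000)                      # shifted window: rows 1001..2000
second_read_misses = cache.disk_reads - first_read_misses

blocks_cached = len(cache.blocks)
print(first_read_misses, second_read_misses, blocks_cached)
```

With a limit below 1001 x 4000 bytes the oldest scanline would be evicted
instead, which is the "and a bit" in the figure above.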

> You know, I just calculated that one 4000x4000 pixel, RGB-image takes over
> 62 megs of memory!!! So, there is no way that I can actually cache 4 of
> those (the worst case, when corners of four adjacent blocks are on the
> viewport) in my 128 megs of memory, so I definitely need to be handling the
> caching and screen updating in some pretty smart way if I want to do it
> like that.

A 4K x 4K RGB image would be 48MB (RGBA would be 64MB). However, if the
image is paletted as suggested above, it is only 16MB (one byte per pixel).
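
The arithmetic is simply width x height x bytes per pixel. A minimal sketch,
assuming uncompressed 8-bit samples and a hypothetical raster_bytes helper:

```python
def raster_bytes(width, height, bytes_per_pixel):
    """Uncompressed in-memory size of a raster, in bytes (illustrative helper)."""
    return width * height * bytes_per_pixel


# 8-bit samples assumed: 1 byte for a palette index, 3 for RGB, 4 for RGBA.
sizes = {
    name: raster_bytes(4000, 4000, bpp)
    for name, bpp in [("paletted", 1), ("RGB", 3), ("RGBA", 4)]
}

for name, size in sizes.items():
    print(f"{name}: {size / 1e6:.0f} MB")
```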

> I'm now also considering moving into vector images instead of raster
> images. I'm discussing with Finnish National Landsurvey agency for
> licensing a 1:250 000 scale complete Finnish map as vectors (ESRI
> Shape-format) and I have the option whether to rasterize a suitable set of
> vector layers or just use the vectors. It might be that I might get
> somewhat more agile and responsive software if I use quadtrees and vector
> shapes, and also not having to worry about caching large image buffers into
> memory.

Note that applications like MapServer do this sort of thing quite effectively,
so if you want to go this way, there is no need to start from scratch.

> Say Frank (or anyone reading this!), you being a pro in GIS-stuff, if you
> would have to do a GPS+map-navigation system, would you use raster or
> vector format, assuming both cover the same area?

If I were doing it I would solve the problem with images. I would create
the TIFF files in tiled format, likely 256x256 tiles, paletted images, and
I would build overviews if I were ever interested in different zoom levels.
If your output view is 1000x1000 it should be sufficient to set your GDAL
cache size to about 3-4MB to ensure no cache thrashing.
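
To see where a figure in that range comes from, the sketch below counts the
worst-case number of 256x256 tiles a 1000x1000 view can overlap and the bytes
needed to hold them. The helper name worst_case_tiles is hypothetical, and
one byte per pixel (paletted) is assumed.

```python
def worst_case_tiles(view_size, tile_size):
    """Most tiles a view of view_size pixels can overlap along one axis.

    The worst case is a view starting on the last pixel of a tile:
    that tile plus ceil((view_size - 1) / tile_size) more.
    """
    return 1 + -(-(view_size - 1) // tile_size)  # -(-a // b) is ceil(a / b)


VIEW, TILE, BYTES_PER_PIXEL = 1000, 256, 1  # assumed paletted, 8 bits

tiles_per_axis = worst_case_tiles(VIEW, TILE)
tiles_needed = tiles_per_axis ** 2
working_set_bytes = tiles_needed * TILE * TILE * BYTES_PER_PIXEL
print(tiles_per_axis, tiles_needed, working_set_bytes)
```

A 3-4MB limit is roughly twice that working set, which leaves room for the
tiles coming into view while panning. The limit itself would be applied
through GDAL's cache setting (GDALSetCacheMax() in the C API).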

Best regards,

-- 
---------------------------------------+--------------------------------------
I set the clouds in motion - turn up | Frank Warmerdam, warmerdam at p...
light and sound - activate the windows | http://pobox.com/~warmerdam
and watch the world go round - Rush | Geospatial Programmer for Rent