[gdal-dev] Questions about SQL cursors in GDAL driver

Sat Jul 11 09:12:47 EDT 2009

Jorge,

I think this driver is in a special situation compared to the file-based
raster data sources: round trips to the server are costly, so we should
minimize them as much as possible. Therefore, in addition to writing
IReadBlock correctly (it has to fetch a single row in each call, or
multiple rows with the non-regular blocking option), we should also consider
implementing some further optimizations.

In the current case GDALRasterBand::IRasterIO would trigger a number of
IReadBlock calls to the driver, each of which causes an individual round
trip to the server. That is a 'sub-optimal' solution in our case.

In this driver - I think - we should actually override
GDALRasterBand::IRasterIO and try to fetch all the blocks by using a spatial
query covering the area requested by the user. The driver should then feed
the returned blocks into the band block cache by creating a new
GDALRasterBlock for each returned row and calling GDALRasterBand::AdoptBlock
to add the block to the cache (see GDALRasterBand::GetLockedBlockRef for
more details on how block handling is done in general). Then you could call
IRasterIO on the base class to do the rest of the IO operation.
Because the cache has already been filled with the blocks, the base RasterIO
will be served from the block cache instead of calling IReadBlock for each
block individually.
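
To make the first step of that approach concrete, here is a minimal sketch
of how an overridden IRasterIO could work out which blocks a requested
window touches and which georeferenced rectangle to use in the spatial
query. It is only an illustration: BlockRange, Envelope, blocksForWindow and
windowEnvelope are hypothetical names, not GDAL API, and the only thing
taken from GDAL is the usual six-coefficient geotransform convention.

```cpp
#include <algorithm>
#include <array>
#include <cassert>

// Hypothetical helper types for illustration only (not part of GDAL).
struct BlockRange { int xMin, yMin, xMax, yMax; };      // inclusive block indices
struct Envelope   { double minX, minY, maxX, maxY; };   // georeferenced bounds

// Blocks intersected by the pixel window (xOff, yOff, xSize, ySize).
BlockRange blocksForWindow(int xOff, int yOff, int xSize, int ySize,
                           int blockXSize, int blockYSize)
{
    assert(xOff >= 0 && yOff >= 0 && xSize > 0 && ySize > 0);
    assert(blockXSize > 0 && blockYSize > 0);
    return { xOff / blockXSize,
             yOff / blockYSize,
             (xOff + xSize - 1) / blockXSize,
             (yOff + ySize - 1) / blockYSize };
}

// Georeferenced envelope of the same window, using the GDAL convention
//   Xgeo = gt[0] + pixel * gt[1] + line * gt[2]
//   Ygeo = gt[3] + pixel * gt[4] + line * gt[5]
// All four corners are transformed so that a negative pixel height or a
// rotated geotransform still gives a valid bounding box.
Envelope windowEnvelope(const std::array<double, 6>& gt,
                        int xOff, int yOff, int xSize, int ySize)
{
    const double px[2] = { double(xOff), double(xOff + xSize) };
    const double ln[2] = { double(yOff), double(yOff + ySize) };
    double xs[4], ys[4];
    int k = 0;
    for (double p : px) {
        for (double l : ln) {
            xs[k] = gt[0] + p * gt[1] + l * gt[2];
            ys[k] = gt[3] + p * gt[4] + l * gt[5];
            ++k;
        }
    }
    return { std::min({xs[0], xs[1], xs[2], xs[3]}),
             std::min({ys[0], ys[1], ys[2], ys[3]}),
             std::max({xs[0], xs[1], xs[2], xs[3]}),
             std::max({ys[0], ys[1], ys[2], ys[3]}) };
}
```

For example, a 200x100 window at pixel offset (150, 50) with 100x100 blocks
touches block columns 1 to 3 and block rows 0 to 1, so one spatial query
over the envelope could replace six separate IReadBlock round trips.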

But this option doesn't remove the requirement to implement IReadBlock,
which every driver must provide in order to support block-based access.

With regard to cursors, they should also be avoided here because (in
addition to the round trips) they require further resource allocations on
the server, which decreases scalability. Random access to the rows through
a cursor also may not perform any better than accessing the data through a
(clustered) index search, for example.

Best regards,

Tamas

2009/7/11 Jorge Arévalo <jorge.arevalo at gmail.com>

> 2009/7/11 Frank Warmerdam <warmerdam at pobox.com>:
> > Jorge Arévalo wrote:
> >>
> >> So, clearly, I have made a mistake. Think of a table with tiles of
> >> 100x100 px. We have 30 tiles. When I create RasterBands, their block
> >> size will be 100x100. So, IReadBlock(0, 0, buffer) indicates the block
> >> going from (0, 0) to (100, 100). Does it mean that the block from
> >> (0, 0) to (100, 100) must be the first one in the table if I get the
> >> rows ordered by rid?
> >
> > No, I do not see how you could depend on this.
>
> OK, understood.
>
> >
> >> It depends on how the tiles have been loaded. So,
> >> instead of making hypotheses, should I query the block that matches the
> >> extent from (0, 0) to (100, 100)?
> >
> > Yes, but furthermore, you will need to transform the pixel bounds into
> > georeferenced coordinates to do the spatial query.  I would also note
> > that you might be best to reduce the query rectangle to just be a small
> > central area of the tile to avoid fetching adjacent tiles or even forcing
> > postgres to fetch them to check their bounds against your point.  Spatial
> > indexes tend to have a certain granularity.
>
> I can get the center pixel coordinates of the block, transform them
> into georeferenced coordinates (a point) and test which tile's
> envelope contains this point. Right?
>
> Many thanks again
>
> Best regards
> Jorge
>
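
The center-point lookup discussed in the quoted exchange can be sketched as
follows. This is an illustration under assumptions, not code from the
driver: GeoPoint and blockCenter are hypothetical names, and the arithmetic
relies only on the standard six-coefficient GDAL geotransform. The resulting
point would then be used in a "which tile's envelope contains this point"
query on the server.

```cpp
#include <array>
#include <cassert>

// Hypothetical helper type for illustration only (not part of GDAL).
struct GeoPoint { double x, y; };

// Georeferenced coordinates of the center of block (blockXOff, blockYOff),
// using the GDAL convention
//   Xgeo = gt[0] + pixel * gt[1] + line * gt[2]
//   Ygeo = gt[3] + pixel * gt[4] + line * gt[5]
GeoPoint blockCenter(const std::array<double, 6>& gt,
                     int blockXOff, int blockYOff,
                     int blockXSize, int blockYSize)
{
    assert(blockXOff >= 0 && blockYOff >= 0);
    assert(blockXSize > 0 && blockYSize > 0);
    // Center of the block in pixel/line space.
    const double pixel = (blockXOff + 0.5) * blockXSize;
    const double line  = (blockYOff + 0.5) * blockYSize;
    return { gt[0] + pixel * gt[1] + line * gt[2],
             gt[3] + pixel * gt[4] + line * gt[5] };
}
```

Querying with a single interior point rather than the full tile rectangle
follows Frank's suggestion above: it keeps adjacent tiles out of the result
even when their envelopes share an edge with the requested block.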