[Mapserver-dev] Raster Queries / Query Caches

Tue Mar 9 13:13:41 EST 2004

Folks,

I am trying to wrap my mind around implementation of raster queries in
MapServer.  To be honest I didn't really have too much understanding of
how queries worked in MapServer, so I did a bit of reading and experimentation
from Python MapScript (thanks Sean!).

My understanding of the current query model is that one of several kinds of
query can be issued, either against a single layer or a set of layers.  The
query will cause a "results cache" on each layer to be reset so as to contain
a list of shapes that satisfy the query.  The results cache currently
just keeps around the shape index (and tile index, and sometimes the class
index?) of the matching shapes.  The shapes themselves need to be refetched
(using getShape()) in another pass when the query result is used for something.
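
To make the two-pass model concrete, here is a toy sketch in plain Python.
It is not the MapScript API: ToyLayer, ResultEntry, query_by_rect and
get_shape are hypothetical names, points stand in for shapes, and the use
of -1 as the tile index of an untiled layer is an assumption.

```python
from collections import namedtuple

# Hypothetical stand-in for one results-cache entry: indexes only, no geometry.
ResultEntry = namedtuple("ResultEntry", ["shape_index", "tile_index", "class_index"])


class ToyLayer:
    """Toy vector layer used only for illustration (not a MapScript layerObj)."""

    def __init__(self, shapes):
        self.shapes = shapes      # list of (x, y) points standing in for shapes
        self.result_cache = []

    def query_by_rect(self, minx, miny, maxx, maxy):
        # A new query resets the cache, then records only the indexes of the hits.
        # tile_index=-1 (assumed to mean "untiled") and class_index=0 are placeholders.
        self.result_cache = [
            ResultEntry(i, -1, 0)
            for i, (x, y) in enumerate(self.shapes)
            if minx <= x <= maxx and miny <= y <= maxy
        ]
        return len(self.result_cache)

    def get_shape(self, shape_index):
        # Second pass: the shape itself is refetched from the data source by index.
        return self.shapes[shape_index]


layer = ToyLayer([(0, 0), (5, 5), (9, 1)])
layer.query_by_rect(4, 0, 10, 6)
hits = [layer.get_shape(entry.shape_index) for entry in layer.result_cache]
```

The point is that the cache holds only small index records, so using a
result always costs a second fetch.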

--

OK, now for rasters.

My mandate is to provide point, rectangle and polygon queries on rasters.
The returned "hits" need to optionally return either the original "raw"
pixel values, or the color that would be used to draw the pixel depending
on some sort of query setting.

My big question is how should I fit this into the query model?

My original thought had been that the query would return a raster array
for the region requested, perhaps with some sort of masking for areas outside
the polygon when in effect.  In fact, I should be able to produce this array by
internally creating a render request at a desired resolution and area.  However,
that is clumsy to access and has no relation to the way query results are
handled anywhere else.  The assumption of returning a regular array also doesn't
mesh well with polygon queries, how multiple layers should be handled, or how
potentially overlapping or different resolution tiles in a tiled layer would work.

A second approach would be to return an array of x/y/pixel_value tuples for each
pixel satisfying the query.  This would require lower level implementation since
the assumption is that it wouldn't go through a "rendering" step, but would
actually access the raster file directly to collect results.  However, the
result set doesn't match the current query API very well.  That is, there is
no concept of the returned results being shapeObjs to which normal template
substitution could be applied.
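
As a rough sketch of this second approach, the following plain Python walks
a small in-memory raster and collects x/y/pixel_value tuples for pixels
whose centres fall inside a query polygon.  Everything here is an assumption
made for illustration: the raster is a list of rows rather than a file, the
coordinates are pixel coordinates, and point_in_polygon and
query_raster_by_polygon are hypothetical helpers, not MapServer functions.

```python
def point_in_polygon(px, py, polygon):
    """Even-odd ray casting test; polygon is a list of (x, y) vertices."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Only edges that straddle the horizontal line through py can be crossed.
        if (y1 > py) != (y2 > py):
            x_cross = x1 + (py - y1) * (x2 - x1) / (y2 - y1)
            if px < x_cross:
                inside = not inside
    return inside


def query_raster_by_polygon(raster, polygon):
    """Return (x, y, value) for each pixel whose centre lies inside polygon."""
    hits = []
    for row, line in enumerate(raster):
        for col, value in enumerate(line):
            if point_in_polygon(col + 0.5, row + 0.5, polygon):
                hits.append((col, row, value))
    return hits


raster = [
    [10, 11, 12],
    [20, 21, 22],
    [30, 31, 32],
]
triangle = [(0, 0), (2.5, 0), (0, 2.5)]
tuples = query_raster_by_polygon(raster, triangle)
```

Because only matching pixels are returned, no mask or regular array is
needed, which is what makes this form friendlier to polygon queries.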

My third thought was perhaps we could do something similar to the above, but
actually implement the getShape() for raster layers.  Normally getShape() would
fail.  If a query is applied to a raster layer, all the "hits" would be turned
into simple point shapeObjs with attributes for the raw pixel value and the
display red, green and blue values.  Then getShape() would draw from this set of query
results producing pseudo-shapeObjs usable in normal ways from MapScript and
substitutable into normal query templates.
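
A minimal sketch of that idea, in plain Python rather than MapScript:
PixelShape is a hypothetical stand-in for a point shapeObj, the colour table
is made up, and string.Template is used only to mimic attribute substitution
(it is not the MapServer template syntax).

```python
from string import Template


class PixelShape:
    """Hypothetical point pseudo-shape for one raster hit (not a real shapeObj)."""

    def __init__(self, x, y, value, red, green, blue):
        self.point = (x, y)
        # Attributes are kept as strings, ready for template substitution.
        self.values = {
            "value": str(value),
            "red": str(red),
            "green": str(green),
            "blue": str(blue),
        }


def hits_to_shapes(hits, color_table):
    """Turn (x, y, raw_value) hits into pseudo-shapes carrying raw and display values."""
    shapes = []
    for x, y, value in hits:
        red, green, blue = color_table[value]
        shapes.append(PixelShape(x, y, value, red, green, blue))
    return shapes


# Assumed colour table mapping raw pixel values to display colours.
color_table = {1: (0, 0, 255), 2: (0, 128, 0)}
shapes = hits_to_shapes([(10.5, 20.5, 2)], color_table)

row = Template("pixel $value drawn as rgb($red,$green,$blue)")
text = row.substitute(shapes[0].values)
```

Carrying both the raw value and the display colour on each pseudo-shape
means the "raw versus rendered" choice can be left to the template.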

Currently this third approach is in ascendancy.  However, a few things are still
outstanding.

1) How do I cache the results?  With Steve reviewing query caching it would be
quite helpful if there was a concept of persistent memory caches of shapeObjs.
Then I could just populate this during the query operation, and getShape()
on that should be quite trivial.

Alternatively, I could keep a vector of x/y/value/red/green/blue values, and
set the result cache with faked up shape indexes into that array.  Then it
would be fairly easy for me to implement getShape() from that cache for the
raster layer.

2) What should be done about queries that return very large result sets?
How do folks avoid accidentally large query results in the vector
realm?  Would it be sensible to have a "max hits" value in the query API?
Perhaps the default could be unbounded, with a max hits value optionally
provided by some mechanism?

3) Are there presentation methods that should be implemented for raster
query results other than the normal template mechanisms used for vectors?
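
Points 1) and 2) could be combined roughly as follows.  This is only a
sketch under stated assumptions: RasterQueryCache, its max_hits argument and
its truncated flag are hypothetical names, and nothing here is an existing
MapServer structure.

```python
class RasterQueryCache:
    """Flat in-memory store of raster hits, addressed by faked-up shape indexes."""

    def __init__(self, max_hits=None):
        self.max_hits = max_hits   # None means unbounded (the assumed default)
        self.records = []          # flat list of (x, y, value, red, green, blue)
        self.truncated = False     # set once a hit has been refused

    def add(self, x, y, value, red, green, blue):
        # Refuse further hits once the optional cap has been reached.
        if self.max_hits is not None and len(self.records) >= self.max_hits:
            self.truncated = True
            return False
        self.records.append((x, y, value, red, green, blue))
        return True

    def shape_indexes(self):
        # The "shape index" is just the position of the record in the list.
        return list(range(len(self.records)))

    def get_shape(self, shape_index):
        x, y, value, red, green, blue = self.records[shape_index]
        return {"point": (x, y), "value": value,
                "red": red, "green": green, "blue": blue}


cache = RasterQueryCache(max_hits=2)
for col in range(5):
    # Stop scanning as soon as the cache reports that it is full.
    if not cache.add(col, 0, 100 + col, col, col, col):
        break
```

Stopping the scan when add() refuses a hit keeps an accidental whole-image
query from filling memory, while the truncated flag lets the caller report
that the result is partial.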

Hmm, as usual, the effort of talking through my approach to a problem has
resulted in substantial evolution in my ideas.  I think I will go with the
3rd approach, remembering query results in memory and making the results
look like shapes, just like a vector layer.

I would be interested in other input on raster query needs or approach.

Best regards,
-- 
---------------------------------------+--------------------------------------
I set the clouds in motion - turn up   | Frank Warmerdam, warmerdam at pobox.com
light and sound - activate the windows | http://pobox.com/~warmerdam
and watch the world go round - Rush    | Geospatial Programmer for Rent