[Mapserver-dev] Query efficiency

Steve Lime steve.lime at dnr.state.mn.us
Thu Mar 4 14:02:00 EST 2004

Hi Frank: I tend to favor this approach in the short term too. We could put
some limits in place to control the maximum cache size, I suppose. Maximum
number of features would be one way. How easy is it to get the cumulative
size (in KB) of a linked list of shapeObj's? The nice thing is the code to
manage a feature list already exists.
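
For what it's worth, a rough size estimate could be had by walking the list and summing the allocations hanging off each shape. The sketch below is only an illustration: the structs are simplified stand-ins for shapeObj and the feature list node (not the real MapServer headers), and shape_bytes() and feature_list_bytes() are hypothetical names. It also ignores allocator overhead, so the figure is a lower bound.

```c
#include <stddef.h>
#include <string.h>

/* Simplified stand-ins, assumed for illustration only;
 * the real MapServer structures carry more members. */
typedef struct { double x, y; } pointObj;
typedef struct { int numpoints; pointObj *point; } lineObj;
typedef struct {
    int numlines;   lineObj *line;    /* geometry parts */
    int numvalues;  char **values;    /* attribute strings */
} shapeObj;
typedef struct featureNode {
    shapeObj shape;
    struct featureNode *next;
} featureNode;

/* Approximate bytes held by one shape: the struct itself,
 * its line and point arrays, and its attribute strings. */
size_t shape_bytes(const shapeObj *s)
{
    size_t total = sizeof(*s);
    for (int i = 0; i < s->numlines; i++)
        total += sizeof(lineObj)
               + (size_t)s->line[i].numpoints * sizeof(pointObj);
    for (int i = 0; i < s->numvalues; i++) {
        total += sizeof(char *);
        if (s->values[i])
            total += strlen(s->values[i]) + 1;   /* include the NUL */
    }
    return total;
}

/* Approximate bytes held by a whole feature list
 * (node overhead plus each embedded shape). */
size_t feature_list_bytes(const featureNode *head)
{
    size_t total = 0;
    for (; head; head = head->next)
        total += sizeof(featureNode) - sizeof(shapeObj)
               + shape_bytes(&head->shape);
    return total;
}
```

Dividing the result by 1024 gives the size in KB. The walk is linear in the number of vertices and attributes, so keeping a running total as shapes are added would be cheaper than recomputing it.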

I spoke to why queries can't be processed in one pass in some of the other
emails, so I won't go into details unless you really want them. Bottom line
is that a query result set can be used in lots of ways: through templates, in
MapScript, to make maps (QueryMap) and even to direct other queries
(mode=FeatureQuery).


Stephen Lime
Data & Applications Manager

Minnesota DNR
500 Lafayette Road
St. Paul, MN 55155

>>> Frank Warmerdam <warmerdam at pobox.com> 3/3/2004 2:01:05 PM >>>
Steve Lime wrote:
> Sean: Queries work by first generating a candidate result set and
> operating on that result set within MapServer (applying classes
> Queries cannot be completely executed in the underlying RDBMS (as the
> code sits). So there's this disjoint relationship between the result
> and the database. The fix would be to enable all queries in a vendor
> specific way and then maintain access to the result set using the
> msLayerNextShape() function. 
> The current code gives us very consistent results between
> because the same algorithms (good or bad) are used for everything.
> Unfortunately it doesn't let us tap into the power of the database
> except for attribute queries.


My first pass opinion is that all the results of a query should be
held in memory as shapeObj's and that memory cache reused for subsequent
parts of the query operation.  This would ensure consistent behaviour for all
but eliminate the extra pass queries that occur now and that can have
awful performance characteristics in some cases.

The obvious downside to this approach is that the memory cache of
shapeObj's could potentially be large.  Even large enough to
bring the system to its knees in a worst case.  However, rather than
pushing a lot more logic down into the datasources, or trying to cache to
disk, I think it would be better to just provide better tools for the query,
and to add some sort of "maxresults" option to control the number of
shapeObj's that will be collected as a query result.
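
As a rough illustration of the "maxresults" idea, the append routine for the cache could simply refuse new shapes once the cap is reached and let the caller decide what to do about it. Everything in this sketch is assumed for the example: resultCache, cache_add(), cache_free(), the return codes, and the placeholder shapeObj are hypothetical and are not existing MapServer code.

```c
#include <stdlib.h>

/* Placeholder shape type for the sketch; the real shapeObj is far richer. */
typedef struct { int id; } shapeObj;

typedef struct cacheNode {
    shapeObj shape;
    struct cacheNode *next;
} cacheNode;

/* Hypothetical query result cache with a cap on the number of shapes. */
typedef struct {
    cacheNode *head, *tail;
    int count;
    int maxresults;   /* <= 0 means "no limit" (an assumed convention) */
} resultCache;

enum { CACHE_OK = 0, CACHE_FULL = 1, CACHE_NOMEM = 2 };

/* Append a copy of the shape unless the cap has been reached. */
int cache_add(resultCache *c, const shapeObj *s)
{
    if (c->maxresults > 0 && c->count >= c->maxresults)
        return CACHE_FULL;

    cacheNode *n = malloc(sizeof *n);
    if (!n)
        return CACHE_NOMEM;

    n->shape = *s;    /* shallow copy is enough for this placeholder type */
    n->next = NULL;
    if (c->tail)
        c->tail->next = n;
    else
        c->head = n;
    c->tail = n;
    c->count++;
    return CACHE_OK;
}

/* Release every node and reset the cache to empty. */
void cache_free(resultCache *c)
{
    cacheNode *n = c->head;
    while (n) {
        cacheNode *next = n->next;
        free(n);
        n = next;
    }
    c->head = c->tail = NULL;
    c->count = 0;
}
```

A query loop would stop fetching shapes on CACHE_FULL and report that the result was truncated. A real shapeObj owns heap memory, so it would need a deep copy rather than the struct assignment used here.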

This sort of change would be quite simple, and very fast for cases where the
result set isn't gargantuan.

However, there are still aspects of the query architecture as it stands
that I don't really understand.  I'm not sure why the shapes from a result
set can't be fully processed on the first pass.

Best regards,
I set the clouds in motion - turn up   | Frank Warmerdam, warmerdam at pobox.com
light and sound - activate the windows | http://pobox.com/~warmerdam
and watch the world go round - Rush    | Geospatial Programmer for
