MS RFC 22a: Feature cache for long running processes and query processing (update)

Frank Warmerdam warmerdam at POBOX.COM
Tue Jun 26 11:24:57 EDT 2007


Tamas Szekeres wrote:
> According to our recent IRC conversation we would rather keep the
> state of the cache preserved across the various sessions and map
> creations. I think if the in-process state preservation is sufficient
> then it could be easily handled inside the caching provider.
> 
> However it seems that the existing connection pooling mechanism cannot
> be utilized for this purpose since the preserved objects are
> distinguished based on the connection parameter and it cannot serve
> the same object for different threads.
> 
> To implement this kind of functionality I could imagine taking out the
> cache related parameters into a separate struct and use a global
> hashtable to store the various cache state instances. The user should
> explicitly specify which instance the layer should use. For example:
> 
> PROCESSING "cache_instance=cache1"
> 
> Where "cache1" is the key to get the cache state from the global hashtable.
> The provider would use the necessary locks when accessing the global
> cache, and msCleanup would destroy the cache when the process
> terminates.

Tamas,

I'm concerned about thread safety issues if a single cache is being
used by multiple threads at the same time.  Are you planning on
fine-grained locking around each request for a shape from the cache?
Basically, I think fully thread-shared cache access is going to be
dangerous and should not be implemented.
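
To make the concern concrete, here is a minimal sketch of the kind of
global registry proposed above, keyed by the cache_instance name.  It is
an illustration under stated assumptions, not code from the RFC: the
names cache_state, cache_registry_get and cache_registry_cleanup are
hypothetical, a linked list stands in for the hashtable, and plain
pthreads mutexes are used in place of whatever locking MapServer would
actually provide.

```c
#include <pthread.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical cache state; real fields would come from the CACHE provider. */
typedef struct cache_state {
    char *key;                  /* value of cache_instance, e.g. "cache1" */
    int shape_count;            /* placeholder for the cached shapes */
    struct cache_state *next;
} cache_state;

/* A linked list stands in for the global hashtable (assumption). */
static cache_state *registry = NULL;
static pthread_mutex_t registry_lock = PTHREAD_MUTEX_INITIALIZER;

/* Return the state registered under key, creating it on first use.
 * Returns NULL only if memory allocation fails. */
cache_state *cache_registry_get(const char *key)
{
    cache_state *c;

    pthread_mutex_lock(&registry_lock);
    for (c = registry; c != NULL; c = c->next)
        if (strcmp(c->key, key) == 0)
            break;
    if (c == NULL) {
        c = calloc(1, sizeof(*c));
        if (c != NULL) {
            c->key = malloc(strlen(key) + 1);
            if (c->key == NULL) {
                free(c);
                c = NULL;
            } else {
                strcpy(c->key, key);
                c->next = registry;     /* push onto the list head */
                registry = c;
            }
        }
    }
    pthread_mutex_unlock(&registry_lock);
    return c;
}

/* Free every instance; the sort of thing msCleanup could call at exit. */
void cache_registry_cleanup(void)
{
    pthread_mutex_lock(&registry_lock);
    while (registry != NULL) {
        cache_state *next = registry->next;
        free(registry->key);
        free(registry);
        registry = next;
    }
    pthread_mutex_unlock(&registry_lock);
}
```

Note that the mutex here protects only the registry itself.  Two threads
asking for "cache1" get the same cache_state pointer back, and nothing in
this sketch guards what they then do with it; that would need either a
lock held around each shape request or a per-instance lock, which is the
part needing careful analysis.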

That aside, the connection pooling approach is still not suitable
because the CONNECTION string for stuff like CACHE layers is not
a very unique name.  We could just modify the pooling API so that
the "connection" name is passed in.  Then for CACHE layers this
could be some more unique name.  Alternatively we could just declare
that folks wanting long-lived caches (presumably using the usual
PROCESSING "CLOSE_CONNECTION=DEFER" line) should take great care to use
unique layer names in their CONNECTION string.

> IMHO, however, it would be highly beneficial to establish common
> support for long-term state preservation in mapserver. Any of the
> providers could push objects into a common repository and mapserver
> would be responsible for freeing the objects when the process
> terminates.

I'm not sure I see the other applications of this.

BTW, I have finally reviewed the whole of RFC 22a, and generally I
am quite impressed.  A few concerns:

1) I wish there were specific sections in the RFC giving detailed
    docs for each of the new data providers (CACHE, LAYERFILTER and
    GEOMTRANS I believe).  I am left with a vague sense that I'm not
    seeing all the layerfilter and geomtrans operators by scanning
    the examples.

2) I am nervous about the "introduced language" for describing
    geometry filtering operations.  I'd be inclined to make this
    particular provider unofficial for 5.0 with the understanding
    that we might want to overhaul the syntax for 5.2.   In particular
    I wish we had a more functional (perhaps Simple Features for SQL-like)
    syntax for geometry operations.

3) I have a nervous sense that nested layers are going to cause us
    unexpected problems.

4) How are nested layers manipulated in mapscript?  I'd like to see
    this addressed in the RFC.

I expect to be in formal support of this RFC when it is brought to vote
though I'm hoping for some clarification on the above points and I'm
not in favor of a cache shared between threads without very careful
analysis of the impact.

Best regards,
-- 
---------------------------------------+--------------------------------------
I set the clouds in motion - turn up   | Frank Warmerdam, warmerdam at pobox.com
light and sound - activate the windows | http://pobox.com/~warmerdam
and watch the world go round - Rush    | President OSGeo, http://osgeo.org


