MS RFC 22a: Feature cache for long running processes and query processing (update)
Daniel Morissette
dmorissette at MAPGEARS.COM
Thu Jun 28 00:14:48 EDT 2007
Hi Tamas,
Tamas Szekeres wrote:
>
> I personally don't favour the "average user" terminology in this area.
> I consider the users as talented developers who spare no trouble to read
> some documentation that might help to solve that particular problem they
> have.
Life would be so much easier for us if all our users were of the type
you describe here. Unfortunately my experience differs from yours: I
have seen all sorts of users, some beginners, some experienced, some who
read docs and some who don't. I have also learned
that even the talented developers prefer when the software behaves in a
natural way, does "the right thing" by default, and when the interface
to use a feature is simple.
That being said, if what RFC-22a proposes is the simplest possible
solution for the double-pass query then so be it (at least it's a
solution), but you can be assured that this approach will prompt all
sorts of questions from all sorts of users, especially with respect to
the way it solves the double-pass query issue, which is my main concern
in this discussion.
The other filtering and processing features sound nice and you're right
that the proposed approach is a very neat solution to those problems,
but they are new problems to me and my main focus in all this is still
this old double-pass query issue.
>
> Or alternatively we could focus only on the 2 pass query problem by not
> utilizing the vtable. It will possibly require either to modify all of
> the mapserver code involved in the query operations or modify all of the
> providers suffering from this particular problem. This might require a
> large amount of changes in the existing code and would solve at most 20%
> of the problems I've addressed.
>
Well, that 20% (the double-pass query) is the one that keeps coming back
every once in a while. The other 80% are bonus features for which there
has been very little demand so far.
Let's keep in mind that "MapServer is not a full-featured GIS system,
nor does it aspire to be". Transforming features on the fly is nice, but
that kind of processing has never been MapServer's focus. I believe
other tools such as PostGIS support this kind of operation and I have always
had a preference for letting them offer those features and letting
MapServer concentrate on what it does best: publish maps on the web.
I think the users who need a Web GIS should be looking more at MapGuide
than MapServer.
>
> With the current implementation the lookup will happen among
> the 1000 shapes since the subsequent query extent will fall inside the
> previous one. However with a few lines of code we could create an additional
> option to reconstruct the cache in every WhichShapes call.
>
> However because the hashsize has been set to 512 I don't think we
> have to do much sequential scanning. If the shapes are spread evenly
> across the array of the hashtable we'll have to skip 1.5 items on
> average. That will possibly outperform the necessary disk accesses,
> spatial index lookups and shape creations.
>
Let's say that the first WhichShapes call loads 1,000 shapes, and then I
do a query by point on that layer. Since there is no spatial index in
memory, all the shapes in the cache will have to be accessed to identify
the ones that are within tolerance of the query location. Sure, looking
up the bounds of 1000 shapes is not a huge cost, but it's a cost, on top
of all the memory used to cache all those shapes.
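
To make that cost concrete, here is a minimal sketch of a query by point
against a cache with no in-memory spatial index. The names (CachedShape,
query_by_point) and the grid of unit squares are hypothetical, chosen only
for illustration; they are not MapServer code.

```python
from dataclasses import dataclass


@dataclass
class CachedShape:
    """Hypothetical cached shape: an id plus its bounding box."""
    shape_id: int
    minx: float
    miny: float
    maxx: float
    maxy: float


def query_by_point(cache, x, y, tolerance):
    """Linear scan: test every cached shape's bounds against the point.

    Returns the matching ids and the number of shapes inspected.
    """
    hits = []
    checked = 0
    for shape in cache:
        checked += 1
        if (shape.minx - tolerance <= x <= shape.maxx + tolerance
                and shape.miny - tolerance <= y <= shape.maxy + tolerance):
            hits.append(shape.shape_id)
    return hits, checked


# Assumed test data: 1000 unit squares laid out on a 40 x 25 grid.
cache = [
    CachedShape(i, float(i % 40), float(i // 40),
                float(i % 40) + 1.0, float(i // 40) + 1.0)
    for i in range(1000)
]

hits, checked = query_by_point(cache, 10.5, 5.5, 0.1)
print(hits, checked)
```

Without an index, every one of the 1000 cached bounds is inspected even
though only one shape matches.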
OTOH, if the data provider supports a spatial index it can find the
matching shapes (2 or 3 shapes in general) with very little work using
its spatial index, removing any benefit of caching and without the cost
of all the memory used to cache features.
Of course if I render the same map area 20 times in a persistent process
then I will benefit from the cache, but I never wrote any MapServer
application that does that. The typical application renders a map once
and then moves to a new area or zooms in a separate request which does
not benefit from caching, so there is little benefit to caching when
rendering a map.
OTOH there would be real benefits to caching the first pass of a
double-pass query since we are assured that we'll read the shapes twice
in this case, and there are usually very few shapes to cache. Thinking
about it some more I think I'd like to see a mode of operation of the
cache that only caches queries.
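
A rough sketch of what such a query-only mode could look like follows. It
is an illustration under stated assumptions, not the RFC-22a design:
CountingProvider and QueryOnlyCache are hypothetical names, shapes are
reduced to points, and the first pass scans all shapes where a real
provider would use its own spatial index.

```python
class CountingProvider:
    """Hypothetical stand-in for a data provider; counts shape reads."""

    def __init__(self, shapes):
        self._shapes = shapes  # dict: shape id -> (x, y)
        self.reads = 0

    def ids(self):
        return list(self._shapes)

    def read_shape(self, shape_id):
        self.reads += 1
        return self._shapes[shape_id]


class QueryOnlyCache:
    """Caches only the shapes matched by the first pass of a query."""

    def __init__(self, provider):
        self.provider = provider
        self._hits = {}

    def first_pass(self, x, y, tolerance):
        """Find matching shapes and remember them for the second pass."""
        self._hits.clear()
        for shape_id in self.provider.ids():
            sx, sy = self.provider.read_shape(shape_id)
            if abs(sx - x) <= tolerance and abs(sy - y) <= tolerance:
                self._hits[shape_id] = (sx, sy)
        return list(self._hits)

    def second_pass(self, shape_id):
        """Return a matched shape from the cache; fall back to the provider."""
        if shape_id in self._hits:
            return self._hits[shape_id]
        return self.provider.read_shape(shape_id)


# Assumed test data: 100 point shapes along a diagonal.
provider = CountingProvider({i: (float(i), float(i)) for i in range(100)})
cache = QueryOnlyCache(provider)

matched = cache.first_pass(50.0, 50.0, 1.0)
reads_after_first = provider.reads
results = [cache.second_pass(shape_id) for shape_id in matched]
reads_after_second = provider.reads
print(matched, reads_after_first, reads_after_second)
```

The second pass adds no provider reads, and the cache holds only the few
matched shapes rather than everything in the map extent.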
>
> This is at least one option to use but not compulsory to use.
> We can possibly offer that or continue to leave the users alone with
> this problem.
>
True. At least you have made the effort of trying to find a solution to
the problem (and I have not).
>>
>> - What are the implications of nesting layers on WMS services? I think
>> users will (naively?) expect that the hierarchy of layers will be
>> reflected in the WMS GetCapabilities, but I don't think that this is
>> desirable. This may very well become a FAQ: "Why is the hierarchy of
>> layers in my mapfile not reflected in WMS GetCapabilities?"
>>
>
> Only the root layers participate in the renderings (which are added to the
> layers collection of the map), so there's no need to alter the current
> approach. The nested layers will only behave as data sources for the outer
> layer providers (like the shapefiles or spatial data tables etc. for
> the existing providers)
>
I agree that we should not alter the current approach; it would not make
sense to do that, but be prepared to answer questions from users asking
why the hierarchy of layers in a mapfile is not used in a WMS
GetCapabilities.
>
>> and even if it has a NextItem method
>> to walk through all objects, the order of objects is not maintained by a
>> hashtable, so if a user has data sorted (by sortshp) then the sort order
>> will be lost and rendering order will become pseudo-random if done via a
>> cache layer (unless I'm missing something?).
>>
>
> That's true. I'm not aware of the order of the renderings in this case.
> In my practice I haven't found a case where it was required.
> However we could use an additional list to treat this issue if it is
> significant.
>
This ordering of shapes at render time is a feature of MapServer, hence
the command-line program sortshp. I don't use it myself but some users
must rely on it, otherwise it would not exist. I think it would be sad
not to try to maintain the ordering, but I'll let those who
need this feature fight for it.
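
For illustration, the "additional list" idea mentioned in the quoted reply
could be sketched as below. This is a hypothetical structure, not the
proposed implementation: OrderedShapeCache and its methods are invented
names, and only the bucket count of 512 comes from the thread.

```python
HASH_SIZE = 512  # bucket count mentioned in the thread


class OrderedShapeCache:
    """Hash buckets for lookup plus a side list that keeps insertion order."""

    def __init__(self):
        self.buckets = [[] for _ in range(HASH_SIZE)]
        self.order = []  # shape ids in the order they were added

    def add(self, shape_id, shape):
        self.buckets[shape_id % HASH_SIZE].append((shape_id, shape))
        self.order.append(shape_id)

    def get(self, shape_id):
        """Look up a shape by scanning only its own bucket."""
        for sid, shape in self.buckets[shape_id % HASH_SIZE]:
            if sid == shape_id:
                return shape
        return None

    def walk_buckets(self):
        """Bucket-by-bucket walk: the original order is lost."""
        return [sid for bucket in self.buckets for sid, _ in bucket]

    def walk_in_order(self):
        """Walk via the side list: the original order is preserved."""
        return list(self.order)


# Assumed input: shape ids in the order a sorted file would deliver them.
sorted_ids = [900, 3, 512, 47, 1000]
cache = OrderedShapeCache()
for sid in sorted_ids:
    cache.add(sid, "shape-%d" % sid)

bucket_order = cache.walk_buckets()
render_order = cache.walk_in_order()
print(bucket_order, render_order)
```

The side list costs one extra id per cached shape and leaves the hash
lookup unchanged.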
Daniel
--
Daniel Morissette
http://www.mapgears.com/