[Mapserver-dev] Stream Caching

Wed Oct 20 00:08:09 EDT 2004

Paul,

Having a single structure for each layer that is pooled with a unique key 
might help, but I think it is going to make things complex in some ways.  I 
also am hesitant to break the nice parallelism that is represented in the 
connection pooling API, which greatly streamlined the existing SDE 
connection pooling and outright removed one really sticky issue with the 
freeing of connections in layers that had already been freed.

I don't have time to implement any more optimizations to the code at this
time, as the speed improvements are not noticeable to my boss and it would
be hard to justify further optimization efforts when he is already
satisfied with what we've got.

The numbers that Brock sent said that SE_stream_free and get_version_info
were each costing approximately 20ms per layer.  One free and one
get_version_info are done for each layer.  2 calls * 20ms per call * 4
layers = 160ms.  If it is taking 1600ms to render the map, then these eight
calls represent 10% of the rendering time for each map.  Where does the
other 90% of the time go?
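
For anyone who wants to plug in their own numbers, the arithmetic above can
be written out as a short sketch.  The 20ms, 2 calls and 4 layers come from
the figures quoted above; the 1600ms render time is the hypothetical case,
and the function and variable names are made up for illustration.

```python
# Figures quoted in the message; RENDER_MS is the hypothetical total.
CALLS_PER_LAYER = 2   # one SE_stream_free plus one get_version_info
MS_PER_CALL = 20      # approximate cost reported per call
LAYERS = 4
RENDER_MS = 1600


def fixed_overhead_ms(calls_per_layer, ms_per_call, layers):
    """Total fixed per-request cost of the per-layer calls, in ms."""
    return calls_per_layer * ms_per_call * layers


overhead_ms = fixed_overhead_ms(CALLS_PER_LAYER, MS_PER_CALL, LAYERS)
share = overhead_ms / RENDER_MS
print(f"{overhead_ms} ms of {RENDER_MS} ms = {share:.0%}")
# prints "160 ms of 1600 ms = 10%"
```

The overhead grows linearly with the layer count, so its share of the total
only becomes large if the rest of the render is fast.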

There might also be another approach to increase the speed with respect to 
SDE.  As I currently understand things, the layers are serially queried 
from the database.  Would it be possible to make the queries to the 
database in parallel using threads or some other mechanism?  SDE supports 
this kind of access, and maybe this strategy will have benefits to other 
datasources that are heavily I/O bound.  To confirm if something like this 
would be worth the effort and complexity, some detailed profiling with 
gprof or something similar should be done to get a complete breakdown of 
how much time is being spent in every function.
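
A rough sketch of the serial-versus-parallel idea, in Python rather than
MapServer's C and with no real SDE calls.  query_layer is a hypothetical
stand-in that only sleeps to imitate waiting on the database, and the layer
names and the 50ms delay are invented for illustration.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def query_layer(name, delay_s=0.05):
    """Hypothetical stand-in for one layer's database round trip."""
    time.sleep(delay_s)  # imitates waiting on the database; no real SDE work
    return f"{name}: done"


def query_serial(layers):
    """Query each layer one after another, as described for the current code."""
    return [query_layer(name) for name in layers]


def query_parallel(layers):
    """Issue all layer queries at once, one worker thread per layer."""
    with ThreadPoolExecutor(max_workers=len(layers)) as pool:
        # map() hands results back in input order, so draw order is kept.
        return list(pool.map(query_layer, layers))


def timed(fn, layers):
    """Run fn(layers) and return (result, elapsed seconds)."""
    start = time.perf_counter()
    result = fn(layers)
    return result, time.perf_counter() - start


layers = ["roads", "rivers", "parcels", "labels"]
serial_result, serial_s = timed(query_serial, layers)
parallel_result, parallel_s = timed(query_parallel, layers)
print(f"serial:   {serial_s * 1000:.0f} ms")
print(f"parallel: {parallel_s * 1000:.0f} ms")
```

In this toy case the parallel run takes roughly as long as the slowest
single query instead of the sum of all of them.  That only carries over to
the real code if the time is actually spent waiting on I/O, which is what
the profiling would need to confirm.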

My concern here is that all of these stop-gap approaches in the name of 
speed (wipe out calls to version stuff, pool the streams, do something else 
in the name of speed) are really the "first steps into the quicksand" as 
Frank said this afternoon on IRC.  I agree that we should remove 
unnecessary calls to stuff that is not needed and tidy up what we can in 
the SDE code, but I think we're running into the first in a series of a 
whole bunch of issues now that we have MapServer running in a long-running 
process and can share stuff that takes a long time to do on a per-request 
basis.

If possible, we should gather as much profiling information as we can of
MapServer in FastCGI or another long-running process.  If we can use this
information to take a step back and look at things from this new context,
maybe we can make some structural changes that will support what we want
without constantly whacking the moles that will continue to pop up.

Howard

At 08:47 PM 10/19/2004, Paul Ramsey wrote:
>Howard,
>What do you think of the stream caching idea? Before we dive into it, your 
>opinions would be very valuable. :) Also, do you want to do the 
>implementation or should we tackle it?
>On a related note, thoughts on caching layerinfo would also be good. It's 
>another fixed-cost thing, that if we don't cache we pay an nlayers * 20ms 
>penalty for.
>Yours,
>Paul