Parallelizing calls to msDrawLayer()

Sat Oct 13 22:40:42 EDT 2007

Ed McNierney wrote:
> David -
> 
> I didn't have the time for a thoughtful reply earlier, and now most
> other folks have already raised some of the concerns I had.  I should
> hesitate more often, I guess - it saves typing <g>.
> 
> I think Steve Woodbridge's comment is informative.  For his
> application he found a 4x-5x improvement by caching data files in a
> RAM disk.  That basically says that something like 80% of his entire
> MapServer rendering time is spent in disk I/O, not drawing.  Many
> users don't have a situation in which they can put their data in RAM

Just for completeness, I was rendering Tiger data, about 22GB on disk, 
and I loaded 11GB of the data into a ramdisk. The trick is to load the 
correct 11GB :) This appears to have had two effects: the data that 
was needed the most was always in memory, and when it had to go to disk 
occasionally for the other data, there was less disk contention and 
seeking, so the system was generally more responsive. I made sure 
all tileindexes were in the ramdisk also.

Linux filesystem caching is very good, but with 16GB of memory installed 
we just did not get the impact that we had hoped for. This is clearly a 
case where I know more about what data is going to be needed than the 
cache can figure out and manage effectively by itself.

> disk.  For a comparable kind of application you would then reasonably
> predict that optimizing multi-layer rendering so it was instantaneous
> would only produce a 20% performance improvement.

Right! If you are blocked on disk I/O, then you will not be able to feed 
your renderers with data to keep them busy.
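
The arithmetic behind that estimate is Amdahl's law. Here is a minimal sketch, assuming the 80% I/O / 20% drawing split quoted above; the function name is made up for illustration and the numbers are the thread's rough estimate, not measurements.

```python
import math

def overall_speedup(fraction_improved, factor):
    """Amdahl's law: whole-job speedup when one fraction runs `factor` times faster."""
    remaining = 1.0 - fraction_improved
    return 1.0 / (remaining + fraction_improved / factor)

# Assumed split, taken from the estimate quoted in this thread.
io_fraction = 0.8
draw_fraction = 0.2

# Making all drawing instantaneous removes only the 20% drawing share.
draw_only = overall_speedup(draw_fraction, math.inf)

# Making all disk I/O instantaneous is the upper bound for a ramdisk.
io_only = overall_speedup(io_fraction, math.inf)

print(f"instant drawing: {draw_only:.2f}x, instant I/O: {io_only:.2f}x")
```

With this split, instantaneous drawing caps out at about 1.25x (20% less wall-clock time), while eliminating I/O entirely caps out at about 5x, which is consistent with the 4x-5x seen with the ramdisk.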

There is one area of rendering that can be improved from O(n*n) to about 
O(n), and that is the label cache processing. There was a discussion 
on the dev list within the last year about this, and I wrote up the 
algorithm and sent it to Steve L., but it never made it to an RFC 
because of other priorities. If you have a label-intensive map, I have 
seen the label cache processing take over 50% of the render time. I have 
a standing offer to aid any dev who is interested in implementing this: 
I will assist with algorithm support, testing, and resolving any 
technical issues with the algorithm. This would significantly improve 
render performance for maps with labels.
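
The algorithm itself is not spelled out in this thread, so the following is only an illustration of how this kind of improvement is commonly achieved, not necessarily the write-up mentioned above. It is a sketch assuming axis-aligned label boxes and a uniform grid: each new label is tested only against labels already kept in the grid cells it touches, instead of against every earlier label. The names `place_labels` and `cell` are hypothetical.

```python
from collections import defaultdict

def overlaps(a, b):
    """Boxes are (minx, miny, maxx, maxy); True if their interiors intersect."""
    return a[0] < b[2] and b[0] < a[2] and a[1] < b[3] and b[1] < a[3]

def place_labels(boxes, cell=64):
    """Greedy placement: keep a label only if it hits no label kept earlier."""
    grid = defaultdict(list)  # (col, row) -> indices of kept labels touching that cell
    kept = []
    for i, box in enumerate(boxes):
        cols = range(int(box[0] // cell), int(box[2] // cell) + 1)
        rows = range(int(box[1] // cell), int(box[3] // cell) + 1)
        cells = [(c, r) for c in cols for r in rows]
        # Only labels sharing a grid cell can possibly collide with this one.
        if any(overlaps(box, boxes[j]) for key in cells for j in grid.get(key, ())):
            continue
        kept.append(i)
        for key in cells:
            grid[key].append(i)
    return kept

labels = [(0, 0, 50, 10), (40, 5, 90, 15), (200, 200, 250, 210), (60, 0, 100, 4)]
print(place_labels(labels))
```

When labels are of roughly similar size and the cell size is chosen to match, each label touches a small, bounded number of cells, so the total work grows roughly linearly with the number of labels. Badly mismatched cell sizes degrade back toward the quadratic case.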

-Steve W

> Although I think MapServer's disk I/O is pretty good, if I were to
> spend time hunting for performance improvements I would be inclined
> to look at the various data I/O schemes.  Anything that can be done
> to reduce disk I/O is a big win (some of those improvements are, of
> course, external to MapServer itself in the form of data organization
> and indexing schemes).
> 
> - Ed
> 
> Ed McNierney
> Chief Mapmaker
> Demand Media / TopoZone.com
> 73 Princeton Street, Suite 305
> North Chelmsford, MA  01863
> Phone: 978-251-4242, Fax: 978-251-1396
> ed at topozone.com
> 
> 
> 
> -----Original Message-----
> From: UMN MapServer Developers List [mailto:MAPSERVER-DEV at LISTS.UMN.EDU] On Behalf Of David Fuhry
> Sent: Saturday, October 13, 2007 8:39 PM
> To: MAPSERVER-DEV at LISTS.UMN.EDU
> Subject: Re: [UMN_MAPSERVER-DEV] Parallelizing calls to msDrawLayer()
> 
> 
> Paul,
> 
> Thanks, that's a good suggestion.
> 
> I guess my thought is, given a really good implementation, a 
> heavily-contended server with a bright scheduler would just end up 
> scheduling the threads sequentially on the same CPU (perhaps likely,
>  since a small bit of the necessary data is in that processor's L1
> cache already).  Then the onus is on the implementer to make sure
> that the extra overhead is pretty low.
> 
> It sort of pushes some of the responsibility to the OS scheduler,
> which I think will, most of the time, make better decisions than a
> deterministically ordered mapserv loop.
> 
> Thanks,
> 
> Dave
> 
> Paul Spencer wrote:
>> David,
>> 
>> While you can perhaps gain some performance in a single map draw,
>> in most real life uses of mapserver, folks are either serving many
>>  simultaneous requests or generating tiles in some way.  I think in
>>  either case, the addition of multi-threaded layer draws will
>> actually cause contention for processor time with the multiple
>> processes that are serving the requests and could hurt overall
>> performance in high load systems.
>> 
>> I think that you could probably get more bang for your development
>> bucks by investing time in profiling the existing code.
>> 
>> Cheers
>> 
>> Paul
>> 
>> On 13-Oct-07, at 6:37 PM, David Fuhry wrote:
>> 
>>> Tamas,
>>> 
>>> (responses inline)
>>> 
>>> Tamas Szekeres wrote:
>>>> David, I think it would be reasonable to establish such a
>>>> mechanism only when fetching the data of the layers, just as
>>>> the WMS/WFS layers are currently pre-downloaded in parallel
>>>> before starting to draw the map. We should have a similar
>>>> approach when fetching the other layers as well.
>>> Yes, I noticed that WMS/WFS layers are downloaded in parallel 
>>> before rendering begins.  And I agree, it would be advantageous
>>> to extend the parallel-data-fetching paradigm to all layers.
>>> 
>>> For non-WMS/WFS layers though, wouldn't it be a significant 
>>> disruption to the codebase to add lines 1 and 2 into msDrawMap()?
>>> 
>>> 
>>> 1. for i=1 to layers.length (in parallel)
>>> 2.   data[i] = fetch_data_for_layer(i)
>>> 3. for i=1 to layers.length (serially)
>>> 4.   msDrawLayer(data[i])
>>> 
>>> ISTM that the data-fetching logic might be best left abstracted 
>>> beneath msDrawLayer().
>>> 
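
The four-line pseudocode above (fetch in parallel, draw serially) can be sketched in runnable form. This is only an illustration in Python rather than MapServer's C: `fetch_data_for_layer` and `draw_layer` are hypothetical stand-ins for real layer I/O and for msDrawLayer(), and the "canvas" is just a list.

```python
from concurrent.futures import ThreadPoolExecutor

def fetch_data_for_layer(layer):
    """Hypothetical stand-in for a layer's I/O (file read, database query)."""
    return [f"{layer}-feature-{n}" for n in range(3)]

def draw_layer(canvas, data):
    """Hypothetical stand-in for msDrawLayer(): appends features to the canvas."""
    canvas.extend(data)

def draw_map(layers):
    canvas = []
    with ThreadPoolExecutor(max_workers=len(layers) or 1) as pool:
        # Lines 1-2: all fetches run concurrently; map() yields results in layer order.
        data = list(pool.map(fetch_data_for_layer, layers))
    # Lines 3-4: drawing stays serial, so the layer stacking order is preserved.
    for layer_data in data:
        draw_layer(canvas, layer_data)
    return canvas

print(draw_map(["roads", "rivers"]))
```

In this arrangement only the fetch functions need to be thread-safe, because the drawing code still runs on a single thread.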
>>>> However, pre-drawing all of the layers and later copying them
>>>> onto the map image seems to be much less efficient.
>>> Drawing n layers onto n imageObjs is no more expensive than
>>> drawing n layers onto one imageObj, and the former can be
>>> parallelized across n threads. Although yes, I agree that
>>> composition (the "merge" step) will cost something. I'm
>>> entertaining the idea that the time saved by parallel fetching &
>>>  drawing might outweigh the cost of composition.
>>> 
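
The trade-off described here (parallel per-layer rendering followed by a merge step) can be made concrete with a small sketch. It assumes a toy image model, a grid of cells where `None` means transparent, and hypothetical `render_layer` and `compose` helpers. It is not MapServer's imageObj API.

```python
import threading

WIDTH, HEIGHT = 4, 2
TRANSPARENT = None

def render_layer(layer, out, index):
    """Hypothetical renderer: paints the layer's pixels onto its own private image."""
    image = [[TRANSPARENT] * WIDTH for _ in range(HEIGHT)]
    for x, y in layer["pixels"]:
        image[y][x] = layer["name"]
    out[index] = image  # each thread writes only its own slot

def compose(base, overlay):
    """The merge step: opaque overlay pixels replace whatever is underneath."""
    for y in range(HEIGHT):
        for x in range(WIDTH):
            if overlay[y][x] is not TRANSPARENT:
                base[y][x] = overlay[y][x]

def draw_map_threaded(layers):
    rendered = [None] * len(layers)
    threads = [threading.Thread(target=render_layer, args=(layer, rendered, i))
               for i, layer in enumerate(layers)]
    for t in threads:
        t.start()
    map_image = [[TRANSPARENT] * WIDTH for _ in range(HEIGHT)]
    for i, t in enumerate(threads):
        t.join()                          # wait for layer i only
        compose(map_image, rendered[i])   # merge in stacking order
    return map_image

layers = [{"name": "a", "pixels": [(0, 0), (1, 0)]},
          {"name": "b", "pixels": [(1, 0), (2, 1)]}]
print(draw_map_threaded(layers))
```

The merge visits every pixel of every layer image, so its cost grows with image size times layer count. Whether the parallel rendering pays for that is exactly the open question in this exchange.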
>>>> With the parallel fetching approach, we would only need to
>>>> deal with thread-safety issues in the drivers.
>>> I might be misunderstanding your point here, but... Rendering a
>>> layer into an independent imageObj should be a pretty independent
>>> operation, and could be made so if it's not now.  Glancing at the
>>> mapserver thread-safety FAQ, it seems there are more unsafe &
>>> locked components related to data-fetching drivers than there are
>>> for rendering.  Which makes me wonder why you suggest
>>> parallelizing the data-fetching but not the rendering.
>>> 
>>> Forgive me if I'm playing a bit of devil's advocate here.  I'm
>>> aware that non-reentrant functions don't rewrite themselves, and
>>> that critical sections don't surround themselves with mutexes.
>>> Surely though, it ought not to be a tremendous amount of
>>> work to keep separate layer-drawing operations from stepping on
>>> each other's toes?
>>> 
>>> Thanks,
>>> 
>>> Dave Fuhry
>>> 
>>>> Best regards,
>>>> Tamas
>>>> 
>>>> 2007/10/12, David Fuhry <dfuhry at cs.kent.edu>:
>>>>> Has anyone looked into parallelizing the calls to
>>>>> msDraw[Query]Layer() in msDrawMap()?
>>>>> 
>>>>> Although I'm new to the codebase, it seems that near the top
>>>>> of msDrawMap(), we could launch a thread for each
>>>>> (non-WMS/WFS) layer, rendering the layer's output onto its
>>>>> own imageObj.  Then where we now call msDraw[Query]Layer,
>>>>> wait for thread i to complete, and compose that layer's
>>>>> imageObj onto the map's imageObj.
>>>>> 
>>>>> In msDraw[Query]Layer(), critical sections of the mapObj
>>>>> (adding labels to the label cache, for instance) would need
>>>>> to be protected by a mutex.
>>>>> 
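
The mutex idea for the label cache can be sketched as follows. This is a Python illustration with a hypothetical `LabelCache` class, not MapServer's actual label cache structure; in the C code the same role would fall to a real mutex around the cache update. Each entry also records its layer index and sequence number, which is an assumption added here so that a deterministic order can be restored after threads insert in arbitrary order.

```python
import threading

class LabelCache:
    """Hypothetical shared cache; the lock makes add() a critical section."""
    def __init__(self):
        self._lock = threading.Lock()
        self._labels = []

    def add(self, layer_index, seq, text):
        with self._lock:
            self._labels.append((layer_index, seq, text))

    def in_layer_order(self):
        # Arrival order depends on thread scheduling, so sort by (layer, seq).
        with self._lock:
            return sorted(self._labels)

def draw_layer(cache, layer_index, label_count):
    """Stand-in for a layer draw that contributes labels to the shared cache."""
    for seq in range(label_count):
        cache.add(layer_index, seq, f"layer{layer_index}-label{seq}")

def draw_all(layer_count, labels_per_layer):
    cache = LabelCache()
    threads = [threading.Thread(target=draw_layer, args=(cache, i, labels_per_layer))
               for i in range(layer_count)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return cache.in_layer_order()

print(len(draw_all(4, 100)))
```

Keeping the critical section this small (one append) limits how long threads block each other, which matters for the overhead concern raised in this thread.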
>>>>> A threaded approach would let some layers get drawn while
>>>>> others are waiting on I/O or for query results, instead of
>>>>> the current serial approach where each layer is drawn in
>>>>> turn.  Multiprocessor machines could schedule the threads
>>>>> across all of their cores for simultaneous layer rendering.
>>>>> 
>>>>> It seems this could significantly speed up common-case
>>>>> rendering, especially on big machines, for very little
>>>>> overhead.  Has there been previous work in this area, or are
>>>>> any major drawbacks evident?
>>>>> 
>>>>> Thanks,
>>>>> 
>>>>> Dave Fuhry
>>>>> 
>> +-----------------------------------------------------------------+
>> |Paul Spencer                          pspencer at dmsolutions.ca |
>> +-----------------------------------------------------------------+
>> |Chief Technology Officer                                         |
>> |DM Solutions Group Inc                http://www.dmsolutions.ca/ |
>> +-----------------------------------------------------------------+