Parallelizing calls to msDrawLayer()

Sat Oct 13 19:17:50 EDT 2007

David Fuhry wrote:
> Tamas,
> 
>    (responses inline)
> 
> Tamas Szekeres wrote:
>> David,
>>
>> I think it would be reasonable to establish such a mechanism only
>> when fetching the data of the layers. The WMS/WFS layers are already
>> pre-downloaded in parallel before starting to draw the map.
>> We should take a similar approach when fetching the other layers as
>> well.
> 
>    Yes, I noticed that WMS/WFS layers are downloaded in parallel before 
> rendering begins.  And I agree, it would be advantageous to extend the 
> parallel-data-fetching paradigm to all layers.
> 
>    For non-WMS/WFS layers though, wouldn't it be a significant 
> disruption to the codebase to add lines 1 and 2 into msDrawMap()?
> 
> 1. for i=1 to layers.length (in parallel)
> 2.   data[i] = fetch_data_for_layer(i)

David,

I'm not sure this is a good idea as it might require a huge amount of 
memory to store all the pre-fetched data.

While in general I think mapserver could/should use more memory than it 
does today if it would speed up rendering, I think some tests should be 
done to see where the most bang for the buck is.

With shapefiles I have been able to get a 4-5x performance boost (i.e., the 
ability to generate 4-5x more map draws/sec) by putting all the 
shapefiles in a ramdisk, which eliminates all disk latency on fetching 
data. This also has the benefit that the ramdisk is shared by all instances 
of mapserver that are running.

So if you do that, then the bottleneck is rendering, and speeding up 
rendering would be great.

-Steve W

> 3. for i=1 to layers.length (serially)
> 4.   msDrawLayer(data[i])
> 
>   ISTM that the data-fetching logic might be best left abstracted 
> beneath msDrawLayer().
> 
>> However, pre-drawing all of the layers and later copying the layers
>> over the map image seems to be much less efficient.
> 
> Drawing n layers onto n imageObjs is no more expensive than drawing n 
> layers onto one imageObj, and the former can be parallelized across n 
> threads.
> Although yes, I agree that composition (the "merge" step) will cost 
> something.
> I'm entertaining the idea that the time saved by parallel fetching & 
> drawing might outweigh the cost of composition.
> 
>> With the parallel fetching approach, the only thread-safety issues we
>> would have to deal with are those in the drivers.
> 
> I might be misunderstanding your point here, but... Rendering a layer 
> into an independent imageObj should be a pretty independent operation, 
> and could be made so if it's not now.  Glancing at the mapserver 
> thread-safety FAQ, it seems there are more unsafe & locked components 
> related to data-fetching drivers than there are for rendering.  Which 
> makes me wonder why you suggest parallelizing the data-fetching but not 
> the rendering.
> 
> Forgive me if I'm playing a bit of devil's advocate here.  I'm aware 
> that non-reentrant functions don't rewrite themselves, and that critical 
> sections don't surround themselves with mutexes.  Surely though, it 
> ought not to be a tremendous amount of work to keep separate 
> layer-drawing operations from stepping on each other's toes?
> 
> Thanks,
> 
> Dave Fuhry
> 
>>
>> Best regards,
>>
>> Tamas
>>
>>
>>
>> 2007/10/12, David Fuhry <dfuhry at cs.kent.edu>:
>>> Has anyone looked into parallelizing the calls to msDraw[Query]Layer()
>>> in msDrawMap()?
>>>
>>> Although I'm new to the codebase, it seems that near the top of
>>> msDrawMap(), we could launch a thread for each (non-WMS/WFS) layer,
>>> rendering the layer's output onto its own imageObj.  Then where we now
>>> call msDraw[Query]Layer, wait for thread i to complete, and compose that
>>> layer's imageObj onto the map's imageObj.
>>>
>>> In msDraw[Query]Layer(), critical sections of the mapObj (adding labels
>>> to the label cache, for instance) would need to be protected by a mutex.
>>>
>>> A threaded approach would let some layers get drawn while others are
>>> waiting on I/O or for query results, instead of the current serial
>>> approach where each layer is drawn in turn.  Multiprocessor machines
>>> could schedule the threads across all of their cores for simultaneous
>>> layer rendering.
>>>
>>> It seems this could significantly speed up common-case rendering,
>>> especially on big machines, for very little overhead.  Has there been
>>> previous work in this area, or are any major drawbacks evident?
>>>
>>> Thanks,
>>>
>>> Dave Fuhry
>>>
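
The proposal in the original message (one thread per layer, each rendering onto its own image, then composition in draw order, with a mutex around shared map state) can be sketched roughly as follows. This is a toy illustration under stated assumptions, not MapServer code: `layer_image`, `shared_map_state`, `layer_job`, `draw_layer_thread`, `compose`, and `draw_map_parallel` are all hypothetical names, the "rendering" just paints a few pixels, and the label counter merely stands in for something like the label cache.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>
#include <string.h>

#define WIDTH  4
#define HEIGHT 4
#define NPIX   (WIDTH * HEIGHT)

/* Hypothetical stand-in for a per-layer imageObj: one byte per pixel,
 * where 0 means "transparent". */
typedef struct { unsigned char pix[NPIX]; } layer_image;

/* Hypothetical stand-in for shared mapObj state (e.g. a label cache). */
typedef struct { pthread_mutex_t lock; int num_labels; } shared_map_state;

typedef struct {
    int layer_index;           /* position in the draw order */
    layer_image *out;          /* private target image for this layer */
    shared_map_state *shared;  /* state shared by all layer threads */
} layer_job;

/* Thread body: render one layer into its own image. */
static void *draw_layer_thread(void *arg)
{
    layer_job *job = arg;
    assert(job != NULL);
    memset(job->out->pix, 0, NPIX);
    /* Toy rendering: layer i paints row (i % HEIGHT) with the value i + 1,
     * and every layer also paints the last pixel so that layers overlap. */
    int row = job->layer_index % HEIGHT;
    for (int x = 0; x < WIDTH; x++)
        job->out->pix[row * WIDTH + x] = (unsigned char)(job->layer_index + 1);
    job->out->pix[NPIX - 1] = (unsigned char)(job->layer_index + 1);
    /* Critical section: updates to shared state are serialized. */
    pthread_mutex_lock(&job->shared->lock);
    job->shared->num_labels++;
    pthread_mutex_unlock(&job->shared->lock);
    return NULL;
}

/* Merge step: non-transparent source pixels overwrite the map image. */
static void compose(unsigned char *dst, const layer_image *src)
{
    for (int i = 0; i < NPIX; i++)
        if (src->pix[i] != 0)
            dst[i] = src->pix[i];
}

/* Renders nlayers layers in parallel, then composes them serially in draw
 * order onto map_pix (NPIX bytes). Returns 0 on success, -1 on failure. */
int draw_map_parallel(int nlayers, unsigned char *map_pix, int *num_labels)
{
    if (nlayers <= 0 || map_pix == NULL)
        return -1;

    pthread_t *threads = malloc((size_t)nlayers * sizeof *threads);
    layer_job *jobs = malloc((size_t)nlayers * sizeof *jobs);
    layer_image *images = malloc((size_t)nlayers * sizeof *images);
    shared_map_state shared;
    int started = 0, rc = 0;

    if (!threads || !jobs || !images) {
        free(threads); free(jobs); free(images);
        return -1;
    }
    pthread_mutex_init(&shared.lock, NULL);
    shared.num_labels = 0;

    /* Phase 1: launch one rendering thread per layer. */
    for (int i = 0; i < nlayers; i++) {
        jobs[i].layer_index = i;
        jobs[i].out = &images[i];
        jobs[i].shared = &shared;
        if (pthread_create(&threads[i], NULL, draw_layer_thread, &jobs[i]) != 0) {
            rc = -1;
            break;
        }
        started++;
    }

    /* Phase 2: wait for thread i, then compose layer i, in draw order. */
    memset(map_pix, 0, NPIX);
    for (int i = 0; i < started; i++) {
        pthread_join(threads[i], NULL);
        if (rc == 0)
            compose(map_pix, &images[i]);
    }
    if (num_labels)
        *num_labels = shared.num_labels;

    pthread_mutex_destroy(&shared.lock);
    free(threads); free(jobs); free(images);
    return rc;
}
```

Because the join-and-compose loop runs in layer order, the final image does not depend on which thread finishes first. The cost the thread debates is visible here too: the sketch holds one full image per layer in memory and pays for an extra pass over every pixel in the merge step.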