Parallelizing calls to msDrawLayer()

Sun Oct 14 11:45:45 EDT 2007

Steve,

Stephen Woodbridge wrote:
> David,
> 
> I think the point that you are perhaps missing is that in a production 
> environment, mapserver is already multithreaded in a macro sense. I 
> typically have 5-10 mapserver instances running at any given moment. 
> These are all fetching data, rendering images and spooling them back to 
> the clients. Any process that is blocked by disk IO is allowing some 
> other process to be rendering, spooling or whatever. Since IO, and in 
> fact disk IO, is the bottleneck, there is little you can do to mapserver 
> to improve that.

I usually call that multiprocess, since there are multiple mapserv 
processes running.  But yes, if every core in your system is always 
saturated with CPU work, and if each of your disks always has a pending 
request, then multithreading will not help you.

> Your example is good if you only have one process running like in a 
> desktop application, but I'm not sure you get the same benefits when you 
> are running a multithreaded webserver and mapserver. And as others have 
> suggested, you are likely to create some thrashing if you generate even 
> more disk IO requests that the OS has to service.

Not more IO requests.  The same number of IO requests, but we ask the OS 
for them sooner.  See my latest email to Ed for some comments on thrashing.
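
A minimal sketch of that distinction, in Python rather than mapserver's C. Everything in it is hypothetical: `fetch_layer` stands in for whatever blocking I/O a layer needs, the layer names are made up, and the latency is simulated with a sleep. It only illustrates that the parallel version issues the same requests as the serial one, just without waiting for each to finish before asking for the next.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor

# Hypothetical layer names; not taken from any real mapfile.
LAYERS = ["base_raster", "roads", "rivers", "labels"]

request_log = []
log_lock = threading.Lock()


def fetch_layer(name, latency=0.02):
    """Stand-in for a blocking read (NFS, local disk, database)."""
    with log_lock:
        request_log.append(name)
    time.sleep(latency)  # simulated wait on I/O
    return f"data for {name}"


def fetch_serial(layers):
    # One request at a time: each waits for the previous to finish.
    return [fetch_layer(name) for name in layers]


def fetch_parallel(layers):
    # All requests are submitted up front; map() keeps layer order.
    with ThreadPoolExecutor(max_workers=len(layers)) as pool:
        return list(pool.map(fetch_layer, layers))


def count_requests(fetch):
    """Run a fetch strategy and report (results, number of requests issued)."""
    request_log.clear()
    results = fetch(LAYERS)
    return results, len(request_log)
```

Calling `count_requests(fetch_serial)` and `count_requests(fetch_parallel)` should report the same request count and the same results; what changes is how long the caller waits for them.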

> If your goal is to improve performance, then I am reminded of Ed's good 
> advice to me in the past about premature optimizations. If you optimize 
> before you identify the bottleneck, you might be making great 
> improvements on a bottleneck that only represents 1% of the performance 
> problem. So if you can make 100% improvement in that area, you have only 
> removed 1% of the total problem. It would be better to make a 10% 
> improvement on something that represented 50% of the problem. Or said in 
> other words: test and measure before you optimize.

Yes, premature optimization, and Amdahl's law.  I hear you.
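
As a rough sketch of the arithmetic behind that, here are two hypothetical helpers in Python (`time_saved` and `overall_speedup` are names invented for this illustration). The fractions are the ones from the example quoted above and from Ed's estimate that about 80% of the time goes to disk I/O; they are not measurements.

```python
def time_saved(fraction, reduction):
    """Share of total time removed when `fraction` of the run time
    is cut by `reduction` (both between 0 and 1)."""
    return fraction * reduction


def overall_speedup(fraction, reduction):
    """Amdahl-style overall speedup: old time / new time.

    Not defined when fraction * reduction == 1 (nothing left to time).
    """
    return 1.0 / (1.0 - time_saved(fraction, reduction))


# Eliminating a part that is 1% of the run time removes 1% of the total.
small_part = time_saved(0.01, 1.0)

# A 10% cut to a part that is 50% of the run time removes 5% of the total.
big_part = time_saved(0.50, 0.10)

# If 80% of the time is I/O, making the other 20% instantaneous
# gives at most a 1.25x overall speedup.
drawing_only = overall_speedup(0.20, 1.0)
```

The same helpers can be pointed at measured fractions once profiling data is available.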

> All that said, I would love to get any performance improvements anyone 
> is willing to make.

Yeah exactly.  This is just an idea.  I want to know whether doing this 
can make mapserver even faster, and what the limitations and 
opportunities are.

Thanks,

Dave

> -Steve
> 
> David Fuhry wrote:
>> Ed,
>>
>>    Indeed.  I'm very appreciative of everyone's comments.  I 
>> should clarify one point.
>>
>>    I agree that it would not be worthwhile to just parallelize 
>> rendering.  What I think might be worthwhile is to parallelize both 
>> rendering *and* the I/O that necessarily precedes it.  I view the 
>> latter as (through no fault of mapserver's) the bottleneck, and the 
>> former as perhaps icing on the cake.  Since the late bird doesn't get 
>> the worm ;), Ed, may I use you for an example?
>>
>>    You serve lots of raster.  Let's say your mapfile is composed of a 
>> base raster layer and a few vector layers on top.  A request is made. 
>> mapserv goes to render the first layer.  The tileindex is probably in 
>> the page buffer (memory), so it looks up the tile(s) quick and goes to 
>> fetch the raster image(s).
>>
>>    Maybe the images have to come across NFS.  The request goes over 
>> GigEth; ping / 2 says this takes 36ms.  The fileserver seeks its 
>> 7200RPM disk to the start of the TIFF; Seagate says this takes 8.5ms.  
>> Let's say the sequential read & transfer back take zero time, except 
>> the 36ms lower-bound on network time.
>>
>>    What has mapserv done in this 36 + 8.5 + 36 = 80.5ms?  Nothing. 
>> Just waited on I/O.  It could have perhaps rendered several vector 
>> layers from ramdisk in this time.
>>
>>    Now Frank's GDAL goes to work mosaicing/clipping/warping/resizing 
>> the raster image(s).  It's chewing CPU.  Let's say that layers 2 thru 
>> n are shapefiles on the local disk.  What is the local disk doing 
>> during this time?  Nothing.  It could be checking for the existence of 
>> layer2.qix, or seeking to the start of layer2.shp, or likewise for 
>> layers3 thru n.  None of which will interfere with GDAL's work. 
>> Instead, we wait until GDAL is through, then waste 8.5ms seeking to 
>> the start of layer2.shp.  We could have sought layer2.shp even 
>> earlier, while waiting on the NFS request.
>>
>>    The thing which seems beautiful to me, is that OS schedulers (both 
>> process & I/O) are designed to be good at receiving a bunch of 
>> requests, and resolving them efficiently.  I think there may be value 
>> in launching a thread for each layer, thus throwing all the requests 
>> up against the OS at once, and letting its schedulers try to make the 
>> best use of CPU and I/O resources.  I have to imagine that they are 
>> likely to do better than having everything wait in line.
>>
>>    Yes, this would put more of a strain on the server at that instant. 
>> By pipelining I/O though, it will also return a map quicker.
>>
>> Thanks,
>>
>> Dave
>>
>>
>> Ed McNierney wrote:
>>> David -
>>>
>>> I didn't have the time for a thoughtful reply earlier, and now most 
>>> other folks have already raised some of the concerns I had.  I should 
>>> hesitate more often, I guess - it saves typing <g>.
>>>
>>> I think Steve Woodbridge's comment is informative.  For his 
>>> application he found a 4x-5x improvement by caching data files in a 
>>> RAM disk.  That basically says that something like 80% of his entire 
>>> MapServer rendering time is spent in disk I/O, not drawing.  Many 
>>> users don't have a situation in which they can put their data in RAM 
>>> disk.  For a comparable kind of application you would then reasonably 
>>> predict that optimizing multi-layer rendering so it was instantaneous 
>>> would only produce a 20% performance improvement.
>>>
>>> Although I think MapServer's disk I/O is pretty good, if I were to 
>>> spend time hunting for performance improvements I would be inclined 
>>> to look at the various data I/O schemes.  Anything that can be done 
>>> to reduce disk I/O is a big win (some of those improvements are, of 
>>> course, external to MapServer itself in the form of data organization 
>>> and indexing schemes).
>>>
>>>     - Ed
>>>
>>> Ed McNierney
>>> Chief Mapmaker
>>> Demand Media / TopoZone.com
>>> 73 Princeton Street, Suite 305
>>> North Chelmsford, MA  01863
>>> Phone: 978-251-4242, Fax: 978-251-1396
>>> ed at topozone.com
>>>
>>>
>>>
>>> -----Original Message-----
>>> From: UMN MapServer Developers List 
>>> [mailto:MAPSERVER-DEV at LISTS.UMN.EDU] On Behalf Of David Fuhry
>>> Sent: Saturday, October 13, 2007 8:39 PM
>>> To: MAPSERVER-DEV at LISTS.UMN.EDU
>>> Subject: Re: [UMN_MAPSERVER-DEV] Parallelizing calls to msDrawLayer()
>>>
>>> Paul,
>>>
>>>     Thanks, that's a good suggestion.
>>>
>>>     I guess my thought is, given a really good implementation, a 
>>> heavily-contended server with a bright scheduler would just end up 
>>> scheduling the threads sequentially on the same CPU (perhaps likely, 
>>> since a small bit of the necessary data is in that processor's L1 
>>> cache already).  Then the onus is on the implementer to make sure 
>>> that the extra overhead is pretty low.
>>>
>>>     It sort of pushes some of the responsibility to the OS scheduler. 
>>> Which I think most of the time, will make better decisions than will 
>>> a deterministically-ordered mapserv loop.
>>>
>>> Thanks,
>>>
>>> Dave
>>>
>>> Paul Spencer wrote:
>>>> David,
>>>>
>>>> While you can perhaps gain some performance in a single map draw, in 
>>>> most real life uses of mapserver, folks are either serving many 
>>>> simultaneous requests or generating tiles in some way.  I think in 
>>>> either case, the addition of multi-threaded layer draws will 
>>>> actually cause contention for processor time with the multiple 
>>>> processes that are serving the requests and could hurt overall 
>>>> performance in high load systems.
>>>>
>>>> I think that you could probably get more bang for your development 
>>>> bucks by investing time in profiling the existing code.
>>>>
>>>> Cheers
>>>>
>>>> Paul
>>>>
>>>> On 13-Oct-07, at 6:37 PM, David Fuhry wrote:
>>>>
>>>>> Tamas,
>>>>>
>>>>>    (responses inline)
>>>>>
>>>>> Tamas Szekeres wrote:
>>>>>> David,
>>>>>> I think it would be reasonable to establish such a mechanism only
>>>>>> when fetching the data of the layers. Currently the WMS/WFS layers
>>>>>> are already pre-downloaded in parallel before starting to draw the
>>>>>> map. We should have a similar approach when fetching the other
>>>>>> layers as well.
>>>>>    Yes, I noticed that WMS/WFS layers are downloaded in parallel 
>>>>> before rendering begins.  And I agree, it would be advantageous to 
>>>>> extend the parallel-data-fetching paradigm to all layers.
>>>>>
>>>>>    For non-WMS/WFS layers though, wouldn't it be a significant 
>>>>> disruption to the codebase to add lines 1 and 2 into msDrawMap()?
>>>>>
>>>>> 1. for i=1 to layers.length (in parallel)
>>>>> 2.   data[i] = fetch_data_for_layer(i)
>>>>> 3. for i=1 to layers.length (serially)
>>>>> 4.   msDrawLayer(data[i])
>>>>>
>>>>>   ISTM that the data-fetching logic might be best left abstracted 
>>>>> beneath msDrawLayer().
>>>>>
>>>>>> However, pre-drawing all of the layers and later copying the layers
>>>>>> over the map image seems to be much less efficient.
>>>>> Drawing n layers onto n imageObjs is no more expensive than drawing 
>>>>> n layers onto one imageObj, and the former can be parallelized 
>>>>> across n threads.
>>>>> Although yes, I agree that composition (the "merge" step) will cost 
>>>>> something.
>>>>> I'm entertaining the idea that the time saved by parallel fetching 
>>>>> & drawing might outweigh the cost of composition.
>>>>>
>>>>>> When using the parallel fetching approach, the only thread-safety
>>>>>> issues we should have to deal with are those in the drivers.
>>>>> I might be misunderstanding your point here, but... Rendering a 
>>>>> layer into an independent imageObj should be a pretty independent 
>>>>> operation, and could be made so if it's not now.  Glancing at the 
>>>>> mapserver thread-safety FAQ, it seems there are more unsafe & 
>>>>> locked components related to data-fetching drivers than there are 
>>>>> for rendering.  Which makes me wonder why you suggest parallelizing 
>>>>> the data-fetching but not the rendering.
>>>>>
>>>>> Forgive me if I'm playing a bit of devil's advocate here.  I'm 
>>>>> aware that non-reentrant functions don't rewrite themselves, and 
>>>>> that critical sections don't surround themselves with mutexes.  
>>>>> Surely though, it ought not to be a tremendous amount of work to 
>>>>> keep separate layer-drawing operations from stepping on each 
>>>>> other's toes?
>>>>>
>>>>> Thanks,
>>>>>
>>>>> Dave Fuhry
>>>>>
>>>>>> Best regards,
>>>>>> Tamas
>>>>>> 2007/10/12, David Fuhry <dfuhry at cs.kent.edu>:
>>>>>>> Has anyone looked into parallelizing the calls to
>>>>>>> msDraw[Query]Layer() in msDrawMap()?
>>>>>>>
>>>>>>> Although I'm new to the codebase, it seems that near the top of
>>>>>>> msDrawMap(), we could launch a thread for each (non-WMS/WFS) layer,
>>>>>>> rendering the layer's output onto its own imageObj.  Then where we
>>>>>>> now call msDraw[Query]Layer, wait for thread i to complete, and
>>>>>>> compose that layer's imageObj onto the map's imageObj.
>>>>>>>
>>>>>>> In msDraw[Query]Layer(), critical sections of the mapObj (adding
>>>>>>> labels to the label cache, for instance) would need to be protected
>>>>>>> by a mutex.
>>>>>>>
>>>>>>> A threaded approach would let some layers get drawn while others are
>>>>>>> waiting on I/O or for query results, instead of the current serial
>>>>>>> approach where each layer is drawn in turn.  Multiprocessor machines
>>>>>>> could schedule the threads across all of their cores for
>>>>>>> simultaneous layer rendering.
>>>>>>>
>>>>>>> It seems this could significantly speed up common-case rendering,
>>>>>>> especially on big machines, for very little overhead.  Has there
>>>>>>> been previous work in this area, or are any major drawbacks evident?
>>>>>>>
>>>>>>> Thanks,
>>>>>>>
>>>>>>> Dave Fuhry
>>>>>>>
>>>> +-----------------------------------------------------------------+
>>>> |Paul Spencer                          pspencer at dmsolutions.ca    |
>>>> +-----------------------------------------------------------------+
>>>> |Chief Technology Officer                                         |
>>>> |DM Solutions Group Inc                http://www.dmsolutions.ca/ |
>>>> +-----------------------------------------------------------------+