Parallelizing calls to msDrawLayer()
dfuhry at CS.KENT.EDU
Sat Oct 13 20:20:47 EDT 2007
Tamas Szekeres wrote:
> 2007/10/14, David Fuhry <dfuhry at cs.kent.edu>:
>> I might be misunderstanding your point here, but... Rendering a layer
>> into an independent imageObj should be a pretty independent operation,
>> and could be made so if it's not now.
> If the vtable functions implemented by the driver are not reentrant
> then the rendering of the layers connected to the same driver is
> definitely dependent. The drawing itself might be created
> independently if mapserver and gd or agg could avoid using global
> variables during the drawings.
Right, that would be necessary.
> The same applies to the drivers as well, however it's considerably more
> difficult to audit the code from this aspect because it might as well
> depend on the underlying libraries. Moreover we should consider not
> only the globals (globally accessed static variables) but also all of
> the potential common resources like database connections, file handles
Right, we would have to lock those, or better, make them "expandable".
For example, two threads rendering PostGIS layers would need to render
sequentially in turn, or better, each have their own db connection
(requiring some modification of the currently common db connection code).
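A minimal sketch of the "each have their own db connection" option, under loud assumptions: `fake_conn`, `layer_job` and `draw_layers_parallel()` are hypothetical names invented for illustration, not MapServer or PostGIS types, and the "query" and "draw" steps are just counters. The point is only that state owned by one thread needs no lock:

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-in for a database connection; not a MapServer type. */
typedef struct {
    int id;           /* which connection this is */
    int queries_run;  /* touched only by the owning thread */
} fake_conn;

/* One unit of work: a layer plus the connection private to its thread. */
typedef struct {
    const char *layer_name;
    fake_conn conn;
    int shapes_drawn;
} layer_job;

/* Thread body: "fetch" and "draw" one layer using its own connection. */
static void *draw_layer_job(void *arg)
{
    layer_job *job = arg;
    assert(job != NULL);
    for (int i = 0; i < 1000; i++) {
        job->conn.queries_run++;  /* stands in for a query on the private connection */
        job->shapes_drawn++;      /* stands in for rendering the fetched shape */
    }
    return NULL;
}

/* Draw up to 16 layers in parallel; returns total shapes drawn, or -1 on error. */
int draw_layers_parallel(layer_job *jobs, int n)
{
    pthread_t tids[16];
    int total = 0;

    if (jobs == NULL || n < 0 || n > 16)
        return -1;

    for (int i = 0; i < n; i++) {
        jobs[i].conn.id = i;
        jobs[i].conn.queries_run = 0;
        jobs[i].shapes_drawn = 0;
        if (pthread_create(&tids[i], NULL, draw_layer_job, &jobs[i]) != 0) {
            /* Clean up the threads that did start before reporting failure. */
            for (int j = 0; j < i; j++)
                pthread_join(tids[j], NULL);
            return -1;
        }
    }
    for (int i = 0; i < n; i++) {
        pthread_join(tids[i], NULL);
        total += jobs[i].shapes_drawn;
    }
    return total;
}
```

The alternative (rendering "sequentially in turn") would keep a single shared connection and take a lock around every use of it, which is simpler to retrofit but serializes the fetches.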
>> Glancing at the mapserver
>> thread-safety FAQ, it seems there are more unsafe & locked components
>> related to data-fetching drivers than there are for rendering. Which
>> makes me wonder why you suggest parallelizing the data-fetching but not
>> the rendering.
> Because I expect a significantly greater increase in performance from
> parallelizing the data retrieval than the drawing (+ the extra image
> overlays) itself.
Absolutely agreed. I wonder if it's actually /easier/ to do both (by
wrapping each msDrawLayer() in a thread) than to do just parallel
>> Forgive me if I'm playing a bit of devil's advocate here. I'm aware
>> that non-reentrant functions don't rewrite themselves, and that critical
>> sections don't surround themselves with mutexes.
> Using a mutex in that function would serialize the operation and
> definitely kill the parallel behaviour. However currently the driver
> operations are quite separated in fairly atomic operations so it
> wouldn't involve too many problems.
Oh, I'm just saying that some additional sections of code here and there
will need to be locked to make them thread-safe. Nothing too
performance critical, I wouldn't think.
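To illustrate what I mean by locking only small sections, here is a rough sketch. Everything in it is hypothetical: `shared_counter` stands in for whatever global state a real code path touches, and `run_workers()` is not a MapServer function. Each thread does its bulk work unlocked and takes the mutex only for the brief shared update:

```c
#include <assert.h>
#include <pthread.h>

static pthread_mutex_t shared_lock = PTHREAD_MUTEX_INITIALIZER;
static long shared_counter = 0;  /* stands in for some piece of global state */

static void *worker(void *arg)
{
    int iterations = *(int *)arg;
    long local = 0;

    /* The expensive part runs without any lock held. */
    for (int i = 0; i < iterations; i++)
        local += 1;

    /* Only the short update of shared state is serialized. */
    pthread_mutex_lock(&shared_lock);
    shared_counter += local;
    pthread_mutex_unlock(&shared_lock);
    return NULL;
}

/* Run up to 16 workers; returns the final shared total, or -1 on error. */
long run_workers(int nthreads, int iterations)
{
    pthread_t tids[16];
    int started = 0;
    long result;

    assert(iterations >= 0);
    if (nthreads < 0 || nthreads > 16)
        return -1;

    pthread_mutex_lock(&shared_lock);
    shared_counter = 0;
    pthread_mutex_unlock(&shared_lock);

    for (int i = 0; i < nthreads; i++) {
        if (pthread_create(&tids[i], NULL, worker, &iterations) != 0)
            break;
        started++;
    }
    for (int i = 0; i < started; i++)
        pthread_join(tids[i], NULL);

    /* All workers are joined, so reading without the lock is safe here. */
    result = shared_counter;
    return started == nthreads ? result : -1;
}
```

If the locked region were the whole loop rather than the final update, the threads would run one after another, which is the serialization concern raised above.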
>> Surely though, it
>> ought not to be a tremendous amount of work to keep separate
>> layer-drawing operations from stepping on each other's toes?
> I'm pretty sure currently parallelizing the data retrieval is more
> trivial than reconstructing the drawing logic inside mapserver.
> For example the LayerWhichShapes data provider functions would trigger
> an asynchronous fetch operation to the data source and later the
> NextShape would serve the retrieved data from the memory when drawing
> the map.
Ah ok, glancing at maplayer.c and mapdraw.c, I'm starting to see what
you mean. So msDrawVectorLayer currently loops like:
    while (s = layer->vtable->NextShape())
and your thought is to... buffer the shapes (some of them, or all of
them) with asynchronous NextShape calls, then render the buffer? I
think I fail to grasp the full picture, because what will be going on
while NextShape() asynchronously fetches the next shape(s)? The answer
can't be "nothing", or we fail to exploit parallelism.
Or are you suggesting we fetch the /first/ shape of every layer in
parallel, so as to get the rest of the shapes queued up behind the first
one (depending on the driver, sort of)?
Steve W. had valid concerns that overzealous buffering would use
excessive memory. I see now that msDrawVectorLayer() uses a pipelined
approach which keeps minimal geometry (a single shape) around at once,
leaving buffering decisions to the driver. I like it.
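For what it's worth, one way the buffering idea and the memory concern might be reconciled is a small bounded queue between a fetching thread and the rendering loop. This is only a sketch under assumptions: `shape_queue`, `fetch_thread()` and `draw_layer_buffered()` are made-up names, plain ints stand in for shapes, and no real driver is involved. The fetcher blocks once `QUEUE_CAP` shapes are waiting, so memory stays bounded while fetching and drawing overlap:

```c
#include <assert.h>
#include <pthread.h>

#define QUEUE_CAP 4       /* upper bound on shapes held in memory at once */
#define END_OF_LAYER (-1) /* sentinel pushed after the last shape */

typedef struct {
    int items[QUEUE_CAP]; /* ints stand in for shapes */
    int head, count;
    pthread_mutex_t lock;
    pthread_cond_t not_empty, not_full;
} shape_queue;

static void queue_init(shape_queue *q)
{
    q->head = q->count = 0;
    pthread_mutex_init(&q->lock, NULL);
    pthread_cond_init(&q->not_empty, NULL);
    pthread_cond_init(&q->not_full, NULL);
}

/* Blocks while the queue is full, which is what bounds memory use. */
static void queue_push(shape_queue *q, int shape)
{
    pthread_mutex_lock(&q->lock);
    while (q->count == QUEUE_CAP)
        pthread_cond_wait(&q->not_full, &q->lock);
    q->items[(q->head + q->count) % QUEUE_CAP] = shape;
    q->count++;
    pthread_cond_signal(&q->not_empty);
    pthread_mutex_unlock(&q->lock);
}

/* Blocks while the queue is empty. */
static int queue_pop(shape_queue *q)
{
    int shape;
    pthread_mutex_lock(&q->lock);
    while (q->count == 0)
        pthread_cond_wait(&q->not_empty, &q->lock);
    shape = q->items[q->head];
    q->head = (q->head + 1) % QUEUE_CAP;
    q->count--;
    pthread_cond_signal(&q->not_full);
    pthread_mutex_unlock(&q->lock);
    return shape;
}

typedef struct { shape_queue *q; int nshapes; } fetch_args;

/* Producer: stands in for a driver fetching shapes from its data source. */
static void *fetch_thread(void *arg)
{
    fetch_args *a = arg;
    for (int i = 0; i < a->nshapes; i++)
        queue_push(a->q, i);
    queue_push(a->q, END_OF_LAYER);
    return NULL;
}

/* Consumer: "renders" shapes as they arrive; returns the count, or -1 on error. */
int draw_layer_buffered(int nshapes)
{
    shape_queue q;
    fetch_args args = { &q, nshapes };
    pthread_t tid;
    int drawn = 0;

    assert(nshapes >= 0);
    queue_init(&q);
    if (pthread_create(&tid, NULL, fetch_thread, &args) != 0)
        return -1;
    while (queue_pop(&q) != END_OF_LAYER)
        drawn++;  /* stands in for drawing one shape */
    pthread_join(tid, NULL);

    pthread_cond_destroy(&q.not_full);
    pthread_cond_destroy(&q.not_empty);
    pthread_mutex_destroy(&q.lock);
    return drawn;
}
```

With `QUEUE_CAP` set to 1 this degenerates to roughly the current one-shape-at-a-time pipeline; larger values trade memory for more overlap between fetching and drawing.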
> Best regards,