Cluster/Supercomputer/HPC variants of Mapserver

Fri Jan 20 06:09:44 PST 2006

If I'm reading between the lines correctly  :c)

You could serve the static data as a background map (which can be tuned 
to render very quickly) and then combine the dynamic data with it 
at the client as an overlay image.  The dynamic data wouldn't necessarily 
even need to go through MapServer; you could use GD to construct 
a new overlay image for the client to view directly.  You would 
essentially separate the processes into discrete services this way, and 
combine them at the client.
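
A rough sketch of the overlay step, to make the idea concrete.  This is 
only an illustration under assumptions: it uses plain Python rather than 
GD, the function names (world_to_pixel, build_overlay) are made up for 
the example, and it assumes the overlay shares the background map's 
extent and pixel size so the two line up at the client.

```python
# Hypothetical sketch: names and sizes are illustrative, not MapServer or GD APIs.

def world_to_pixel(x, y, extent, width, height):
    """Map a coordinate to (col, row) in an image covering `extent`.

    extent is (minx, miny, maxx, maxy). Row 0 is the top of the image,
    so the y axis is flipped relative to map coordinates.
    """
    minx, miny, maxx, maxy = extent
    col = int((x - minx) / (maxx - minx) * (width - 1))
    row = int((maxy - y) / (maxy - miny) * (height - 1))
    return col, row


def build_overlay(points, extent, width, height, colour=(255, 0, 0, 255)):
    """Return a width*height RGBA buffer, transparent except for one-pixel dots."""
    pixels = bytearray(width * height * 4)  # all zeros = fully transparent
    minx, miny, maxx, maxy = extent
    for x, y in points:
        if not (minx <= x <= maxx and miny <= y <= maxy):
            continue  # skip points outside the background map's extent
        col, row = world_to_pixel(x, y, extent, width, height)
        offset = (row * width + col) * 4
        pixels[offset:offset + 4] = bytes(colour)
    return pixels


# Example: two points, one of which falls outside the extent and is skipped.
overlay = build_overlay([(50, 50), (200, 200)], (0, 0, 100, 100), 101, 101)
```

The buffer would still need encoding as a transparent PNG or GIF (by GD 
or whatever image library is to hand) before the client can layer it 
over the cached background.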

I've done the same thing for a number of past projects.   I believe I 
know what you're attempting to do here with regard to the data 
collection constraints.

Maybe I said too much . . .  :c)  I can elaborate more if asked.

bobb

Biz King wrote:

> Hi Ed.
>
> I did understand the difference between data sets and the additional  
> data that is being added.
> Unfortunately, other than the shapefiles we have for our actual  
> topological and feature maps, the data is simply a series of  
> coordinates that require placing on the maps.
> There can be any number of coordinates, from one to several thousand.  
> We have no control whatsoever about that data. We're reporting back  
> on locations and events which may be changing every second.
> These coordinates are generated by the activity that the item(s)  
> carry out. We are required to display that information back to the  
> user(s).
>
> I have reduced the time to create individual images down to an  
> average of around 2 seconds from 8 seconds originally by extended  
> fiddling with the map file(s). Empty data is considerably quicker.
> It may therefore (taking into account everyone's comments) be both  
> simpler and easier to simply use one machine per image, but to have a  
> 'master' machine allocating the work to one machine from a pool of  
> available machines and writing the images out to a central file system.
>
> Thanks for all your help and advice, I'm probably just as frustrated  
> as you guys by being unable to detail the exact situation we are in!
>
> We're hoping our budget is going to expand and let us recruit a
> full-time mapserver specialist/consultant for this project in the next few
> weeks.
>
> Anyone UK based (Sheffield area?) please drop me a line off list if  
> interested. Mac experience not required if experienced Unix/Linux  
> user/developer.
>
>
> cheers
>
> Biz
>
> On 20 Jan 2006, at 12:33, Ed McNierney wrote:
>
>> Biz -
>>
>> We were all talking about optimization of the data sets, not the code.
>>
>> It is not clear that your "parallelism" approach described below will
>> help very much.  It is quite possible that it will make things worse.
>> There are quite a few high-performance MapServer implementations in
>> existence already; you seem to be under the impression that your
>> application is quite unusual, and that may not be the case.
>>
>> But if you really can't provide any more information about your
>> application, it's very difficult for us to provide any help.  Please
>> remember, however, that whether you are working on one server or
>> multiple servers, well-organized data will help you considerably.
>>
>>     - Ed
>>
>> Ed McNierney
>> President and Chief Mapmaker
>> TopoZone.com / Maps a la carte, Inc.
>> 73 Princeton Street, Suite 305
>> North Chelmsford, MA  01863
>> Phone: +1 (978) 251-4242
>> Fax: +1 (978) 251-1396
>> ed at topozone.com
>>
>> -----Original Message-----
>> From: UMN MapServer Users List [mailto:MAPSERVER-USERS at LISTS.UMN.EDU]
>> On Behalf Of Biz King
>> Sent: Friday, January 20, 2006 4:53 AM
>> To: MAPSERVER-USERS at LISTS.UMN.EDU
>> Subject: Re: [UMN_MAPSERVER-USERS] Cluster/Supercomputer/HPC variants of
>> Mapserver
>>
>> Whilst wholeheartedly agreeing that the code requires substantial
>> optimisation, I think I have arrived at a reasonably logical process
>> that allows me to create a limited parallelization of the serial process
>> of drawing maps.
>> I intend to segment (tile?) the extents of the parameters being rendered
>> by different machines into a series of proportional grids, send these
>> parallel requests to cluster/Xgrid members, get the results back
>> (probably as a stream) and then join the stream images back together
>> into one coherent image and save that to the central file system which
>> the web servers access.
>>
>> We're actually having to produce live feedback of item locations (they
>> can move, but can also remain static!) AND item activity. (We're subject
>> to NDA and cannot give enough detail to explain clearly, which is
>> irritating.) We then have to place this into a graphical
>> format (map with colour-coded dots indicating location, activity and
>> status). Just to make things interesting, the extents of the data being
>> displayed change over time, so a series of pre-defined layer images
>> just won't work as there are no 'standard' coordinate extents that we
>> can work to.
>>
>> The client wants as near real-time feedback as is possible, which
>> compounds things even more!
>>
>> I'll report back with more information as soon as I can.
>>
>> cheers
>> Biz
>> On 19 Jan 2006, at 16:47, David Bitner wrote:
>>
>>> Without getting into any clustering, there is probably a lot of
>>> optimization that you could do to your datasets.  There are a number
>>> of posts in the archives for this list and documentation on
>>> mapserver.gis.umn.edu on doing things like creating overviews at
>>> different resolutions and tiling for rasters that could likely help
>>> speed up the process.
>>>
>>>
>>> On 1/19/06, Biz King <biz.king at mac.com> wrote:
>>>
>>>> Hi All.
>>>>
>>>> Is anyone aware of anywhere (or better still, has experience of)
>>>> running Mapserver via an MPI/Grid interface or as a cluster?
>>>>
>>>> We're trying to develop a high-performance mapserver that can cope
>>>> with the load we're going to be throwing at it!  Currently it takes
>>>> 298 seconds (on a Mac OS X Server, 3.5 GB RAM, dual 2 GHz processors)
>>>> to do what we need done in under 60 seconds!  There's not much we can
>>>> do to cut down the load as we're creating a whole series of nodes on
>>>> a layer via a database and we're then creating the imagery based on
>>>> these items and outputting them to graphics formats in varying sizes.
>>>>
>>>> The results get fed to users on demand without the delays associated
>>>> with 'on the fly' image creation.
>>>>
>>>> Any help will be welcomed!
>>>>
>>>> cheers
>>>>
>>>> Biz
>>>>
>