Cluster/Supercomputer/HPC variants of Mapserver

Fri Jan 20 05:58:06 PST 2006

Hi Ed.

I did understand the difference between data sets and the additional  
data that is being added.
Unfortunately, other than the shapefiles we have for our actual  
topological and feature maps, the data is simply a series of  
coordinates that require placing on the maps.
There can be any number of coordinates, from one to several thousand.  
We have no control whatsoever over that data. We're reporting back  
on locations and events which may be changing every second.
These coordinates are generated by the activity that the item(s)  
carry out. We are required to display that information back to the  
user(s).

I have reduced the time to create individual images down to an  
average of around 2 seconds from 8 seconds originally by extended  
fiddling with the map file(s). Empty data is considerably quicker.
It may therefore (taking into account everyone's comments) be
simpler to use one machine per image, with a 'master' machine
allocating each image to one machine from a pool of available
machines and writing the images out to a central file system.
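
A rough sketch of that 'master' allocation, in Python and purely for illustration: the machine names, the render_on_machine() helper and the output file names are all made up, and the "render" just writes a small text file where a real MapServer request would go. Each job is handed to exactly one free machine, and the result is written to a shared directory.

```python
import queue
import tempfile
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path

# Hypothetical pool of render machines; the names are placeholders.
MACHINES = ["render01", "render02", "render03"]


def render_on_machine(machine, job_id, points, out_dir):
    """Stand-in for a remote MapServer render of one complete image.

    A real version would send the request to `machine` and save the
    returned image; here we only write a small text file so that the
    dispatch logic can be exercised locally.
    """
    out_path = Path(out_dir) / f"map_{job_id}.txt"
    out_path.write_text(f"{machine} drew {len(points)} points\n")
    return out_path


def dispatch(jobs, out_dir, machines=MACHINES):
    """Give each job to exactly one free machine from the pool."""
    free = queue.Queue()
    for name in machines:
        free.put(name)

    def run(job):
        job_id, points = job
        machine = free.get()      # block until a machine is free
        try:
            return render_on_machine(machine, job_id, points, out_dir)
        finally:
            free.put(machine)     # hand the machine back to the pool

    # One worker thread per machine, so no more jobs run than machines exist.
    with ThreadPoolExecutor(max_workers=len(machines)) as pool:
        return list(pool.map(run, jobs))


if __name__ == "__main__":
    # Jobs carry anywhere from zero to several thousand coordinates.
    jobs = [(i, [(0.0, 0.0)] * n) for i, n in enumerate([1, 250, 4000, 0])]
    with tempfile.TemporaryDirectory() as shared_dir:
        for path in dispatch(jobs, shared_dir):
            print(path.name, "->", path.read_text().strip())
```

The queue of free machines is what keeps it to one image per machine at a time; a temporary directory stands in for the central file system here.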

Thanks for all your help and advice, I'm probably just as frustrated  
as you guys by being unable to detail the exact situation we are in!

We're hoping our budget is going to expand and let us recruit a
full-time mapserver specialist/consultant for this project in the
next few weeks.

Anyone UK based (Sheffield area?) please drop me a line off list if  
interested. Mac experience not required if experienced Unix/Linux  
user/developer.

cheers

Biz

On 20 Jan 2006, at 12:33, Ed McNierney wrote:

> Biz -
>
> We were all talking about optimization of the data sets, not the code.
>
> It is not clear that your "parallelism" approach described below will
> help very much.  It is quite possible that it will make things worse.
> There are quite a few high-performance MapServer implementations in
> existence already; you seem to be under the impression that your
> application is quite unusual, and that may not be the case.
>
> But if you really can't provide any more information about your
> application, it's very difficult for us to provide any help.  Please
> remember, however, that whether you are working on one server or
> multiple servers, well-organized data will help you considerably.
>
> 	- Ed
>
> Ed McNierney
> President and Chief Mapmaker
> TopoZone.com / Maps a la carte, Inc.
> 73 Princeton Street, Suite 305
> North Chelmsford, MA  01863
> Phone: +1 (978) 251-4242
> Fax: +1 (978) 251-1396
> ed at topozone.com
>
> -----Original Message-----
> From: UMN MapServer Users List
> [mailto:MAPSERVER-USERS at LISTS.UMN.EDU] On Behalf Of Biz King
> Sent: Friday, January 20, 2006 4:53 AM
> To: MAPSERVER-USERS at LISTS.UMN.EDU
> Subject: Re: [UMN_MAPSERVER-USERS] Cluster/Supercomputer/HPC
> variants of Mapserver
>
> Whilst wholeheartedly agreeing that the code requires substantial
> optimisation, I think I have arrived at a reasonably logical process
> that allows me to create a limited parallelization of the serial
> process of drawing maps.
> I intend to segment (tile?) the extents of the parameters being
> rendered by different machines into a series of proportional grids,
> send these parallel requests to cluster/Xgrid members, get the results
> back (probably as a stream) and then join the stream images back
> together into one coherent image and save that to the central file
> system which the web servers access.
>
> We're actually having to produce live feedback of item locations (they
> can move, but can also remain static!) AND item activity (We're
> subject to NDA, and cannot give enough detail to explain clearly!
> Which is irritating.) which we then have to place into a graphical
> format (map with colour-coded dots indicating location, activity and
> status). Just to make things interesting, the extents of the data
> being displayed change over time, so a series of pre-defined layer
> images just won't work as there are no 'standard' coordinate extents
> that we can work to.
>
> The client wants as near real-time feedback as is possible, which
> compounds things even more!
>
> I'll report back with more information as soon as I can.
>
> cheers
> Biz
> On 19 Jan 2006, at 16:47, David Bitner wrote:
>
>> Without getting into any clustering, there is probably a lot of
>> optimization that you could do to your datasets.  There are a number
>> of posts in the archives for this list and documentation on
>> mapserver.gis.umn.edu on doing things like creating overviews at
>> different resolutions and tiling for rasters that could likely help
>> speed up the process.
>>
>>
>> On 1/19/06, Biz King <biz.king at mac.com> wrote:
>>> Hi All.
>>>
>>> Is anyone aware of anywhere (or better still, has experience of)
>>> running Mapserver via an MPI/Grid interface or as a cluster?
>>>
>>> We're trying to develop a high-performance mapserver that can cope
>>> with the load we're going to be throwing at it!  Currently it takes
>>> 298 seconds (on a Mac OS X Server, 3.5 GB RAM, dual 2 GHz processors)
>>> to do what we need done in under 60 seconds!  There's not much we
>>> can do to cut down the load as we're creating a whole series of nodes
>>> on a layer via a database and we're then creating the imagery based
>>> on these items and outputting them to graphics formats in varying
>>> sizes.
>>>
>>> The results get fed to users on demand without the delays associated
>>> with 'on the fly' image creation.
>>>
>>> Any help will be welcomed!
>>>
>>> cheers
>>>
>>> Biz
>>>