Cluster/Supercomputer/HPC variants of Mapserver

David Bitner osgis.lists at GMAIL.COM
Fri Jan 20 06:53:46 PST 2006


How are you storing these point coordinates that continually change?
From the details you have given so far, it looks like some of your
issues could really be helped by using a spatial database like
PostgreSQL/PostGIS.  That way the database can do the heavy lifting of
selecting which points are to be displayed, and mapserver only has to
do the easy part: drawing them.
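
As a rough sketch of what "letting the database select the points"
could look like, here is a small Python helper that builds a
parameterised bounding-box query for one map extent.  The table name
item_positions, the columns id/status/geom and the SRID 4326 are all
made up for illustration, and ST_MakeEnvelope assumes a PostGIS version
that provides it.

```python
def points_in_extent_query(minx, miny, maxx, maxy, srid=4326):
    """Build a parameterised query for the points inside one map extent.

    The && operator compares bounding boxes, so PostGIS can answer it
    from a spatial (GiST) index on geom instead of scanning every row.
    """
    sql = (
        "SELECT id, status, ST_X(geom) AS x, ST_Y(geom) AS y "
        "FROM item_positions "  # hypothetical table name
        "WHERE geom && ST_MakeEnvelope(%s, %s, %s, %s, %s)"
    )
    # Values are passed separately so the database driver does the quoting.
    return sql, (minx, miny, maxx, maxy, srid)
```

The returned pair is meant to be handed to a driver that uses %s
placeholders (psycopg2 is one such driver), e.g.
cursor.execute(sql, params).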

Mapserver is really quite efficient at drawing if you feed it well
thought out datasets that are indexed, tiled, and generalized where
appropriate.  Also, if you are using any background datasets where you
only need part of the features, consider pre-filtering them.  For
example, if at a certain scale you are only displaying interstates,
there is no reason to filter through all the local roads when you
could just pre-make a shapefile with well generalized interstates.
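
To illustrate the idea of pre-making a filtered, generalized layer,
here is a minimal sketch in plain Python.  It works on in-memory
dictionaries rather than real shapefiles, the road_class attribute is
hypothetical, and the generalization step is a simple Douglas-Peucker
simplification; in practice you would more likely do this once with a
GIS tool and save the result.

```python
import math


def _point_segment_distance(p, a, b):
    """Distance from point p to the segment a-b (all (x, y) tuples)."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    if dx == 0 and dy == 0:
        return math.hypot(px - ax, py - ay)
    # Project p onto the segment and clamp to its end points.
    t = ((px - ax) * dx + (py - ay) * dy) / (dx * dx + dy * dy)
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def simplify(line, tolerance):
    """Douglas-Peucker: drop vertices within tolerance of the chord."""
    if len(line) < 3:
        return list(line)
    index, dmax = 0, 0.0
    for i in range(1, len(line) - 1):
        d = _point_segment_distance(line[i], line[0], line[-1])
        if d > dmax:
            index, dmax = i, d
    if dmax <= tolerance:
        return [line[0], line[-1]]
    # Keep the farthest vertex and simplify both halves around it.
    left = simplify(line[: index + 1], tolerance)
    right = simplify(line[index:], tolerance)
    return left[:-1] + right


def interstates_only(roads, tolerance):
    """Keep only interstates (hypothetical road_class) and generalize them."""
    return [
        {
            "name": r["name"],
            "road_class": r["road_class"],
            "line": simplify(r["line"], tolerance),
        }
        for r in roads
        if r["road_class"] == "interstate"
    ]
```

The point is that this work happens once, ahead of time, so that at
draw time mapserver reads a small layer instead of filtering the full
road network on every request.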

There are folks like Ed, who has been replying to this thread, who are
serving up many terabytes of data.  My impression is that even at
that level (Ed, correct me if I'm wrong), most of the optimization is
done to make sure that Mapserver gets as close as possible to only the
data that it needs to display.

On 1/20/06, Biz King <biz.king at mac.com> wrote:
> Hi Ed.
>
> I did understand the difference between data sets and the additional
> data that is being added.
> Unfortunately, other than the shapefiles we have for our actual
> topological and feature maps, the data is simply a series of
> coordinates that require placing on the maps.
> There can be any number of coordinates, from one to several thousand.
> We have no control whatsoever about that data. We're reporting back
> on locations and events which may be changing every second.
> These coordinates are generated by the activity that the item(s)
> carry out. We are required to display that information back to the
> user(s).
>
> I have reduced the time to create individual images down to an
> average of around 2 seconds from 8 seconds originally by extended
> fiddling with the map file(s). Empty data is considerably quicker.
> It may therefore (taking into account everyone's comments) be both
> simpler and easier to simply use one machine per image, but to have a
> 'master' machine allocating the work to one machine from a pool of
> available machines and writing the images out to a central file system.
>
> Thanks for all your help and advice, I'm probably just as frustrated
> as you guys by being unable to detail the exact situation we are in!
>
> We're hoping our budget is going to expand and let us recruit a
> full-time mapserver specialist/consultant for this project in the next
> few weeks.
>
> Anyone UK based (Sheffield area?) please drop me a line off list if
> interested. Mac experience not required if experienced Unix/Linux
> user/developer.
>
>
> cheers
>
> Biz
>
> On 20 Jan 2006, at 12:33, Ed McNierney wrote:
>
> > Biz -
> >
> > We were all talking about optimization of the data sets, not the code.
> >
> > It is not clear that your "parallelism" approach described below will
> > help very much.  It is quite possible that it will make things worse.
> > There are quite a few high-performance MapServer implementations in
> > existence already; you seem to be under the impression that your
> > application is quite unusual, and that may not be the case.
> >
> > But if you really can't provide any more information about your
> > application, it's very difficult for us to provide any help.  Please
> > remember, however, that whether you are working on one server or
> > multiple servers, well-organized data will help you considerably.
> >
> >       - Ed
> >
> > Ed McNierney
> > President and Chief Mapmaker
> > TopoZone.com / Maps a la carte, Inc.
> > 73 Princeton Street, Suite 305
> > North Chelmsford, MA  01863
> > Phone: +1 (978) 251-4242
> > Fax: +1 (978) 251-1396
> > ed at topozone.com
> >
> > -----Original Message-----
> > From: UMN MapServer Users List
> > [mailto:MAPSERVER-USERS at LISTS.UMN.EDU] On Behalf Of Biz King
> > Sent: Friday, January 20, 2006 4:53 AM
> > To: MAPSERVER-USERS at LISTS.UMN.EDU
> > Subject: Re: [UMN_MAPSERVER-USERS] Cluster/Supercomputer/HPC variants
> > of Mapserver
> >
> > Whilst wholeheartedly agreeing that the code requires substantial
> > optimisation, I think I have arrived at a reasonably logical process
> > that allows me to create a limited parallelization of the serial
> > process of drawing maps.
> > I intend to segment (tile?) the extents of the parameters being
> > rendered by different machines into a series of proportional grids,
> > send these parallel requests to cluster/Xgrid members, get the results
> > back (probably as a stream) and then join the stream images back
> > together into one coherent image and save that to the central file
> > system which the web servers access.
> >
> > We're actually having to produce live feedback of item locations (they
> > can move, but can also remain static!) AND item activity (we're
> > subject to NDA, and cannot give enough detail to explain clearly,
> > which is irritating), which we then have to place into a graphical
> > format (map with colour-coded dots indicating location, activity and
> > status). Just to make things interesting, the extents of the data
> > being displayed change over time, so a series of pre-defined layer
> > images just won't work as there are no 'standard' coordinate extents
> > that we can work to.
> >
> > The client wants as near real-time feedback as is possible, which
> > compounds things even more!
> >
> > I'll report back with more information as soon as I can.
> >
> > cheers
> > Biz
> > On 19 Jan 2006, at 16:47, David Bitner wrote:
> >
> >> Without getting into any clustering, there is probably a lot of
> >> optimization that you could do to your datasets.  There are a number
> >> of posts in the archives for this list and documentation on
> >> mapserver.gis.umn.edu on doing things like creating overviews at
> >> different resolutions and tiling for rasters that could likely help
> >> speed up the process.
> >>
> >>
> >> On 1/19/06, Biz King <biz.king at mac.com> wrote:
> >>> Hi All.
> >>>
> >>> Is anyone aware of (or better still, has experience of)
> >>> running Mapserver via an MPI/Grid interface or as a cluster?
> >>>
> >>> We're trying to develop a high-performance mapserver that can cope
> >>> with the load we're going to be throwing at it!  Currently it takes
> >>> 298 seconds (on a Mac OS X Server, 3.5 GB RAM, dual 2 GHz processors)
> >>> to do what we need done in under 60 seconds!  There's not much we
> >>> can do to cut down the load as we're creating a whole series of
> >>> nodes on a layer via a database and we're then creating the imagery
> >>> based on these items and outputting them to graphics formats in
> >>> varying sizes.
> >>>
> >>> The results get fed to users on demand without the delays associated
> >>> with 'on the fly' image creation.
> >>>
> >>> Any help will be welcomed!
> >>>
> >>> cheers
> >>>
> >>> Biz
> >>>
>


