[mapserver-users] Re: PostGIS vs Shapefile [was: How to make MapServer WMS super fast?]
pcreso at pcreso.com
Sat Dec 13 11:16:13 PST 2008
Hi,
One quick comment: I haven't seen any suggestion of raster pyramids, or of pre-building a few zoom layers of the rasters, to improve mapserver performance. This is one area where performance can be significantly enhanced pretty easily.
And on to Maciej's question: I should perhaps expand somewhat on what I meant, all based on my experiences with mapserver, shapefiles & Postgis to date...
Breaking a large shapefile up into spatial tiles will improve performance when zooming in as many of the tiles are not read, but when zoomed out, you have many shapefiles to process instead of just one. The fix for this is to use zoom (scale based) layers, with fewer large tiles of reduced precision data when zoomed out & many small high precision tiles when zoomed in. So much like zoom layers for rasters, you can build zoom layers for shapefiles.
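A minimal sketch of why tiling helps when zoomed in but not when zoomed out, assuming a hypothetical square extent cut into a regular 10 x 10 grid of tiles (the extent, grid size and view rectangles are made up for illustration):

```python
def tiles_hit(view, extent, n):
    """Count the tiles of an n x n grid over `extent` that overlap `view`.

    Rectangles are (minx, miny, maxx, maxy) tuples.
    """
    minx, miny, maxx, maxy = extent
    tile_w = (maxx - minx) / n
    tile_h = (maxy - miny) / n
    hit = 0
    for i in range(n):
        for j in range(n):
            tx0 = minx + i * tile_w
            ty0 = miny + j * tile_h
            tx1, ty1 = tx0 + tile_w, ty0 + tile_h
            # Standard axis-aligned bounding box overlap test
            if tx0 < view[2] and tx1 > view[0] and ty0 < view[3] and ty1 > view[1]:
                hit += 1
    return hit


extent = (0.0, 0.0, 100.0, 100.0)                           # hypothetical dataset extent
zoomed_in = tiles_hit((12.0, 12.0, 18.0, 18.0), extent, 10)  # small view window
zoomed_out = tiles_hit(extent, extent, 10)                   # view covers everything
print(zoomed_in, zoomed_out)
```

The zoomed-in view touches a single tile, while the zoomed-out view has to open every tile, which is the case that scale based layers with fewer, coarser tiles are meant to address.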
I believe this sort of optimisation can be managed more easily in Postgis. Here are some examples I've used:
a lake "layer" comprising several mapserver layers as a zoom based group implemented on an underlying postgis table as something like:
select lakes from t_lake where lake_area > 1000   -- small scale (zoomed out)
select lakes from t_lake where lake_area > 100    -- medium scale
select lakes from t_lake where lake_area > 10     -- larger scale (zoomed in)
Thus a single table provides several zoom layers. If you only have a few lakes & zoom layers, this can be done just about as fast with shapefiles, but when you have hundreds of thousands of lakes, for example, the database can return a small subset (based on a well indexed query) to be rendered much faster than a shapefile can.
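The zoom layer idea can be sketched as a lookup from map scale to an area threshold. In a mapfile this choice would normally be made by MapServer itself, with one layer per query and per-layer scale limits (MINSCALEDENOM / MAXSCALEDENOM); the snippet below only illustrates the selection logic. The area thresholds are the ones from the queries above; the scale breakpoints are hypothetical.

```python
# (min_scale_denom, max_scale_denom, min_lake_area)
# The scale breakpoints are assumptions for illustration only.
ZOOM_LAYERS = [
    (5_000_000, None, 1000),      # small scale, zoomed out
    (500_000, 5_000_000, 100),    # medium scale
    (0, 500_000, 10),             # larger scale, zoomed in
]


def lake_query(scale_denom):
    """Return the lake query for the zoom layer covering this scale."""
    for lo, hi, min_area in ZOOM_LAYERS:
        if scale_denom >= lo and (hi is None or scale_denom < hi):
            return f"select lakes from t_lake where lake_area > {min_area}"
    raise ValueError("no zoom layer covers this scale")


print(lake_query(10_000_000))
print(lake_query(50_000))
```

Keeping the thresholds in one table like this mirrors the point of the example: one underlying dataset, several views of it chosen by scale.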
A bathymetry contour "layer" where the number & precision of the contours is modified depending on scale (to support a WFS service providing roughly constant data volumes irrespective of scale - still a work in progress), e.g.:
select simplify(the_geom, 10000) from bathy_contours where depth in
(select depth from contour_groups where scale=10000000)
select simplify(the_geom, 1000) from bathy_contours where depth in
(select depth from contour_groups where scale=1000000)
select simplify(the_geom, 100) from bathy_contours where depth in
(select depth from contour_groups where scale=100000)
select the_geom from bathy_contours
-- where scale < 50000
The contour_groups table allows me to define which contours are plotted at each scale, so perhaps (250, 500, 1000, 2000, 5000) when zoomed out, or (100, 250, 500, 750, 1000, 1500, 2000, 3000, 4000, 5000) as you zoom in, up to all available contours for large scale (zoomed in) maps. (I can even gain a few milliseconds by hard coding the depths in the SQL instead of using the subquery if necessary :-)
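A small runnable sketch of the contour_groups lookup, using an in-memory SQLite database as a stand-in for Postgis (so there is no geometry and no simplify() here, only the depth filter). The table and column names follow the queries above; which depth list belongs to which scale value is an assumption, as is the index name.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
    create table bathy_contours (id integer primary key, depth integer);
    create table contour_groups (scale integer, depth integer);
    create index bathy_depth_idx on bathy_contours (depth);
""")

all_depths = [100, 250, 500, 750, 1000, 1500, 2000, 3000, 4000, 5000]
con.executemany("insert into bathy_contours (depth) values (?)",
                [(d,) for d in all_depths])

# Assumed mapping of scale to plotted depths (illustrative only)
groups = {
    10_000_000: [250, 500, 1000, 2000, 5000],   # zoomed out
    1_000_000: all_depths,                      # zoomed in further
}
for scale, depths in groups.items():
    con.executemany("insert into contour_groups values (?, ?)",
                    [(scale, d) for d in depths])


def depths_for(scale):
    """Depths of the contours selected for the given scale."""
    rows = con.execute(
        "select depth from bathy_contours where depth in "
        "(select depth from contour_groups where scale = ?) order by depth",
        (scale,),
    )
    return [r[0] for r in rows]


print(depths_for(10_000_000))
```

Changing which contours appear at a scale then only means editing rows in contour_groups, not the layer queries.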
The Postgis server queried by mapserver is not the one running mapserver & holding the rasters in the application, but is a separate db server on a Gb LAN, thus distributing the load somewhat.
The underlying dataset was originally a 150 MB shapefile, which as a WFS XML file became about 2 GB of text, a bit excessive for WFS connections at 10 Mb/s. By using simplify() & reducing the number of contours, we are able to generate a reasonable map at any scale without significantly changing the volume of data transmitted.
The performance of the Postgis simplify() (Douglas-Peucker) function is exemplary, so we have found no need to pre-build the point reduced layers, but can reduce the points dynamically as needed from a single high res dataset in a single table. The index on depth provides a high performance filter on which contour linestrings are processed. The spatial index quickly limits the contour segments to just those needed for the requested extent.
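For readers unfamiliar with the algorithm, here is a minimal pure-Python sketch of Douglas-Peucker point reduction. It is an illustration of the general technique only, not the Postgis implementation; the function names and the sample line are made up.

```python
import math


def point_segment_distance(p, a, b):
    """Distance from point p to the segment a-b (all 2D tuples)."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    if dx == 0 and dy == 0:
        return math.hypot(px - ax, py - ay)
    # Project p onto the segment and clamp to its end points
    t = ((px - ax) * dx + (py - ay) * dy) / (dx * dx + dy * dy)
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def douglas_peucker(points, tolerance):
    """Drop vertices that lie within `tolerance` of the simplified line."""
    if len(points) < 3:
        return list(points)
    first, last = points[0], points[-1]
    index, dmax = 0, 0.0
    for i in range(1, len(points) - 1):
        d = point_segment_distance(points[i], first, last)
        if d > dmax:
            index, dmax = i, d
    if dmax <= tolerance:
        return [first, last]
    # Keep the farthest vertex and simplify both halves recursively
    left = douglas_peucker(points[: index + 1], tolerance)
    right = douglas_peucker(points[index:], tolerance)
    return left[:-1] + right


line = [(0, 0), (1, 0.1), (2, 0), (3, 5), (4, 0)]
print(douglas_peucker(line, 1.0))
```

A larger tolerance removes more vertices, which is why the queries above pass a larger tolerance to simplify() at smaller scales.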
Using database functionality to manage data volumes like this is much easier than with shapefiles, and I have not (yet) had performance issues due to Postgis, or at least none that couldn't be fixed pretty easily with some database optimisation, such as clustered indexes, partitioned tables, custom fields with indexed flag values, etc. The resulting application (mapfile) was much simpler & I only have one database to manage, instead of a plethora of versions of shapefiles, which doesn't improve the application performance, but does improve mine :-)
Cheers,
Brent Wood
--- On Sun, 12/14/08, Maciej Sieczka <msieczka at sieczka.org> wrote:
> From: Maciej Sieczka <msieczka at sieczka.org>
> Subject: PostGIS vs Shapefile [was: How to make MapServer WMS super fast?]
> To: pcreso at pcreso.com
> Cc: Jukka.Rahkonen at mmmtike.fi, mapserver-users at lists.osgeo.org
> Date: Sunday, December 14, 2008, 2:31 AM
> pcreso at pcreso.com wrote:
> > Shapefiles seem to be faster when you want to plot most of or the
> > whole dataset, but if you want to select attribute or spatial based
> > subsets, they can be significantly faster on a well indexed Postgis
> > table.
>
> Does this still hold true when appropriate tiling is done for
> Shapefiles, e.g. as described on [1]?
>
> [1]http://www.nabble.com/Re:-question-about-shp2tile-p4461869.html
>
> Maciek
>
> -- Maciej Sieczka
> www.sieczka.org