[Mapserver-users] PostGIS / Shapefile Performance Question

Paul Ramsey pramsey at refractions.net
Wed Jan 15 13:12:14 EST 2003


David,

Early in the development of the PostGIS / Mapserver connector we did 
some benchmarking of PostGIS against Shape files.

Shape files will be faster than PostGIS for simple map drawing 
applications in almost every case.  The rendering step is going to be 
the same regardless of data source.  That leaves data access, and an 
indexed shapefile will always have slightly lower overhead than an 
indexed spatial table for a simple spatial bounding rectangle query.  We 
found that the speed difference was lowest when the number of features 
was smallest. I.e., for drawing a map with only 3 features, selected out 
of a table of 300000, the PostGIS layer took less than 10% longer (on a 
scale measured in 1/100s of a second, mind you :).  For drawing maps 
with more (several thousand) features, the PostGIS overhead got as high 
as 20-30%.

(Note that all the statements above assume you have built an index on 
your shape files.  It is interesting to note that ESRI has never put out 
a means of spatially indexing shape files, and as a result there is a 
kind of collective brain melt in our field which says "shape files are 
'too slow' for web mapping". This is true with ArcIMS (no spatial index) 
but not true with Mapserver.)
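
To make the "simple spatial bounding rectangle query" concrete, here is a 
toy quadtree in Python. It is only a sketch of the general idea and not 
the format or algorithm Mapserver actually uses for its shape file index. 
The class name, the depth limit and the (minx, miny, maxx, maxy) box 
layout are all assumptions made for the illustration.

```python
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (minx, miny, maxx, maxy), assumed layout


def intersects(a: Box, b: Box) -> bool:
    """True when the two rectangles overlap or touch."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


def contains(outer: Box, inner: Box) -> bool:
    """True when inner lies entirely within outer."""
    return (outer[0] <= inner[0] and outer[1] <= inner[1]
            and outer[2] >= inner[2] and outer[3] >= inner[3])


class QuadTree:
    """Toy quadtree: each feature is stored at the deepest node that fully contains it."""

    def __init__(self, bounds: Box, depth: int = 0, max_depth: int = 6):
        self.bounds = bounds
        self.depth = depth
        self.max_depth = max_depth
        self.items: List[Tuple[int, Box]] = []  # (feature id, bounding box)
        self.children = None

    def _split(self) -> None:
        minx, miny, maxx, maxy = self.bounds
        cx, cy = (minx + maxx) / 2, (miny + maxy) / 2
        quads = [(minx, miny, cx, cy), (cx, miny, maxx, cy),
                 (minx, cy, cx, maxy), (cx, cy, maxx, maxy)]
        self.children = [QuadTree(q, self.depth + 1, self.max_depth) for q in quads]

    def insert(self, fid: int, box: Box) -> None:
        # Features are assumed to fall inside the root bounds.
        if self.depth < self.max_depth:
            if self.children is None:
                self._split()
            for child in self.children:
                if contains(child.bounds, box):
                    child.insert(fid, box)
                    return
        # No single child contains the box (or we hit max depth): keep it here.
        self.items.append((fid, box))

    def query(self, window: Box) -> List[int]:
        """Return ids of features whose boxes intersect the window."""
        if not intersects(self.bounds, window):
            return []  # the whole subtree is skipped, which is the point of the index
        hits = [fid for fid, box in self.items if intersects(box, window)]
        for child in self.children or []:
            hits.extend(child.query(window))
        return hits
```

A map draw then asks the tree for the features in the current extent, for 
example tree.query((10.0, 10.0, 12.0, 12.0)), and only those records are 
read from disk. Without an index every record has to be tested.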

So why use PostGIS at all?
Several reasons:
- large shape file archives can be hard to manage if the data changes 
regularly
- if you have an interactive site which allows online updates then 
concurrent shapefile writing could cause data corruption as well as 
indexes going out of sync with the underlying data
- you can do complex multi-table queries much faster with PostGIS than with 
shape files
- you can do attribute-based queries much faster with PostGIS than with 
shape files (because shape files lack an index on the attributes)
- you can use your PostGIS/PostgreSQL system as a full corporate data 
repository, storing your business attributes and spatial objects in the 
same data schema, managing the different aspects of the data with many 
different tools, using standard access methods like JDBC and ODBC
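
As a rough illustration of the attribute-index and multi-table points, 
here is a small Python sketch. It uses the standard library sqlite3 
module purely as a stand-in so that it runs without a database server; it 
is not PostGIS, and the geometry column is left out. The table names, 
column names and index name are invented for the example.

```python
import sqlite3


def build_demo_db() -> sqlite3.Connection:
    """Create an in-memory database with two hypothetical tables."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE owners  (owner_id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE parcels (parcel_id INTEGER PRIMARY KEY,
                              owner_id INTEGER,
                              zoning TEXT);
        -- The attribute index that a bare shape file has no equivalent of.
        CREATE INDEX idx_parcels_zoning ON parcels (zoning);
    """)
    conn.executemany("INSERT INTO owners VALUES (?, ?)",
                     [(1, "Acme"), (2, "Bravo")])
    # Every tenth parcel is zoned C2, the rest R1 (made-up data).
    conn.executemany(
        "INSERT INTO parcels VALUES (?, ?, ?)",
        [(i, 1 + i % 2, "R1" if i % 10 else "C2") for i in range(1, 101)],
    )
    return conn


def parcels_by_zoning(conn: sqlite3.Connection, zoning: str):
    """Attribute filter plus a join against the business table."""
    cursor = conn.execute(
        """
        SELECT p.parcel_id, o.name
        FROM parcels AS p
        JOIN owners AS o ON o.owner_id = p.owner_id
        WHERE p.zoning = ?
        ORDER BY p.parcel_id
        """,
        (zoning,),
    )
    return cursor.fetchall()
```

With shape files the same question means scanning every record of the 
attribute file and doing the join in application code.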

Lowther, David W wrote:

> Are there certain situations in which the access to PostGIS might be quicker
> than a shapefile, say when zoomed in closely or zoomed way out or when doing
> a point based query?

No. In the very-zoomed-in case the performance will be almost identical, 
but never faster.

> Is there a point where the number of features in a layer would cause PostGIS
> or shapefiles to perform better?

The PostGIS r-tree index might end up more balanced than the shape file 
quadtree for certain kinds of spatial data.  At larger archive sizes it 
is possible that this might result in a noticeable performance win.  I 
cannot give a concrete example however.

> What if I put a monster of a machine in place as the postgres server? Could
> I build a postgres server that would be as fast as shapefiles local to
> mapserver?

Well, if you give your postgis database more oomph to read through the 
data, you might make it faster than your poor little mapserver 
read-and-render machine, but it hardly seems fair to make the 
comparison. If you are buying a monster machine you could just run the 
read-and-render mapserver on it, and your shape files would still be faster.

> What happens as the application scales? If I saw traffic like mapquest.com
> or something would shapefiles be faster than PostGIS?

Properly laid out shape files, with a tiling system and spatial indexes, 
should always be faster. If your data changes regularly such a layout 
might not be manageable however, or might be most easily managed with a 
hybrid system (store working data in PostGIS and snap out a copy for 
mapserver to read-and-render from on a nightly basis).
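
One way the nightly "snap out a copy" step could be scripted is sketched 
below in Python, driving the pgsql2shp dumper that ships with PostGIS and 
Mapserver's shptree indexer. The database name, table list, output 
directory and function names are hypothetical, and the sketch assumes both 
programs are on the PATH and that the job is started by something like 
cron.

```python
import subprocess
from typing import Callable, List, Optional


def pgsql2shp_command(database: str, table: str, out_base: str,
                      host: str = "localhost",
                      user: Optional[str] = None) -> List[str]:
    """Build the pgsql2shp call that dumps one table to out_base.shp/.shx/.dbf."""
    cmd = ["pgsql2shp", "-f", out_base, "-h", host]
    if user:
        cmd += ["-u", user]
    return cmd + [database, table]


def shptree_command(shp_path: str) -> List[str]:
    """Build the shptree call that writes the spatial index for a shape file."""
    return ["shptree", shp_path]


def nightly_snapshot(database: str, tables: List[str], out_dir: str,
                     run: Callable = subprocess.run) -> None:
    """Dump each table to a shape file, then rebuild its spatial index.

    `run` is injectable so the sequence can be checked without running anything.
    """
    for table in tables:
        base = f"{out_dir}/{table}"
        run(pgsql2shp_command(database, table, base), check=True)
        run(shptree_command(base + ".shp"), check=True)
```

A call such as nightly_snapshot("gisdb", ["roads", "parcels"], 
"/data/maps") would refresh both layers. Writing to a scratch directory 
and renaming into place afterwards would avoid mapserver reading a 
half-written file, but that is left out of the sketch.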

> Sorry if this seems irrelevant or silly line of questions. I just have a
> conflict between the convenience / queryability of PostGIS and the speed of
> shapefiles.

It is not irrelevant at all. Understanding the tradeoffs (and there 
*are* tradeoffs for both options) is the core of good systems design. I 
hope I have provided some useful information.

Paul

-- 
       __
      /
      | Paul Ramsey
      | Refractions Research
      | Email: pramsey at refractions.net
      | Phone: (250) 885-0632
      \_



