[postgis-users] DB design considerations : geoAttributes vs g eometry

Obe, Regina DND\MIS robe.dnd at cityofboston.gov
Tue Apr 12 07:02:24 PDT 2005


>From my understanding and experience - geometry indexes are much slower than
attribute indexes.  It really depends on how large your dataset is, how
powerful your server is, and how many hits you get as to whether this is
worthwhile to do.

My basic approach is to don't add the redundant attributes and see how slow
it is.  If you get unacceptable speed then add the more commonly used
redundant attributes and create an index against these redundant attributes.



-----Original Message-----
From: Moisan Yves [mailto:ymoisan at groupesm.com] 
Sent: Tuesday, April 12, 2005 9:08 AM
To: postgis-users at postgis.refractions.net
Subject: [postgis-users] DB design considerations : geoAttributes vs
geometry


Hi All,

Maybe a bit OT, but I have this basic DB design question.  I have a bunch of
data sources (sensors, people ...) that will dump data in postGIS.  I am
wondering what the best trade-off is between redundant attribute data on
which I could build indexes and just using the_geom to build an index.

Example :

Station 1 : the_geom; [geoInfoColumns] (redundant attribute data);
dataColumns 

Where 

- geoInfoColumns = string {country; province; adminRegion; city ...} 

- dataColumns = the real data

The_geom and dataColumns are hard data.  The question is about optional
geoInfoColumns that I put in square brackets.  When I upload a new station
location, is it worth adding geoInfoColmuns *to use in filters* or can I
rely on the_geom only ?  

The point is that I can make an arbitrary number of such optional
sub-regions available that I could index on, e.g. "select * from table where
sub-sub-sub-region = 'aRegion'", even though those optional columns really
represent an over-determination in the sense that they are not needed to
locate the station : they would be used to speed up access.

I am not asking to what level the over-determination can go.  That I can
play with.  I am just asking if making the DB much bigger by adding
redundant information in the form of geoInfoColumns attributes is worth it
in terms of access time.  I would rather write utilities to translate
queries filtered by sub-region so that they map on the_geom (e.g. have some
table or function or stroed proc that returns a polygon when I feed it the
region name) rather than add an arbitrary number of attributes (plus I would
avoid being tied to geographical names that are very likely to change with
time, e.g. a political riding or some other administrative region!).

Thanx for any (basic!) pointers.

Yves Moisan


_______________________________________________
postgis-users mailing list postgis-users at postgis.refractions.net
http://postgis.refractions.net/mailman/listinfo/postgis-users



More information about the postgis-users mailing list