[postgis-users] using "contains" in postgis

Bill Binko bill at binko.net
Wed Feb 22 13:46:52 PST 2006


John Taber wrote:

> I posted a question the other day regarding using "contains" with a 
> join - no one addressed it so I'm wondering if it is a good way to go 
> or should I be using another approach.  Basically I'm just trying to 
> join an object with another object that is contained in it so 
> attributes can be shared.  Thks for any help.
>
Well, you certainly _can_ do what you're suggesting.  The question is 
_why_... if you can provide some insight into that, we can probably find 
better solutions.

For example, let's say you want to show all of the information about a 
parcel.  However, some of that information is maintained at the 
census_tract level, and others are at the state level.

You CAN do this like this:

select parcels.*, states.state_level_data, tracts.tract_level_data  from 
parcels, tracts, states where
    tracts.tract_shape && parcels.parcel_shape AND
    contains(tracts.tract_shape, parcels.parcel_shape) AND
    state.state_shape && parcels.parcel_shape AND
    contains(state.state_shape, parcels.parcel_shape) and parcel.key = 
'someKey';

However, it would almost always be a horrificly slow join.  (Note that 
you must include the operators AND the contains() as only the operators 
will use the GiST indexes -- and no, I don't know if other operators are 
more effective -- anyone?)

You are much better off in this case (where you know the heirarchy of 
your data in advance) to either:
1) Denormalize your data
or
2) Create non-spatial foreign keys.

So, you'd either want to do something like this:
ALTER TABLE parcles add column state_level_data;
UPDATE parcels set state_level_data = (SELECT state_level_data from 
states where state.state_shape && parcel_shape AND
    contains(state.state_shape, parcel_shape LIMIT 1) ;  -- (assume you 
will take the first if it's in multiple states... unlikey)

and something similar for tracts... then you'd have the attributes on 
your parcels, and your queries would be faster since there would be no 
joins at all.

OR

ALTER TABLE parcels ADD COLUMN state_id;
UPDATE parcels set state_id = (SELECT state_id from states where 
state.state_shape && parcel_shape AND
    contains(state.state_shape, parcel_shape LIMIT 1) ; 
-- (should proabably create foreign key constraint here, index, etc.)

Now, you stay normalized, but all of your joins are on non-spatial 
fields.  You just have to update those links whenever your shapes move.

If you have dynamic data, these approaches get more "interesting", but 
are still doable.  Postgresql has excellent rule and trigger mechanism 
helping to keep this sane.

If your needs are more dynamic, or if there isn't such a strong 
well-known heirarchy as this example, post the details (at the domain 
level, not as a proposed solution) and we'll help you find one that works.

Bill










More information about the postgis-users mailing list