[postgis-users] using "contains" in postgis
Bill Binko
bill at binko.net
Wed Feb 22 13:46:52 PST 2006
John Taber wrote:
> I posted a question the other day regarding using "contains" with a
> join - no one addressed it so I'm wondering if it is a good way to go
> or should I be using another approach. Basically I'm just trying to
> join an object with another object that is contained in it so
> attributes can be shared. Thks for any help.
>
Well, you certainly _can_ do what you're suggesting. The question is
_why_... if you can provide some insight into that, we can probably find
better solutions.
For example, let's say you want to show all of the information about a
parcel. However, some of that information is maintained at the
census_tract level, and others are at the state level.
You CAN do this like this:
select parcels.*, states.state_level_data, tracts.tract_level_data from
parcels, tracts, states where
tracts.tract_shape && parcels.parcel_shape AND
contains(tracts.tract_shape, parcels.parcel_shape) AND
state.state_shape && parcels.parcel_shape AND
contains(state.state_shape, parcels.parcel_shape) and parcel.key =
'someKey';
However, it would almost always be a horrificly slow join. (Note that
you must include the operators AND the contains() as only the operators
will use the GiST indexes -- and no, I don't know if other operators are
more effective -- anyone?)
You are much better off in this case (where you know the heirarchy of
your data in advance) to either:
1) Denormalize your data
or
2) Create non-spatial foreign keys.
So, you'd either want to do something like this:
ALTER TABLE parcles add column state_level_data;
UPDATE parcels set state_level_data = (SELECT state_level_data from
states where state.state_shape && parcel_shape AND
contains(state.state_shape, parcel_shape LIMIT 1) ; -- (assume you
will take the first if it's in multiple states... unlikey)
and something similar for tracts... then you'd have the attributes on
your parcels, and your queries would be faster since there would be no
joins at all.
OR
ALTER TABLE parcels ADD COLUMN state_id;
UPDATE parcels set state_id = (SELECT state_id from states where
state.state_shape && parcel_shape AND
contains(state.state_shape, parcel_shape LIMIT 1) ;
-- (should proabably create foreign key constraint here, index, etc.)
Now, you stay normalized, but all of your joins are on non-spatial
fields. You just have to update those links whenever your shapes move.
If you have dynamic data, these approaches get more "interesting", but
are still doable. Postgresql has excellent rule and trigger mechanism
helping to keep this sane.
If your needs are more dynamic, or if there isn't such a strong
well-known heirarchy as this example, post the details (at the domain
level, not as a proposed solution) and we'll help you find one that works.
Bill
More information about the postgis-users
mailing list