[postgis-users] Points-in-polygon performance

Brent Wood pcreso at pcreso.com
Fri Dec 21 13:39:42 PST 2007


--- Robin Chauhan <robin.chauhan at gmail.com> wrote:

Hi Robin,

I've had similar issues, which I've used your (a) option to help with, but also
a (d) for you to consider, more a data warehousing approach.... 

I'm assuming you have spatial indices on all your point and polygon columns?


If your data is reasonably static, you can pre-calculate values & query on
those instead of the intersects option. eg, a boolean column called intersect
which contains the value of "intersects(geom_fld, rpg_geom) and
rpg_name='GVRD_OLD'".

If your data is more dynamic, an insert trigger could determine & set this
value for you when records are inserted. In your query, the where clause
becomes "...and intersect = true..." rather than the spatial join. This is
probably the fastest way, but if your data changes much, it runs the risk of
the insert column values becoming out of sync with the actual data, & therefore
returning wrong answers. How likely this is depends on your data, & how often
it changes. Generally spatial data is static (fleet monitoring applications
excepted :-)

Similar to the boolean approach above, which works well for a single polygon,
I've also used pre-calculated columns for point features, which include the
ID's (or names if they are unique) of polygons they lie within, in a specified
layer. 

Thus the point table has a column for each polygon table with which it is
frequently spatially joined. (note, this simple approach only works if the
polygons are non-overlapping), eg, simplistically & from memory;

Take a table of points called point & 2 tables of polygons called stratum_2006
& stratum_2007 which are frequently spatially joined to the point table, much
as in your example.

Add s_2006 & s_2007 columns to the point table. Run something like this:
update point p 
set s_2006=(select name from stratum_2006 s 
            where intersects(p.geom, s.geom));

(& make sure you index the s_2006 & s_2007 columns)

then repeat for 2007 strata. The points table can now be queried to return
points in a particular polygon (stratum) without any join to the stratum table
needed, just: "select * from points where s_2007='GVRD_OLD'....;"


HTH,

  Brent Wood



> I have a region_polygons table where region_polygons.rpg_geom is a
> MULTIPOLYGON.
> I want to select rows in the geom_table where geom_table.geom_fld is within
> region_polygons.rpg_geom
> 
> The following idiom works for me, but is waaaay slow (whereas I found other
> kinds of spatial queries are pretty zippy):
> 
> SELECT * from geom_table where EXISTS
>                 (select rpg_id from region_polygons where
>                 geom_fld && rpg_geom and intersects(geom_fld, rpg_geom) and
> rpg_name='GVRD_OLD')
> 
> Where should I start?
> 
> a) Simplify the multipolygon?
> b) Joins instead of EXISTS?
> c) Would putting the literal WK format for the polygon right into the SQL
> query help?
> d) ... or something else?




More information about the postgis-users mailing list