[postgis-users] speeding up simple pt-in-poly lookups

Tue Dec 20 22:01:38 PST 2011

Try this:

Create a table with chopped up polygons derived from your continents table.

Say you have continents (continent_id, name, the_geom)

then you have continents_chopped(chop_id, continent_id, continent_name, the_geom)

Spatial index on  continents_chopped.polygon

SELECT Count(distinct continent_id)
       FROM c, continents_chopped n
       WHERE ST_Intersects(c.the_geom, n.the_geom) AND
                n.continent = 'North America';

the BB index will very quickly identify a few chopped polygons which contain the point, then the much slower test against all the vertices in the continent polygons is carried out against a few chopped polygons with vastly fewer vertices, making for a much faster operation. 

This, from an email I sent a colleague a few weeks ago may be helpful:

As far as an easy way to generate a grid of specified size cells over a specified area, see:

http://trac.osgeo.org/postgis/wiki/UsersWikiCreateFishnet

Thus allowing arbitrary slicing/binning of polygons via a simple overlay operation - ST_Intersection()

This can be wrapped up in a shell script which generates the grid, as below: 

#! /bin/bash

# script to generate an arbitrary grid

# test only

# B Wood 12/11/2011

DB=test_grid

# create db

dropdb $DB

createdb $DB

psql -d $DB -qf /usr/share/postgresql/9.1/contrib/postgis-1.5/postgis.sql

psql -d $DB -qf /usr/share/postgresql/9.1/contrib/postgis-1.5/spatial_ref_sys.sql

# create function

psql -d $DB -c "CREATE OR REPLACE FUNCTION ST_CreateGrid(

        nrow integer, ncol integer,

        xsize float8, ysize float8,

        x0 float8 DEFAULT 0, y0 float8 DEFAULT 0,

       srid integer DEFAULT 4326)

    RETURNS SETOF geometry AS

\$\$

SELECT ST_Translate(cell, j * \$3 + \$5, i * \$4 + \$6)

FROM generate_series(0, \$1 - 1) AS i,

     generate_series(0, \$2 - 1) AS j,

(

SELECT setsrid(('POLYGON((0 0, 0 '||\$4||', '||\$3||' '||\$4||', '||\$3||' 0,0 0))')::geometry, \$7) AS cell

) AS foo;

\$\$ LANGUAGE sql IMMUTABLE STRICT;"

# create table

psql -d $DB -c "create table test_grid

                  ( id     serial  primary key);"

psql -d $DB -c "select 

                addgeometrycolumn('','test_grid','geom',4326,'POLYGON',2);"

# create lat/long grid

# set extent

W=150

E=190

S=-55
N=-30

SRID=4326

# set cell size in degrees

SIZE=0.05

# get cell counts for extent

NX=`echo "($E - $W) / $SIZE" | bc`

NY=`echo "($N - $S) / $SIZE" | bc | tr -d "-"`

# create grid

psql -d $DB -c "delete from test_grid;"

psql -d $DB -c "insert into test_grid (geom) 

                  select ST_CreateGrid($NY,$NX,$SIZE,$SIZE,$W.0,$S.0,$SRID);"

--- On Wed, 12/21/11, Puneet Kishor <punk.kish at gmail.com> wrote:

From: Puneet Kishor <punk.kish at gmail.com>
Subject: Re: [postgis-users] speeding up simple pt-in-poly lookups
To: "PostGIS Users Discussion" <postgis-users at postgis.refractions.net>
Date: Wednesday, December 21, 2011, 2:28 PM

On Dec 20, 2011, at 7:21 PM, Paul Ramsey wrote:

> Chop up the continents into smaller pieces.
> 

hmmm... I am not sure I understand the above. And then what? UNION each smaller piece query? 

> On Tue, Dec 20, 2011 at 4:35 PM, Puneet Kishor <punk.kish at gmail.com> wrote:
>> This is probably a really basic question... my ST_Within or ST_Intersects selecting points in a continent are way too slow (both take upward of 200 secs).
>> 
>>        SELECT Count(c_id)
>>        FROM c, continents n
>>        WHERE ST_Intersects(c.the_geom, n.the_geom) AND
>>                n.continent = 'North America';
>> 
>> 
>> Both tables have gist indexes on the geometries. The above query has the following plan
>> 
>> "Aggregate  (cost=9.66..9.67 rows=1 width=4)"
>> "  ->  Nested Loop  (cost=0.00..9.66 rows=1 width=4)"
>> "        Join Filter: _st_intersects(c.the_geom, n.the_geom)"
>> "        ->  Seq Scan on continents n  (cost=0.00..1.10 rows=1 width=32)"
>> "              Filter: ((continent)::text = 'North America'::text)"
>> "        ->  Index Scan using pbdb__collections_the_geom on collections c  (cost=0.00..8.30 rows=1 width=104)"
>> "              Index Cond: (c.the_geom && n.the_geom)"
>> 
>> The table c has approx 120K rows, and the continents table has 8 rows.Suggestions on how I can improve this? Yes, the computer is otherwise very swift and modern.
>> 
>> 
>> 
>> --
>> Puneet Kishor

_______________________________________________
postgis-users mailing list
postgis-users at postgis.refractions.net
http://postgis.refractions.net/mailman/listinfo/postgis-users
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.osgeo.org/pipermail/postgis-users/attachments/20111220/5330c9b8/attachment.html>