[postgis-users] speeding up simple pt-in-poly lookups

Nicklas Avén nicklas.aven at jordogskog.no
Tue Dec 20 23:39:38 PST 2011


Martin, Stephen

In intersect testing the winding number algorithm is used. Stabbing line
algorithm is used in distance calculations in PostGIS.


Puneet
Is it geometry type you are using or is it geography. I have no idea
about the algorithms in geography functions.

It sounds very much with 200 s, your continent polygons must be very
detailed with many vertex points.

What do you get from 
SELECT ST_NPoints(the_geom) from continents;
?


/Nicklas

On Tue, 2011-12-20 at 23:15 -0800, Martin Davis wrote:
> Yes, this so-called stabbing line algorithm for Point-In-Polygon is 
> already implemented in the GEOS PreparedGeometry code used by PostGIS 
> (which I assume is being invoked in this query - Paul, can you 
> confirm?)  In fact, the PreparedGeometry algorithm goes one better and 
> uses a 1-D spatial index on the edges in order to quickly discard edges 
> which cannot intersect the stabbing line (the standard algorithm still 
> requires an O(n) scan of all the edges)
> 
> However, I think I'm correct in saying that even though the 
> PreparedGeometry is cached in memory it is still necessary to detoast 
> the large continent geometry from disk into memory for each point which 
> is being queried, in order to confirm the cache hit. This is due to the 
> way the Postgres query engine works.  The detoasting dominates the time, 
> even though the actual Point-In-Polygon test is pretty efficient.  (Paul 
> can perhaps confirm this too).
> 
> It would be nice to find some more efficient strategy for testing for a 
> positive cache hit....
> 
> On 12/20/2011 9:54 PM, Stephen Woodbridge wrote:
> > Hi Martin,
> >
> > There is a very fast algorithm for testing point in polygon by 
> > creating a vector from the point to another point beyond the extents 
> > of the polygon and counting the number of edge crossings. If the count 
> > is odd then the point is in the polygon and if the count is even it is 
> > not. If the vector is horizontal you can very rapidly reject all edges 
> > above or below the vector and all edges east or west of the vector. 
> > Then a simple intersection test on each of the remain edges will 
> > determine the count.
> >
> > Given your extensive knowledge in computational geometry, I assume you 
> > already are aware of this algorithm and it is not suitable for some 
> > reason. But since this question on performance comes up pretty often, 
> > it might have a place as long as people know what the potential 
> > limitations are.
> >
> > -Steve
> >
> > On 12/20/2011 10:48 PM, Martin Davis wrote:
> >> For more detail check out this thread on the same issue:
> >>
> >> http://postgis.refractions.net/pipermail/postgis-users/2011-November/031345.html 
> >>
> >>
> >>
> >> On 12/20/2011 5:28 PM, Puneet Kishor wrote:
> >>> On Dec 20, 2011, at 7:21 PM, Paul Ramsey wrote:
> >>>
> >>>> Chop up the continents into smaller pieces.
> >>>>
> >>>
> >>> hmmm... I am not sure I understand the above. And then what? UNION
> >>> each smaller piece query?
> >>>
> >>>
> >>>> On Tue, Dec 20, 2011 at 4:35 PM, Puneet Kishor<punk.kish at gmail.com>
> >>>> wrote:
> >>>>> This is probably a really basic question... my ST_Within or
> >>>>> ST_Intersects selecting points in a continent are way too slow (both
> >>>>> take upward of 200 secs).
> >>>>>
> >>>>> SELECT Count(c_id)
> >>>>> FROM c, continents n
> >>>>> WHERE ST_Intersects(c.the_geom, n.the_geom) AND
> >>>>> n.continent = 'North America';
> >>>>>
> >>>>>
> >>>>> Both tables have gist indexes on the geometries. The above query has
> >>>>> the following plan
> >>>>>
> >>>>> "Aggregate (cost=9.66..9.67 rows=1 width=4)"
> >>>>> " -> Nested Loop (cost=0.00..9.66 rows=1 width=4)"
> >>>>> " Join Filter: _st_intersects(c.the_geom, n.the_geom)"
> >>>>> " -> Seq Scan on continents n (cost=0.00..1.10 rows=1 width=32)"
> >>>>> " Filter: ((continent)::text = 'North America'::text)"
> >>>>> " -> Index Scan using pbdb__collections_the_geom on collections c
> >>>>> (cost=0.00..8.30 rows=1 width=104)"
> >>>>> " Index Cond: (c.the_geom&& n.the_geom)"
> >>>>>
> >>>>> The table c has approx 120K rows, and the continents table has 8
> >>>>> rows.Suggestions on how I can improve this? Yes, the computer is
> >>>>> otherwise very swift and modern.
> >>>>>
> >>>>>
> >>>>>
> >>>>> -- 
> >>>>> Puneet Kishor
> >>> _______________________________________________
> >>> postgis-users mailing list
> >>> postgis-users at postgis.refractions.net
> >>> http://postgis.refractions.net/mailman/listinfo/postgis-users
> >>>
> >>>
> >>> -----
> >>> No virus found in this message.
> >>> Checked by AVG - www.avg.com
> >>> Version: 2012.0.1890 / Virus Database: 2109/4692 - Release Date: 
> >>> 12/20/11
> >>>
> >>>
> >> _______________________________________________
> >> postgis-users mailing list
> >> postgis-users at postgis.refractions.net
> >> http://postgis.refractions.net/mailman/listinfo/postgis-users
> >
> > _______________________________________________
> > postgis-users mailing list
> > postgis-users at postgis.refractions.net
> > http://postgis.refractions.net/mailman/listinfo/postgis-users
> >
> >
> > -----
> > No virus found in this message.
> > Checked by AVG - www.avg.com
> > Version: 2012.0.1890 / Virus Database: 2109/4692 - Release Date: 12/20/11
> >
> >
> _______________________________________________
> postgis-users mailing list
> postgis-users at postgis.refractions.net
> http://postgis.refractions.net/mailman/listinfo/postgis-users
> 





More information about the postgis-users mailing list