[postgis-users] Tricks to find polygon/line intersection faster

Evan Martin postgresql at realityexists.net
Sat Jul 6 10:10:24 PDT 2013


It's not really "many large things vs many large things". Most lines are 
< 100 km long (but there are some over 1000 km). Here's a percentile 
chart: https://imageshack.us/a/img16/940/w5s.png

Most of the polygons are also quite small and simple, but there are a 
few really large complex ones. From my testing it looks like a few of 
the "worst" polygons (multi-polygons, actually) take all the time, so 
that 25,000 count was a bit misleading. 96% of them have < 100 points, 
but the worst one has > 23,000. I couldn't get the area, because 
ST_Area(geog) is returning some ridiculously high numbers, but it would 
be millions of sq km.

On 06.07.2013 5:48, Paul Ramsey wrote:
> Without seeing your data it's quite hard to say. Many large things vs
> many large things yields a problem where indexes and so on don't have
> a lot of leverage on the problem.
>
> On Tue, Jul 2, 2013 at 6:39 AM, Evan Martin
> <postgresql at realityexists.net> wrote:
>> Hi,
>>
>> I have tables of ~25,000 polygons and ~80,000 lines and I want to find which
>> lines intersect which polygons using PostGIS 2.1. Both are geographies and
>> can span the dateline. Doing this the simple way using ST_Intersects(geog,
>> geog) takes about 3 hours on my machine and I'd to see if there's a way to
>> speed this up.
>>
>> I already have indexes on the geography columns and one of them is being
>> used (the one on the lines). Each line only has 2 points, but the polygons
>> have anywhere from 4 to 20,000 points and some of them are very large. It
>> would be OK to miss some of the smaller intersections (ie. where the two
>> only just barely intersect), but I wouldn't want the query to return false
>> positives. In fact, ideally, I'd like to find only the lines that
>> "substantially" intersect a polygon, eg. at least x km or x% of the line is
>> in the polygon, but finding any intersections at all would be a start.
>>
>> One trick I tried is ST_SimplifyPreserveTopology. I used that to create
>> simplified version of the polygons (at least those that don't span the
>> dateline) and check those first, then if they intersect then check the real
>> polygons. This seems to work, but the performance gains are marginal
>> compared to the simple approach.
>>
>> Is there another trick I can use to do this faster? I know ST_Intersects()
>> internally calls ST_Distance(), which calculates the distance to a fraction
>> of a metre. I don't need that kind of precision, so surely there's some
>> "shorcut" to be found?
>>
>> Thanks,
>>
>> Evan
>>
>> _______________________________________________
>> postgis-users mailing list
>> postgis-users at lists.osgeo.org
>> http://lists.osgeo.org/cgi-bin/mailman/listinfo/postgis-users
>>
> _______________________________________________
> postgis-users mailing list
> postgis-users at lists.osgeo.org
> http://lists.osgeo.org/cgi-bin/mailman/listinfo/postgis-users



More information about the postgis-users mailing list