[postgis-users] Tricks to find polygon/line intersection faster

Evan Martin postgresql at realityexists.net
Mon Jul 8 08:36:00 PDT 2013


Thanks, Steve, that seems to do the trick. Of course the results change 
a bit, so it's a trade-off of accuracy vs. speed. I presume the change 
is because I do the tiling on the plane - ST_Intersection(geom, geom). 
When I tried doing tiling on geography the results changed much more 
(compared to no tiling). Would be interesting to understand why that is. 
Am I doing something wrong? I create a grid of 1x1 degree polygons and 
then do this:

     SELECT poly_id, ST_Intersection(poly_border::geometry, 
tile)::geography AS poly_tile
     FROM my_polygon p
     JOIN world_tile_1 t ON ST_Intersects(p.border::geometry, t.tile)

The intersection with lines is then done on geography, as before. I only 
do this for polygons that don't span the dateline (which is 99% of them, 
luckily).

Evan

On 06.07.2013 21:19, Stephen Woodbridge wrote:
> The standard way of dealing this this is to chop you really large 
> polygons into tiles. Or if the multipolygons can be split into 
> multiple individual polygons you might get better performance.
>
> google: postgis tiling large polygons
>
> if you need the distance that the line intersects the multiple tiles 
> or multiple split multipolygons you will need to sum() and group on 
> the original id of the split object.
>
> -Steve
>
> On 7/6/2013 1:10 PM, Evan Martin wrote:
>> It's not really "many large things vs many large things". Most lines are
>> < 100 km long (but there are some over 1000 km). Here's a percentile
>> chart: https://imageshack.us/a/img16/940/w5s.png
>>
>> Most of the polygons are also quite small and simple, but there are a
>> few really large complex ones. From my testing it looks like a few of
>> the "worst" polygons (multi-polygons, actually) take all the time, so
>> that 25,000 count was a bit misleading. 96% of them have < 100 points,
>> but the worst one has > 23,000. I couldn't get the area, because
>> ST_Area(geog) is returning some ridiculously high numbers, but it would
>> be millions of sq km.
>>
>> On 06.07.2013 5:48, Paul Ramsey wrote:
>>> Without seeing your data it's quite hard to say. Many large things vs
>>> many large things yields a problem where indexes and so on don't have
>>> a lot of leverage on the problem.
>>>
>>> On Tue, Jul 2, 2013 at 6:39 AM, Evan Martin
>>> <postgresql at realityexists.net> wrote:
>>>> Hi,
>>>>
>>>> I have tables of ~25,000 polygons and ~80,000 lines and I want to
>>>> find which
>>>> lines intersect which polygons using PostGIS 2.1. Both are
>>>> geographies and
>>>> can span the dateline. Doing this the simple way using
>>>> ST_Intersects(geog,
>>>> geog) takes about 3 hours on my machine and I'd to see if there's a
>>>> way to
>>>> speed this up.
>>>>
>>>> I already have indexes on the geography columns and one of them is 
>>>> being
>>>> used (the one on the lines). Each line only has 2 points, but the
>>>> polygons
>>>> have anywhere from 4 to 20,000 points and some of them are very
>>>> large. It
>>>> would be OK to miss some of the smaller intersections (ie. where 
>>>> the two
>>>> only just barely intersect), but I wouldn't want the query to return
>>>> false
>>>> positives. In fact, ideally, I'd like to find only the lines that
>>>> "substantially" intersect a polygon, eg. at least x km or x% of the
>>>> line is
>>>> in the polygon, but finding any intersections at all would be a start.
>>>>
>>>> One trick I tried is ST_SimplifyPreserveTopology. I used that to 
>>>> create
>>>> simplified version of the polygons (at least those that don't span the
>>>> dateline) and check those first, then if they intersect then check
>>>> the real
>>>> polygons. This seems to work, but the performance gains are marginal
>>>> compared to the simple approach.
>>>>
>>>> Is there another trick I can use to do this faster? I know
>>>> ST_Intersects()
>>>> internally calls ST_Distance(), which calculates the distance to a
>>>> fraction
>>>> of a metre. I don't need that kind of precision, so surely there's 
>>>> some
>>>> "shorcut" to be found?
>>>>
>>>> Thanks,
>>>>
>>>> Evan
>>>>
>>>> _______________________________________________
>>>> postgis-users mailing list
>>>> postgis-users at lists.osgeo.org
>>>> http://lists.osgeo.org/cgi-bin/mailman/listinfo/postgis-users
>>>>
>>> _______________________________________________
>>> postgis-users mailing list
>>> postgis-users at lists.osgeo.org
>>> http://lists.osgeo.org/cgi-bin/mailman/listinfo/postgis-users
>>
>> _______________________________________________
>> postgis-users mailing list
>> postgis-users at lists.osgeo.org
>> http://lists.osgeo.org/cgi-bin/mailman/listinfo/postgis-users
>
> _______________________________________________
> postgis-users mailing list
> postgis-users at lists.osgeo.org
> http://lists.osgeo.org/cgi-bin/mailman/listinfo/postgis-users



More information about the postgis-users mailing list