[postgis-users] Tricks to find polygon/line intersection faster

Stephen Woodbridge woodbri at swoodbridge.com
Mon Jul 8 08:52:17 PDT 2013


Evan,

I don't work with geography types that much but I'm guessing the error 
is a result of the fact that your 1 degree edges are straight lines in 
geographic projection (ie: unprojected lat/lon) but when you use a 
geographic they really should be curved lines but the edge has only the 
end points so they get projected internally to the geography projection 
and the edge is a straight line between the endpoints rather than a 
curve, hence the error. If your lines are long with just the endpoints 
also that will generate an error like this too. If you add points along 
the edges and along your lines then when they are projected by the 
geography functions they will more accurately follow the curve in space 
that you would expect them to. Look at function ST_Segmentize().

If I'm wrong about how this works, someone jump in here please.

-Steve

On 7/8/2013 11:36 AM, Evan Martin wrote:
> Thanks, Steve, that seems to do the trick. Of course the results change
> a bit, so it's a trade-off of accuracy vs. speed. I presume the change
> is because I do the tiling on the plane - ST_Intersection(geom, geom).
> When I tried doing tiling on geography the results changed much more
> (compared to no tiling). Would be interesting to understand why that is.
> Am I doing something wrong? I create a grid of 1x1 degree polygons and
> then do this:
>
>      SELECT poly_id, ST_Intersection(poly_border::geometry,
> tile)::geography AS poly_tile
>      FROM my_polygon p
>      JOIN world_tile_1 t ON ST_Intersects(p.border::geometry, t.tile)
>
> The intersection with lines is then done on geography, as before. I only
> do this for polygons that don't span the dateline (which is 99% of them,
> luckily).
>
> Evan
>
> On 06.07.2013 21:19, Stephen Woodbridge wrote:
>> The standard way of dealing this this is to chop you really large
>> polygons into tiles. Or if the multipolygons can be split into
>> multiple individual polygons you might get better performance.
>>
>> google: postgis tiling large polygons
>>
>> if you need the distance that the line intersects the multiple tiles
>> or multiple split multipolygons you will need to sum() and group on
>> the original id of the split object.
>>
>> -Steve
>>
>> On 7/6/2013 1:10 PM, Evan Martin wrote:
>>> It's not really "many large things vs many large things". Most lines are
>>> < 100 km long (but there are some over 1000 km). Here's a percentile
>>> chart: https://imageshack.us/a/img16/940/w5s.png
>>>
>>> Most of the polygons are also quite small and simple, but there are a
>>> few really large complex ones. From my testing it looks like a few of
>>> the "worst" polygons (multi-polygons, actually) take all the time, so
>>> that 25,000 count was a bit misleading. 96% of them have < 100 points,
>>> but the worst one has > 23,000. I couldn't get the area, because
>>> ST_Area(geog) is returning some ridiculously high numbers, but it would
>>> be millions of sq km.
>>>
>>> On 06.07.2013 5:48, Paul Ramsey wrote:
>>>> Without seeing your data it's quite hard to say. Many large things vs
>>>> many large things yields a problem where indexes and so on don't have
>>>> a lot of leverage on the problem.
>>>>
>>>> On Tue, Jul 2, 2013 at 6:39 AM, Evan Martin
>>>> <postgresql at realityexists.net> wrote:
>>>>> Hi,
>>>>>
>>>>> I have tables of ~25,000 polygons and ~80,000 lines and I want to
>>>>> find which
>>>>> lines intersect which polygons using PostGIS 2.1. Both are
>>>>> geographies and
>>>>> can span the dateline. Doing this the simple way using
>>>>> ST_Intersects(geog,
>>>>> geog) takes about 3 hours on my machine and I'd to see if there's a
>>>>> way to
>>>>> speed this up.
>>>>>
>>>>> I already have indexes on the geography columns and one of them is
>>>>> being
>>>>> used (the one on the lines). Each line only has 2 points, but the
>>>>> polygons
>>>>> have anywhere from 4 to 20,000 points and some of them are very
>>>>> large. It
>>>>> would be OK to miss some of the smaller intersections (ie. where
>>>>> the two
>>>>> only just barely intersect), but I wouldn't want the query to return
>>>>> false
>>>>> positives. In fact, ideally, I'd like to find only the lines that
>>>>> "substantially" intersect a polygon, eg. at least x km or x% of the
>>>>> line is
>>>>> in the polygon, but finding any intersections at all would be a start.
>>>>>
>>>>> One trick I tried is ST_SimplifyPreserveTopology. I used that to
>>>>> create
>>>>> simplified version of the polygons (at least those that don't span the
>>>>> dateline) and check those first, then if they intersect then check
>>>>> the real
>>>>> polygons. This seems to work, but the performance gains are marginal
>>>>> compared to the simple approach.
>>>>>
>>>>> Is there another trick I can use to do this faster? I know
>>>>> ST_Intersects()
>>>>> internally calls ST_Distance(), which calculates the distance to a
>>>>> fraction
>>>>> of a metre. I don't need that kind of precision, so surely there's
>>>>> some
>>>>> "shorcut" to be found?
>>>>>
>>>>> Thanks,
>>>>>
>>>>> Evan
>>>>>
>>>>> _______________________________________________
>>>>> postgis-users mailing list
>>>>> postgis-users at lists.osgeo.org
>>>>> http://lists.osgeo.org/cgi-bin/mailman/listinfo/postgis-users
>>>>>
>>>> _______________________________________________
>>>> postgis-users mailing list
>>>> postgis-users at lists.osgeo.org
>>>> http://lists.osgeo.org/cgi-bin/mailman/listinfo/postgis-users
>>>
>>> _______________________________________________
>>> postgis-users mailing list
>>> postgis-users at lists.osgeo.org
>>> http://lists.osgeo.org/cgi-bin/mailman/listinfo/postgis-users
>>
>> _______________________________________________
>> postgis-users mailing list
>> postgis-users at lists.osgeo.org
>> http://lists.osgeo.org/cgi-bin/mailman/listinfo/postgis-users
>
> _______________________________________________
> postgis-users mailing list
> postgis-users at lists.osgeo.org
> http://lists.osgeo.org/cgi-bin/mailman/listinfo/postgis-users



More information about the postgis-users mailing list