[postgis-users] Tricks to find polygon/line intersection faster

Evan Martin postgresql at realityexists.net
Tue Feb 11 14:39:45 PST 2014


Thanks, but I realised later that just eliminating the duplicate 
intersections is not enough, because sometimes only part of the 
intersection is doubled and maybe there are other inaccuracies there as 
well. (Perhaps the line running along tile border somehow falls 
"in-between" tiles when re-projected by ST_Intersection?) Anyway, my fix 
was to detect such lines and intersect them with the original polygon 
instead of the tiled one. Since there are relatively few of them it 
doesn't affect the overall time noticeably.

Regards,

Evan

On 11/02/2014 08:39, Rémi Cura wrote:
> Hey,
> geometry equality can be defined in many ways.
>
>
> For your duplication problem, it is a simple postgres problem :
> you want that for any couple (line, poly), you have at most one result 
> (given polygon are convex, which they are if they are squares).
>
> so at the end of you computing you just add a filtering select :
> SELECT DISTINCT ON (line_id,poly_id)
>
> If you don't want random result, you should order your result to know 
> which line of the 4 you get (you could order by length, centroid, 
> first point, whatever.)
>
> WITH (your computing)
> SELECT DISTINCT ON (line_id, poly_id) , poly, line
> FROM your computing
> ORDER BY ST_Length(line) ASC
>
> Now if you want to solve this the PostGIS way it would be harder, as 
> you would need a distance between shape.
> (like http://postgis.net/docs/ST_HausdorffDistance.html)
>
> Of course you could also use implicit distance, for example by 
> snapping your result line to a given precision and doing a regular 
> test after
> (WHERE ST_Equals(ST_SnapToGrid(line1, 0.01),ST_SnapToGrid(line2, 
> 0.01))=TRUE )
>
> Cheers,
>
> Rémi-C
>
>
>
>
> 2014-02-10 21:33 GMT+01:00 Evan Martin <postgresql at realityexists.net 
> <mailto:postgresql at realityexists.net>>:
>
>     I've discovered a slight problem with the handy "tiled
>     intersection" trick suggested earlier: some of my lines run
>     exactly along a meridian or along a parallel and so do the tiles,
>     so those intersections get counted twice! For example,
>     LINESTRING(-18 14.5,-18 15.5) results in the following
>     intersections with a particular tiled polygon
>
>     LINESTRING(-17.9999148001148 14.7502863979935,-17.9998863986648
>     15.0000002498516)
>     LINESTRING(-18.0000851998852 14.7502863979933,-18.0001136013352
>     15.0000002498516)
>
>     LINESTRING(-18.0001136013352 15.0000002498516,-18.0000852013123
>     15.2502965758817)
>     LINESTRING(-17.9998863986648 15.0000002498516,-17.9999147986877
>     15.250296575882)
>
>     Could someone suggest the best (fastest while still accurate) way
>     to filter out such duplicates? As you can see, they're not exactly
>     the same, so ST_Equals() returns false on them.
>
>     Evan
>
>     On 08/07/2013 17:36, Evan Martin wrote:
>
>         Thanks, Steve, that seems to do the trick. Of course the
>         results change a bit, so it's a trade-off of accuracy vs.
>         speed. I presume the change is because I do the tiling on the
>         plane - ST_Intersection(geom, geom). When I tried doing tiling
>         on geography the results changed much more (compared to no
>         tiling). Would be interesting to understand why that is. Am I
>         doing something wrong? I create a grid of 1x1 degree polygons
>         and then do this:
>
>             SELECT poly_id, ST_Intersection(poly_border::geometry,
>         tile)::geography AS poly_tile
>             FROM my_polygon p
>             JOIN world_tile_1 t ON ST_Intersects(p.border::geometry,
>         t.tile)
>
>         The intersection with lines is then done on geography, as
>         before. I only do this for polygons that don't span the
>         dateline (which is 99% of them, luckily).
>
>         Evan
>
>         On 06.07.2013 21 <tel:06.07.2013%2021>:19, Stephen Woodbridge
>         wrote:
>
>             The standard way of dealing this this is to chop you
>             really large polygons into tiles. Or if the multipolygons
>             can be split into multiple individual polygons you might
>             get better performance.
>
>             google: postgis tiling large polygons
>
>             if you need the distance that the line intersects the
>             multiple tiles or multiple split multipolygons you will
>             need to sum() and group on the original id of the split
>             object.
>
>             -Steve
>
>             On 7/6/2013 1:10 PM, Evan Martin wrote:
>
>                 It's not really "many large things vs many large
>                 things". Most lines are
>                 < 100 km long (but there are some over 1000 km).
>                 Here's a percentile
>                 chart: https://imageshack.us/a/img16/940/w5s.png
>
>                 Most of the polygons are also quite small and simple,
>                 but there are a
>                 few really large complex ones. From my testing it
>                 looks like a few of
>                 the "worst" polygons (multi-polygons, actually) take
>                 all the time, so
>                 that 25,000 count was a bit misleading. 96% of them
>                 have < 100 points,
>                 but the worst one has > 23,000. I couldn't get the
>                 area, because
>                 ST_Area(geog) is returning some ridiculously high
>                 numbers, but it would
>                 be millions of sq km.
>
>                 On 06.07.2013 5:48, Paul Ramsey wrote:
>
>                     Without seeing your data it's quite hard to say.
>                     Many large things vs
>                     many large things yields a problem where indexes
>                     and so on don't have
>                     a lot of leverage on the problem.
>
>                     On Tue, Jul 2, 2013 at 6:39 AM, Evan Martin
>                     <postgresql at realityexists.net
>                     <mailto:postgresql at realityexists.net>> wrote:
>
>                         Hi,
>
>                         I have tables of ~25,000 polygons and ~80,000
>                         lines and I want to
>                         find which
>                         lines intersect which polygons using PostGIS
>                         2.1. Both are
>                         geographies and
>                         can span the dateline. Doing this the simple
>                         way using
>                         ST_Intersects(geog,
>                         geog) takes about 3 hours on my machine and
>                         I'd to see if there's a
>                         way to
>                         speed this up.
>
>                         I already have indexes on the geography
>                         columns and one of them is being
>                         used (the one on the lines). Each line only
>                         has 2 points, but the
>                         polygons
>                         have anywhere from 4 to 20,000 points and some
>                         of them are very
>                         large. It
>                         would be OK to miss some of the smaller
>                         intersections (ie. where the two
>                         only just barely intersect), but I wouldn't
>                         want the query to return
>                         false
>                         positives. In fact, ideally, I'd like to find
>                         only the lines that
>                         "substantially" intersect a polygon, eg. at
>                         least x km or x% of the
>                         line is
>                         in the polygon, but finding any intersections
>                         at all would be a start.
>
>                         One trick I tried is
>                         ST_SimplifyPreserveTopology. I used that to create
>                         simplified version of the polygons (at least
>                         those that don't span the
>                         dateline) and check those first, then if they
>                         intersect then check
>                         the real
>                         polygons. This seems to work, but the
>                         performance gains are marginal
>                         compared to the simple approach.
>
>                         Is there another trick I can use to do this
>                         faster? I know
>                         ST_Intersects()
>                         internally calls ST_Distance(), which
>                         calculates the distance to a
>                         fraction
>                         of a metre. I don't need that kind of
>                         precision, so surely there's some
>                         "shorcut" to be found?
>
>                         Thanks,
>
>                         Evan
>
>
>     _______________________________________________
>     postgis-users mailing list
>     postgis-users at lists.osgeo.org <mailto:postgis-users at lists.osgeo.org>
>     http://lists.osgeo.org/cgi-bin/mailman/listinfo/postgis-users
>
>
>
>
> _______________________________________________
> postgis-users mailing list
> postgis-users at lists.osgeo.org
> http://lists.osgeo.org/cgi-bin/mailman/listinfo/postgis-users

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.osgeo.org/pipermail/postgis-users/attachments/20140211/668171dc/attachment.html>


More information about the postgis-users mailing list