<html>
  <head>
    <meta content="text/html; charset=ISO-8859-1"
      http-equiv="Content-Type">
  </head>
  <body text="#000000" bgcolor="#FFFFFF">
    <div class="moz-cite-prefix">Thanks, but I realised later that just
      eliminating the duplicate intersections is not enough, because
      sometimes only part of the intersection is doubled and maybe there
      are other inaccuracies there as well. (Perhaps the line running
      along tile border somehow falls "in-between" tiles when
      re-projected by ST_Intersection?) Anyway, my fix was to detect
      such lines and intersect them with the original polygon instead of
      the tiled one. Since there are relatively few of them it doesn't
      affect the overall time noticeably.<br>
      <br>
      Regards,<br>
      <br>
      Evan<br>
      <br>
      On 11/02/2014 08:39, Rémi Cura wrote:<br>
    </div>
    <blockquote
cite="mid:CAJvUf_vJ5E1pAuSbxz-xchv1TMa2HDra4zbc1CyweArWU6XW5w@mail.gmail.com"
      type="cite">
      <div dir="ltr">
        <div>
          <div>Hey,<br>
          </div>
          <div>geometry equality can be defined in many ways.<br>
            <br>
            <br>
          </div>
          <div>For your duplication problem, it is a simple postgres
            problem :<br>
          </div>
          <div>you want that for any couple (line, poly), you have at
            most one result (given polygon are convex, which they are if
            they are squares).<br>
            <br>
          </div>
          <div>so at the end of you computing you just add a filtering
            select :<br>
            SELECT DISTINCT ON (line_id,poly_id)<br>
            <br>
          </div>
          <div>If you don't want random result, you should order your
            result to know which line of the 4 you get (you could order
            by length, centroid, first point, whatever.)<br>
            <br>
          </div>
          <div>WITH (your computing)<br>
          </div>
          <div>SELECT DISTINCT ON (line_id, poly_id) , poly, line<br>
          </div>
          <div>FROM your computing<br>
          </div>
          <div>ORDER BY ST_Length(line) ASC<br>
            <br>
          </div>
          <div>Now if you want to solve this the PostGIS way it would be
            harder, as you would need a distance between shape.<br>
          </div>
          <div>(like <a moz-do-not-send="true"
              href="http://postgis.net/docs/ST_HausdorffDistance.html">http://postgis.net/docs/ST_HausdorffDistance.html</a>)<br>
            <br>
          </div>
          <div>Of course you could also use implicit distance, for
            example by snapping your result line to a given precision
            and doing a regular test after <br>
          </div>
          <div>(WHERE ST_Equals(ST_SnapToGrid(line1,
            0.01),ST_SnapToGrid(line2, 0.01))=TRUE )<br>
          </div>
          <div><br>
          </div>
          <div>Cheers,<br>
            <br>
            Rémi-C<br>
          </div>
          <div><br>
          </div>
          <div><br>
          </div>
        </div>
      </div>
      <div class="gmail_extra"><br>
        <br>
        <div class="gmail_quote">2014-02-10 21:33 GMT+01:00 Evan Martin
          <span dir="ltr"><<a moz-do-not-send="true"
              href="mailto:postgresql@realityexists.net" target="_blank">postgresql@realityexists.net</a>></span>:<br>
          <blockquote class="gmail_quote" style="margin:0 0 0
            .8ex;border-left:1px #ccc solid;padding-left:1ex">
            I've discovered a slight problem with the handy "tiled
            intersection" trick suggested earlier: some of my lines run
            exactly along a meridian or along a parallel and so do the
            tiles, so those intersections get counted twice! For
            example, LINESTRING(-18 14.5,-18 15.5) results in the
            following intersections with a particular tiled polygon<br>
            <br>
            LINESTRING(-17.9999148001148 14.7502863979935,-17.9998863986648
            15.0000002498516)<br>
            LINESTRING(-18.0000851998852 14.7502863979933,-18.0001136013352
            15.0000002498516)<br>
            <br>
            LINESTRING(-18.0001136013352 15.0000002498516,-18.0000852013123
            15.2502965758817)<br>
            LINESTRING(-17.9998863986648 15.0000002498516,-17.9999147986877
            15.250296575882)<br>
            <br>
            Could someone suggest the best (fastest while still
            accurate) way to filter out such duplicates? As you can see,
            they're not exactly the same, so ST_Equals() returns false
            on them.<br>
            <br>
            Evan<br>
            <br>
            On 08/07/2013 17:36, Evan Martin wrote:<br>
            <blockquote class="gmail_quote" style="margin:0 0 0
              .8ex;border-left:1px #ccc solid;padding-left:1ex">
              Thanks, Steve, that seems to do the trick. Of course the
              results change a bit, so it's a trade-off of accuracy vs.
              speed. I presume the change is because I do the tiling on
              the plane - ST_Intersection(geom, geom). When I tried
              doing tiling on geography the results changed much more
              (compared to no tiling). Would be interesting to
              understand why that is. Am I doing something wrong? I
              create a grid of 1x1 degree polygons and then do this:<br>
              <br>
                  SELECT poly_id, ST_Intersection(poly_border::geometry,
              tile)::geography AS poly_tile<br>
                  FROM my_polygon p<br>
                  JOIN world_tile_1 t ON ST_Intersects(p.border::geometry,
              t.tile)<br>
              <br>
              The intersection with lines is then done on geography, as
              before. I only do this for polygons that don't span the
              dateline (which is 99% of them, luckily).<br>
              <br>
              Evan<br>
              <br>
              On <a moz-do-not-send="true" href="tel:06.07.2013%2021"
                value="+33607201321" target="_blank">06.07.2013 21</a>:19,
              Stephen Woodbridge wrote:<br>
              <blockquote class="gmail_quote" style="margin:0 0 0
                .8ex;border-left:1px #ccc solid;padding-left:1ex">
                The standard way of dealing this this is to chop you
                really large polygons into tiles. Or if the
                multipolygons can be split into multiple individual
                polygons you might get better performance.<br>
                <br>
                google: postgis tiling large polygons<br>
                <br>
                if you need the distance that the line intersects the
                multiple tiles or multiple split multipolygons you will
                need to sum() and group on the original id of the split
                object.<br>
                <br>
                -Steve<br>
                <br>
                On 7/6/2013 1:10 PM, Evan Martin wrote:<br>
                <blockquote class="gmail_quote" style="margin:0 0 0
                  .8ex;border-left:1px #ccc solid;padding-left:1ex">
                  It's not really "many large things vs many large
                  things". Most lines are<br>
                  < 100 km long (but there are some over 1000 km).
                  Here's a percentile<br>
                  chart: <a moz-do-not-send="true"
                    href="https://imageshack.us/a/img16/940/w5s.png"
                    target="_blank">https://imageshack.us/a/img16/940/w5s.png</a><br>
                  <br>
                  Most of the polygons are also quite small and simple,
                  but there are a<br>
                  few really large complex ones. From my testing it
                  looks like a few of<br>
                  the "worst" polygons (multi-polygons, actually) take
                  all the time, so<br>
                  that 25,000 count was a bit misleading. 96% of them
                  have < 100 points,<br>
                  but the worst one has > 23,000. I couldn't get the
                  area, because<br>
                  ST_Area(geog) is returning some ridiculously high
                  numbers, but it would<br>
                  be millions of sq km.<br>
                  <br>
                  On 06.07.2013 5:48, Paul Ramsey wrote:<br>
                  <blockquote class="gmail_quote" style="margin:0 0 0
                    .8ex;border-left:1px #ccc solid;padding-left:1ex">
                    Without seeing your data it's quite hard to say.
                    Many large things vs<br>
                    many large things yields a problem where indexes and
                    so on don't have<br>
                    a lot of leverage on the problem.<br>
                    <br>
                    On Tue, Jul 2, 2013 at 6:39 AM, Evan Martin<br>
                    <<a moz-do-not-send="true"
                      href="mailto:postgresql@realityexists.net"
                      target="_blank">postgresql@realityexists.net</a>>
                    wrote:<br>
                    <blockquote class="gmail_quote" style="margin:0 0 0
                      .8ex;border-left:1px #ccc solid;padding-left:1ex">
                      Hi,<br>
                      <br>
                      I have tables of ~25,000 polygons and ~80,000
                      lines and I want to<br>
                      find which<br>
                      lines intersect which polygons using PostGIS 2.1.
                      Both are<br>
                      geographies and<br>
                      can span the dateline. Doing this the simple way
                      using<br>
                      ST_Intersects(geog,<br>
                      geog) takes about 3 hours on my machine and I'd to
                      see if there's a<br>
                      way to<br>
                      speed this up.<br>
                      <br>
                      I already have indexes on the geography columns
                      and one of them is being<br>
                      used (the one on the lines). Each line only has 2
                      points, but the<br>
                      polygons<br>
                      have anywhere from 4 to 20,000 points and some of
                      them are very<br>
                      large. It<br>
                      would be OK to miss some of the smaller
                      intersections (ie. where the two<br>
                      only just barely intersect), but I wouldn't want
                      the query to return<br>
                      false<br>
                      positives. In fact, ideally, I'd like to find only
                      the lines that<br>
                      "substantially" intersect a polygon, eg. at least
                      x km or x% of the<br>
                      line is<br>
                      in the polygon, but finding any intersections at
                      all would be a start.<br>
                      <br>
                      One trick I tried is ST_SimplifyPreserveTopology.
                      I used that to create<br>
                      simplified version of the polygons (at least those
                      that don't span the<br>
                      dateline) and check those first, then if they
                      intersect then check<br>
                      the real<br>
                      polygons. This seems to work, but the performance
                      gains are marginal<br>
                      compared to the simple approach.<br>
                      <br>
                      Is there another trick I can use to do this
                      faster? I know<br>
                      ST_Intersects()<br>
                      internally calls ST_Distance(), which calculates
                      the distance to a<br>
                      fraction<br>
                      of a metre. I don't need that kind of precision,
                      so surely there's some<br>
                      "shorcut" to be found?<br>
                      <br>
                      Thanks,<br>
                      <br>
                      Evan<br>
                    </blockquote>
                  </blockquote>
                </blockquote>
              </blockquote>
            </blockquote>
            <br>
            _______________________________________________<br>
            postgis-users mailing list<br>
            <a moz-do-not-send="true"
              href="mailto:postgis-users@lists.osgeo.org"
              target="_blank">postgis-users@lists.osgeo.org</a><br>
            <a moz-do-not-send="true"
              href="http://lists.osgeo.org/cgi-bin/mailman/listinfo/postgis-users"
              target="_blank">http://lists.osgeo.org/cgi-bin/mailman/listinfo/postgis-users</a><br>
          </blockquote>
        </div>
        <br>
      </div>
      <br>
      <fieldset class="mimeAttachmentHeader"></fieldset>
      <br>
      <pre wrap="">_______________________________________________
postgis-users mailing list
<a class="moz-txt-link-abbreviated" href="mailto:postgis-users@lists.osgeo.org">postgis-users@lists.osgeo.org</a>
<a class="moz-txt-link-freetext" href="http://lists.osgeo.org/cgi-bin/mailman/listinfo/postgis-users">http://lists.osgeo.org/cgi-bin/mailman/listinfo/postgis-users</a></pre>
    </blockquote>
    <br>
  </body>
</html>