<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
<html>
  <head>
    <meta content="text/html; charset=ISO-8859-1"
      http-equiv="Content-Type">
  </head>
  <body bgcolor="#ffffff" text="#000000">
    Glad you got it working.  <br>
    <br>
    Just a little note of warning ... I noticed you are using "UNION",
    not "UNION ALL".  The difference is that the later will perform a
    simple concatenation of the two results sets.  The former is kind of
    like a set union, returning results that are in the first or second
    query, removing duplicate rows.  In your 8 second query, not a big
    deal, but in your 3 second query, you've include a geometry object. 
    In determining record equivalence to filter duplicate rows, it is
    the = operator that is invoked against two geometry objects, which
    goes back to comparing bounding box equivalence, not geometry
    equivalence.<br>
    <br>
    i.e.<br>
    <tt>SELECT 'POINT(0 0)'::geometry<br>
      UNION<br>
      SELECT 'POINT(0 0.0000001)'::geometry;<br>
                        geometry                  <br>
      --------------------------------------------<br>
       010100000000000000000000000000000000000000<br>
      (1 row)<br>
      <br>
      <br>
      SELECT 'POINT(0 0)'::geometry<br>
      UNION ALL<br>
      SELECT 'POINT(0 0.0000001)'::geometry;<br>
                        geometry                  <br>
      --------------------------------------------<br>
       010100000000000000000000000000000000000000<br>
       0101000000000000000000000048AFBC9AF2D77A3E<br>
      (2 rows)</tt><br>
    <br>
    Not a big deal, but I just wanted to make sure you knew what was
    happening with your query.<br>
    UNION ALL should also be faster as it's not trying to merge two
    results sets together.<br>
    <br>
    Cheers,<br>
    Kevin<br>
    <br>
    <br>
    On 11/30/2010 2:34 PM, Brian Stempin wrote:
    <blockquote
      cite="mid:AANLkTikEqMWgEhB_YUGEuHyyR5jy9smr8wHOHbBQpBpH@mail.gmail.com"
      type="cite">
      <div><br>
      </div>
      So, I decided to run with the following query:
      <div><br>
      </div>
      <div>
        <meta http-equiv="content-type" content="text/html;
          charset=ISO-8859-1">
        <span class="Apple-style-span" style="font-family: monospace;
          font-size: 10px; border-collapse: collapse;">SELECT osm_id <br>
          FROM "OSMData".osm_mn_data_highway_20101129_101234 t1<br>
          WHERE osm_id NOT IN (<br>
          <br>
             -- list all ids where the startpoint intersects something.<br>
             SELECT t1.osm_id<br>
             FROM "OSMData".osm_mn_data_highway_20101129_101234 t1,<br>
             "OSMData".osm_mn_data_highway_20101129_101234 t2<br>
             WHERE t1.osm_id <> t2.osm_id<br>
             AND ST_Intersects(ST_StartPoint(t1.way), t2.way)<br>
          <br>
          )</span></div>
      <div><span class="Apple-style-span" style="font-family: monospace;
          font-size: 10px; border-collapse: collapse;">UNION</span></div>
      <div><span class="Apple-style-span" style="font-family: monospace;
          font-size: 10px; border-collapse: collapse;">
          <meta http-equiv="content-type" content="text/html;
            charset=ISO-8859-1">
          SELECT osm_id <br>
          FROM "OSMData".osm_mn_data_highway_20101129_101234 t1<br>
          WHERE osm_id NOT IN (<br>
          <br>
             -- list all ids where the startpoint intersects something.<br>
             SELECT t1.osm_id<br>
             FROM "OSMData".osm_mn_data_highway_20101129_101234 t1,<br>
             "OSMData".osm_mn_data_highway_20101129_101234 t2<br>
             WHERE t1.osm_id <> t2.osm_id<br>
             AND ST_Intersects(ST_EndPoint(t1.way), t2.way)<br>
          <br>
          )</span></div>
      <div><span class="Apple-style-span" style="font-family: monospace;
          font-size: 10px; border-collapse: collapse;"><br>
        </span></div>
      In my dataset, this takes ~ 8 seconds to run.  Being a bit of a
      performance junkee, I modified it to the following:
      <div><br>
        <div>
          <div style="border-collapse: collapse; font-family: monospace;
            font-size: x-small;"> SELECT osm_id, ST_StartPoint(way)</div>
          <div style="border-collapse: collapse; font-family: monospace;
            font-size: x-small;">FROM
            "OSMData".osm_mn_data_highway_20101129_101234 t1</div>
          <div style="border-collapse: collapse; font-family: monospace;
            font-size: x-small;"> WHERE osm_id NOT IN (</div>
          <div style="border-collapse: collapse; font-family: monospace;
            font-size: x-small;"><br>
          </div>
          <div style="border-collapse: collapse; font-family: monospace;
            font-size: x-small;">   -- list all ids where the startpoint
            intersects something.</div>
          <div style="border-collapse: collapse; font-family: monospace;
            font-size: x-small;">   SELECT t1.osm_id</div>
          <div style="border-collapse: collapse; font-family: monospace;
            font-size: x-small;">   FROM
            "OSMData".osm_mn_data_highway_20101129_101234 t1,</div>
          <div style="border-collapse: collapse; font-family: monospace;
            font-size: x-small;">  
            "OSMData".osm_mn_data_highway_20101129_101234 t2</div>
          <div style="border-collapse: collapse; font-family: monospace;
            font-size: x-small;">    WHERE t1.osm_id <> t2.osm_id</div>
          <div style="border-collapse: collapse; font-family: monospace;
            font-size: x-small;">   AND t2.way ~ t1.way</div>
          <div style="border-collapse: collapse; font-family: monospace;
            font-size: x-small;">    AND
            ST_Intersects(ST_StartPoint(t1.way), t2.way)</div>
          <div style="border-collapse: collapse; font-family: monospace;
            font-size: x-small;"><br>
          </div>
          <div style="border-collapse: collapse; font-family: monospace;
            font-size: x-small;"> )</div>
          <div style="border-collapse: collapse; font-family: monospace;
            font-size: x-small;">UNION</div>
          <div style="border-collapse: collapse; font-family: monospace;
            font-size: x-small;">SELECT osm_id, ST_EndPoint(way)</div>
          <div style="border-collapse: collapse; font-family: monospace;
            font-size: x-small;">FROM
            "OSMData".osm_mn_data_highway_20101129_101234 t1</div>
          <div style="border-collapse: collapse; font-family: monospace;
            font-size: x-small;"> WHERE osm_id NOT IN (</div>
          <div style="border-collapse: collapse; font-family: monospace;
            font-size: x-small;"><br>
          </div>
          <div style="border-collapse: collapse; font-family: monospace;
            font-size: x-small;">   -- list all ids where the startpoint
            intersects something.</div>
          <div style="border-collapse: collapse; font-family: monospace;
            font-size: x-small;">   SELECT t1.osm_id</div>
          <div style="border-collapse: collapse; font-family: monospace;
            font-size: x-small;">   FROM
            "OSMData".osm_mn_data_highway_20101129_101234 t1,</div>
          <div style="border-collapse: collapse; font-family: monospace;
            font-size: x-small;">  
            "OSMData".osm_mn_data_highway_20101129_101234 t2</div>
          <div style="border-collapse: collapse; font-family: monospace;
            font-size: x-small;">    WHERE t1.osm_id <> t2.osm_id</div>
          <div style="border-collapse: collapse; font-family: monospace;
            font-size: x-small;">   AND t2.way ~ t1.way</div>
          <div style="border-collapse: collapse; font-family: monospace;
            font-size: x-small;">    AND
            ST_Intersects(ST_EndPoint(t1.way), t2.way)</div>
          <div style="border-collapse: collapse; font-family: monospace;
            font-size: x-small;"><br>
          </div>
          <div style="border-collapse: collapse; font-family: monospace;
            font-size: x-small;"> )</div>
          <br>
          Note the addition of the "t2.way ~ t1.way" bit.  "A ~ B" means
          "A's bounding box contains B's bounding box."  Since in my
          case B is a point, I eliminate a lot of comparisons by only
          looking at shapes who's bounding box B is contained in.  This
          reduced my query time from 8.x seconds to 3.4x seconds.</div>
      </div>
      <div><br>
      </div>
      <div>Thanks a ton, Kevin!</div>
      <div><br>
      </div>
      <div>Brian</div>
      <div>PS -- anyone else spot any other improvements that I can
        make?  I plan on writing about this in a blog article as part of
        a larger piece later tonight/tomorrow.</div>
      <pre wrap=""><fieldset class="mimeAttachmentHeader"></fieldset>
_______________________________________________
postgis-users mailing list
<a class="moz-txt-link-abbreviated" href="mailto:postgis-users@postgis.refractions.net">postgis-users@postgis.refractions.net</a>
<a class="moz-txt-link-freetext" href="http://postgis.refractions.net/mailman/listinfo/postgis-users">http://postgis.refractions.net/mailman/listinfo/postgis-users</a>
</pre>
    </blockquote>
  </body>
</html>