<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
<html>
<head>
<meta content="text/html; charset=ISO-8859-1"
http-equiv="Content-Type">
</head>
<body bgcolor="#ffffff" text="#000000">
Glad you got it working. <br>
<br>
Just a little note of warning ... I noticed you are using "UNION",
not "UNION ALL". The difference is that the later will perform a
simple concatenation of the two results sets. The former is kind of
like a set union, returning results that are in the first or second
query, removing duplicate rows. In your 8 second query, not a big
deal, but in your 3 second query, you've include a geometry object.
In determining record equivalence to filter duplicate rows, it is
the = operator that is invoked against two geometry objects, which
goes back to comparing bounding box equivalence, not geometry
equivalence.<br>
<br>
i.e.<br>
<tt>SELECT 'POINT(0 0)'::geometry<br>
UNION<br>
SELECT 'POINT(0 0.0000001)'::geometry;<br>
geometry <br>
--------------------------------------------<br>
010100000000000000000000000000000000000000<br>
(1 row)<br>
<br>
<br>
SELECT 'POINT(0 0)'::geometry<br>
UNION ALL<br>
SELECT 'POINT(0 0.0000001)'::geometry;<br>
geometry <br>
--------------------------------------------<br>
010100000000000000000000000000000000000000<br>
0101000000000000000000000048AFBC9AF2D77A3E<br>
(2 rows)</tt><br>
<br>
Not a big deal, but I just wanted to make sure you knew what was
happening with your query.<br>
UNION ALL should also be faster as it's not trying to merge two
results sets together.<br>
<br>
Cheers,<br>
Kevin<br>
<br>
<br>
On 11/30/2010 2:34 PM, Brian Stempin wrote:
<blockquote
cite="mid:AANLkTikEqMWgEhB_YUGEuHyyR5jy9smr8wHOHbBQpBpH@mail.gmail.com"
type="cite">
<div><br>
</div>
So, I decided to run with the following query:
<div><br>
</div>
<div>
<meta http-equiv="content-type" content="text/html;
charset=ISO-8859-1">
<span class="Apple-style-span" style="font-family: monospace;
font-size: 10px; border-collapse: collapse;">SELECT osm_id <br>
FROM "OSMData".osm_mn_data_highway_20101129_101234 t1<br>
WHERE osm_id NOT IN (<br>
<br>
-- list all ids where the startpoint intersects something.<br>
SELECT t1.osm_id<br>
FROM "OSMData".osm_mn_data_highway_20101129_101234 t1,<br>
"OSMData".osm_mn_data_highway_20101129_101234 t2<br>
WHERE t1.osm_id <> t2.osm_id<br>
AND ST_Intersects(ST_StartPoint(t1.way), t2.way)<br>
<br>
)</span></div>
<div><span class="Apple-style-span" style="font-family: monospace;
font-size: 10px; border-collapse: collapse;">UNION</span></div>
<div><span class="Apple-style-span" style="font-family: monospace;
font-size: 10px; border-collapse: collapse;">
<meta http-equiv="content-type" content="text/html;
charset=ISO-8859-1">
SELECT osm_id <br>
FROM "OSMData".osm_mn_data_highway_20101129_101234 t1<br>
WHERE osm_id NOT IN (<br>
<br>
-- list all ids where the startpoint intersects something.<br>
SELECT t1.osm_id<br>
FROM "OSMData".osm_mn_data_highway_20101129_101234 t1,<br>
"OSMData".osm_mn_data_highway_20101129_101234 t2<br>
WHERE t1.osm_id <> t2.osm_id<br>
AND ST_Intersects(ST_EndPoint(t1.way), t2.way)<br>
<br>
)</span></div>
<div><span class="Apple-style-span" style="font-family: monospace;
font-size: 10px; border-collapse: collapse;"><br>
</span></div>
In my dataset, this takes ~ 8 seconds to run. Being a bit of a
performance junkee, I modified it to the following:
<div><br>
<div>
<div style="border-collapse: collapse; font-family: monospace;
font-size: x-small;"> SELECT osm_id, ST_StartPoint(way)</div>
<div style="border-collapse: collapse; font-family: monospace;
font-size: x-small;">FROM
"OSMData".osm_mn_data_highway_20101129_101234 t1</div>
<div style="border-collapse: collapse; font-family: monospace;
font-size: x-small;"> WHERE osm_id NOT IN (</div>
<div style="border-collapse: collapse; font-family: monospace;
font-size: x-small;"><br>
</div>
<div style="border-collapse: collapse; font-family: monospace;
font-size: x-small;"> -- list all ids where the startpoint
intersects something.</div>
<div style="border-collapse: collapse; font-family: monospace;
font-size: x-small;"> SELECT t1.osm_id</div>
<div style="border-collapse: collapse; font-family: monospace;
font-size: x-small;"> FROM
"OSMData".osm_mn_data_highway_20101129_101234 t1,</div>
<div style="border-collapse: collapse; font-family: monospace;
font-size: x-small;">
"OSMData".osm_mn_data_highway_20101129_101234 t2</div>
<div style="border-collapse: collapse; font-family: monospace;
font-size: x-small;"> WHERE t1.osm_id <> t2.osm_id</div>
<div style="border-collapse: collapse; font-family: monospace;
font-size: x-small;"> AND t2.way ~ t1.way</div>
<div style="border-collapse: collapse; font-family: monospace;
font-size: x-small;"> AND
ST_Intersects(ST_StartPoint(t1.way), t2.way)</div>
<div style="border-collapse: collapse; font-family: monospace;
font-size: x-small;"><br>
</div>
<div style="border-collapse: collapse; font-family: monospace;
font-size: x-small;"> )</div>
<div style="border-collapse: collapse; font-family: monospace;
font-size: x-small;">UNION</div>
<div style="border-collapse: collapse; font-family: monospace;
font-size: x-small;">SELECT osm_id, ST_EndPoint(way)</div>
<div style="border-collapse: collapse; font-family: monospace;
font-size: x-small;">FROM
"OSMData".osm_mn_data_highway_20101129_101234 t1</div>
<div style="border-collapse: collapse; font-family: monospace;
font-size: x-small;"> WHERE osm_id NOT IN (</div>
<div style="border-collapse: collapse; font-family: monospace;
font-size: x-small;"><br>
</div>
<div style="border-collapse: collapse; font-family: monospace;
font-size: x-small;"> -- list all ids where the startpoint
intersects something.</div>
<div style="border-collapse: collapse; font-family: monospace;
font-size: x-small;"> SELECT t1.osm_id</div>
<div style="border-collapse: collapse; font-family: monospace;
font-size: x-small;"> FROM
"OSMData".osm_mn_data_highway_20101129_101234 t1,</div>
<div style="border-collapse: collapse; font-family: monospace;
font-size: x-small;">
"OSMData".osm_mn_data_highway_20101129_101234 t2</div>
<div style="border-collapse: collapse; font-family: monospace;
font-size: x-small;"> WHERE t1.osm_id <> t2.osm_id</div>
<div style="border-collapse: collapse; font-family: monospace;
font-size: x-small;"> AND t2.way ~ t1.way</div>
<div style="border-collapse: collapse; font-family: monospace;
font-size: x-small;"> AND
ST_Intersects(ST_EndPoint(t1.way), t2.way)</div>
<div style="border-collapse: collapse; font-family: monospace;
font-size: x-small;"><br>
</div>
<div style="border-collapse: collapse; font-family: monospace;
font-size: x-small;"> )</div>
<br>
Note the addition of the "t2.way ~ t1.way" bit. "A ~ B" means
"A's bounding box contains B's bounding box." Since in my
case B is a point, I eliminate a lot of comparisons by only
looking at shapes who's bounding box B is contained in. This
reduced my query time from 8.x seconds to 3.4x seconds.</div>
</div>
<div><br>
</div>
<div>Thanks a ton, Kevin!</div>
<div><br>
</div>
<div>Brian</div>
<div>PS -- anyone else spot any other improvements that I can
make? I plan on writing about this in a blog article as part of
a larger piece later tonight/tomorrow.</div>
<pre wrap=""><fieldset class="mimeAttachmentHeader"></fieldset>
_______________________________________________
postgis-users mailing list
<a class="moz-txt-link-abbreviated" href="mailto:postgis-users@postgis.refractions.net">postgis-users@postgis.refractions.net</a>
<a class="moz-txt-link-freetext" href="http://postgis.refractions.net/mailman/listinfo/postgis-users">http://postgis.refractions.net/mailman/listinfo/postgis-users</a>
</pre>
</blockquote>
</body>
</html>