[postgis-users] Query to select dangling line segments

Brian Stempin brian.stempin at gmail.com
Tue Nov 30 14:34:45 PST 2010


So, I decided to run with the following query:

SELECT osm_id
FROM "OSMData".osm_mn_data_highway_20101129_101234 t1
WHERE osm_id NOT IN (

   -- list all ids where the startpoint intersects something.
   SELECT t1.osm_id
   FROM "OSMData".osm_mn_data_highway_20101129_101234 t1,
   "OSMData".osm_mn_data_highway_20101129_101234 t2
   WHERE t1.osm_id <> t2.osm_id
   AND ST_Intersects(ST_StartPoint(t1.way), t2.way)

)
UNION
SELECT osm_id
FROM "OSMData".osm_mn_data_highway_20101129_101234 t1
WHERE osm_id NOT IN (

   -- list all ids where the startpoint intersects something.
   SELECT t1.osm_id
   FROM "OSMData".osm_mn_data_highway_20101129_101234 t1,
   "OSMData".osm_mn_data_highway_20101129_101234 t2
   WHERE t1.osm_id <> t2.osm_id
   AND ST_Intersects(ST_EndPoint(t1.way), t2.way)

)

In my dataset, this takes ~ 8 seconds to run.  Being a bit of a performance
junkee, I modified it to the following:

SELECT osm_id, ST_StartPoint(way)
FROM "OSMData".osm_mn_data_highway_20101129_101234 t1
WHERE osm_id NOT IN (

   -- list all ids where the startpoint intersects something.
   SELECT t1.osm_id
   FROM "OSMData".osm_mn_data_highway_20101129_101234 t1,
   "OSMData".osm_mn_data_highway_20101129_101234 t2
   WHERE t1.osm_id <> t2.osm_id
   AND t2.way ~ t1.way
   AND ST_Intersects(ST_StartPoint(t1.way), t2.way)

)
UNION
SELECT osm_id, ST_EndPoint(way)
FROM "OSMData".osm_mn_data_highway_20101129_101234 t1
WHERE osm_id NOT IN (

   -- list all ids where the startpoint intersects something.
   SELECT t1.osm_id
   FROM "OSMData".osm_mn_data_highway_20101129_101234 t1,
   "OSMData".osm_mn_data_highway_20101129_101234 t2
   WHERE t1.osm_id <> t2.osm_id
   AND t2.way ~ t1.way
   AND ST_Intersects(ST_EndPoint(t1.way), t2.way)

)

Note the addition of the "t2.way ~ t1.way" bit.  "A ~ B" means "A's bounding
box contains B's bounding box."  Since in my case B is a point, I eliminate
a lot of comparisons by only looking at shapes who's bounding box B is
contained in.  This reduced my query time from 8.x seconds to 3.4x seconds.

Thanks a ton, Kevin!

Brian
PS -- anyone else spot any other improvements that I can make?  I plan on
writing about this in a blog article as part of a larger piece later
tonight/tomorrow.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.osgeo.org/pipermail/postgis-users/attachments/20101130/4da3ef3d/attachment.html>


More information about the postgis-users mailing list