[postgis-users] PostGIS KNN best practices
    Stephen V. Mather 
    svm at clevelandmetroparks.com
       
    Thu May 17 07:42:45 PDT 2012
    
    
  
Cool,
                That approach is about twice as fast as my previous approach
for comparing points to lines, e.g.:
 
 
CREATE OR REPLACE FUNCTION angle_to_street (geometry) RETURNS double
precision AS $$
 
WITH index_query as
                (SELECT ST_Distance($1,road.geom) as dist,
                                degrees(ST_Azimuth($1, ST_ClosestPoint($1,
road.geom))) as azimuth
                FROM street_centerlines As road
                ORDER BY $1 <#> road.geom limit 5)
SELECT azimuth
                FROM index_query
                ORDER BY dist
LIMIT 1;
 
$$ LANGUAGE SQL;
 
DROP TABLE IF EXISTS address_points_rot;
CREATE TABLE address_points_rot AS
                SELECT addr.*, angle_to_street(addr.geom)
                FROM
                                address_points addr;
 
 
 
http://www.clemetparks.com/images/esig/cmp-ms-90x122.pngStephen Mather
Geographic Information Systems (GIS) Manager
(216) 635-3243
svm at clevelandmetroparks.com
 <http://www.clemetparks.com/> clevelandmetroparks.com
 
 
 
 
From: postgis-users-bounces at postgis.refractions.net
[mailto:postgis-users-bounces at postgis.refractions.net] On Behalf Of
Alexandre Neto
Sent: Wednesday, May 16, 2012 7:35 AM
To: PostGIS Users Discussion
Subject: Re: [postgis-users] PostGIS KNN best practices
 
I have been around that question to.
 
http://gis.stackexchange.com/questions/24456/nearest-neighbor-problem-in-pos
tgis-2-0-using-gist-index-function 
 
You have to do it in two steps, like is explained in the operator page
<http://postgis.refractions.net/docs/geometry_distance_centroid.html> . One
faster step to reduce the candidates (by using <-> or <#>) and second one to
get the real distances with ST_Distance.
 
The problem in finding the KNN for each row in a table is the fact that the
gist index <-> operator only works if one of the geometries is constant. The
workaround would be to create a SQL function to apply to each of the rows
using table.the_geom as a parameter.
 
Something like this:
 
----
CREATE OR REPLACE FUNCTION _enn2 (geometry) RETURNS double precision AS $$
 
WITH index_query as
(SELECT ST_Distance($1,f.the_geom) as dist
FROM "grelha5m" As f
ORDER BY $1 <#> g1.the_geom limit 1000)
SELECT dist
FROM index_query
ORDER BY dist;
 
$$ LANGUAGE SQL;
---
 
and I call it like this: 
 
---
Select c.gid as gid, _enn2(c.the_geom) as enn
>From cosn1 as c
Order by c.gid
---
 
In this case the function returned the smallest distance, but you can choose
any other column.
 
Hope it helps
 
Alexandre Neto
 
On Tue, May 15, 2012 at 5:37 PM, Stephen V. Mather
<svm at clevelandmetroparks.com> wrote:
Hi All,
                Pretty excited by the new operators <-> and <#>, but a bit
confused as to how to use them in a query.  The two examples from P. Ramsey
back in late 2011 (
http://blog.opengeo.org/2011/09/28/indexed-nearest-neighbour-search-in-postg
is/ ) included doing a KNN on a single point to a cloud of points, i.e. 
 
SELECT name, gid
FROM geonames
ORDER BY geom <-> st_setsrid(st_makepoint(-90,40),4326)
LIMIT 10;
 
or doing KNN on non-point different geometries, where the first neighbor by
<-> or <#> might not be truly the first i.e.
 
with index_query as (
  select
    st_distance(geom, 'SRID=3005;POINT(1011102 450541)') as distance,
    parcel_id, address
  from parcels
  order by geom <#> 'SRID=3005;POINT(1011102 450541)' limit 100
)
select * from index_query order by distance limit 10;
 
So, how would one grab the first nearest neighbor for all points in a
dataset?  This is how I used to do it:
 
CREATE TABLE n2180_560_height AS 
SELECT x, y, height FROM 
(SELECT DISTINCT ON(veg.gid) veg.gid as gid, ground.gid as gid_ground, veg.x
as x, veg.y as y, ground.z as z, veg.z - ground.z as height, veg.the_geom as
geometry, veg.class as class
FROM (SELECT * FROM n2180_560 WHERE class = 5) As veg, (SELECT * FROM
n2180_560 WHERE class = 2) As ground
WHERE veg.class = 5 AND veg.gid <> ground.gid AND ST_DWithin(veg.the_geom,
ground.the_geom, 10)
ORDER BY veg.gid, ST_Distance(veg.the_geom,ground.the_geom)) AS vegpoints;
 
ST_DWithin prevents a full cross join, but is a sloppy way to do this, as it
requires a priori knowledge of the end cases.  I'm hoping there's a subquery
or some such magic that would allow me to use the distance operator to a
similar end. .
 
Thanks,
Best,
Steve
 
http://www.clemetparks.com/images/esig/cmp-ms-90x122.pngStephen Mather
Geographic Information Systems (GIS) Manager
(216) <tel:%28216%29%20635-3243>  635-3243
svm at clevelandmetroparks.com
 <http://www.clemetparks.com/> clevelandmetroparks.com
 
 
 
 
_______________________________________________
postgis-users mailing list
postgis-users at postgis.refractions.net
http://postgis.refractions.net/mailman/listinfo/postgis-users
 
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.osgeo.org/pipermail/postgis-users/attachments/20120517/e7d3ae82/attachment.html>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: image001.png
Type: image/png
Size: 3772 bytes
Desc: not available
URL: <http://lists.osgeo.org/pipermail/postgis-users/attachments/20120517/e7d3ae82/attachment.png>
    
    
More information about the postgis-users
mailing list