[postgis-users] Find n Nearest Neighbors for given Point using PostGIS?

Stephen Woodbridge woodbri at swoodbridge.com
Thu Feb 24 20:52:08 PST 2011


Scholle,

The only way you can do nearest neighbor searches the are fast is to 
write a stored procedure that expands the search if you fail to get the 
number of  results you want. So in pseudo code something like:


radius := 0.01; -- assuming degrees
loop
   select into cnt count(*) from mytable
    where expand(mypnt, radius) && the_geom limit 5;
   if found and cnt = 5 or radius > maxradius then
     for rr in select * from mytable
                where expand(mypnt, radius) && the_geom limit 5
       loop
         return rr;
       end loop;
     return;
   else
     radius := radius * 2;
   end if;
end loop;

So make a set returning function with the body something like this and 
you should get good performance. Because postgresql, does a really good 
job of caching pages and query results you will not pay much of a 
penalty for the repeated queries.

-Steve

On 2/24/2011 11:04 PM, Scholle wrote:
>
> I am trying to solve the problem of finding the n nearest neighbors using
> PostGIS:
>
> Starting Point:
>
>   - Table geoname with geonames (from geonames.org) containing
> latitude/longitude (WSG-84)
>   - Added a GeometryColumn geom with srid=4326 and datatype=POINT
>   - Filled geom with values: UPDATE geoname SET geom =
> ST_SetSRID(ST_Point(longitude,latitude) 4326);
>   - Created GIST index for geom (CREATE INDEX geom_index ON geoname USING
> GIST (geom);) / Clustered geom_index: CLUSTER geom_index ON geoname;)
>   - Created PRIMARY KEY UNIQUE BTREE index for geonameid
>
> Problem:
> Find n (e.g. 5) nearest neighbors for a given Point in table geoname
> represented by id (geoname.geonameid.
>
> Possible solution:
>
> Inspired by
> http://www.bostongis.com/PrinterFriendly.aspx?content_name=postgis_nearest_neighbor,
> I tried the following query:
>
> "SELECT start.asciiname, ende.asciiname, distance_sphere(start.geom,
> ende.geom) as distance " +
> "FROM geoname As start, geoname As ende WHERE start.geonameid = 2950159 AND
> start.geonameid<>  ende.geonameid " +
> "AND ST_DWithin(start.geom, ende.geom, 300) order by distance limit 5"
>
> Processing time: about 60s
>
> Also tried an approach based on EXPAND:
>
> "SELECT start.asciiname, ende.asciiname, distance_sphere(start.geom,
> ende.geom) as distance " +
> "FROM geoname As start, geoname As ende WHERE start.geonameid = 2950159 AND
> start.geonameid<>  ende.geonameid AND expand(start.geom, 300)&&  ende.geom "
> +
> "order by distance limit 5"

The problem here is that you are expanding your search by 300 degrees if 
your data is in WGS84.

> Processing time: about 120s
>
> The intended application is some kind of autocomplete. So, any approach
> taking longer than<1s is not applicable. Is it generally possible to
> achieve such a response time with PostGIS?




More information about the postgis-users mailing list