[postgis-users] How to Lookup Nearest Locations from a Table

Shlomi Fish shlomif at iglu.org.il
Mon Dec 20 07:18:27 PST 2010


Hi Micha,

thanks for your E-mail and the attached code. Also thanks for giving the talk 
about the GIS software to the Tel Aviv Open Source club in the past.

On Monday 20 December 2010 00:20:41 Micha Silver wrote:
> On 12/16/2010 06:05 PM, Shlomi Fish wrote:
> > Hi all.
> > 
> > We have a table with a geography column. We'd like, for any given row, to
> > return the N nearest *other* rows.  We saw ST_Distance, but is that the
> > only approach? It feels like it will be slow and will not scale well.
> > 
> > Is there anything more magical to use? We're pretty new to PostGIS.
> 
> Hi Shlomi:
> You question piqued my curiosity so I did some testing.
> First off, If you don't know anything about the data (i.e. density of
> features) and you have to rely just on ST_Distance, like any problem
> with complexity O(n2) it will be slow.

Yes. I suppose a single query will be O(N), but the overall complexity of 
doing N points would be O(N^2).

> 
> *If* you know a maximum distance within which you can always find N
> other features, then, as a previous poster mentioned, the ST_DWithin
> function speeds things up considerably. (It uses a quick bounding box
> check in advance of the actual intersection).

OK.

> 
> But, Suppose you want to take advantage of DWithin, without knowing the
> maximum distance? How about passing a starting distance, and checking if
> N features are found. If not then double the search distance, and loop
> along like this till you find the maximum distance needed. Once found,
> run the query using DWithin with this new max distance to find those
> closest features.
> 
> Attached is a set of functions I put together to test this idea. I made
> a scratch point layer with 500000 points, then did "explain analyze" on
> the three functions implementing each of the three above options. Here
> are the results:
> 
> 1) First, just find 10 closed points to point # 500 using ST_Distance:
> geodata=> explain analyze select * from distances_naive(10,500);
>                                                         QUERY PLAN
> ---------------------------------------------------------------------------
> ---------------------------------------------- Function Scan on
> distances_naive  (cost=0.00..260.00 rows=1000
> width=16) (actual time=766.631..766.642 rows=10 loops=1)
>   Total runtime: 766.682 ms
> (2 rows)
> 
> 2) Now assuming I know a maximum distance (1000 meters in this case)
> within which I can always find 10 points:
> geodata=> explain analyze select * from distances_within(10,500,1000);
>                                                          QUERY PLAN
> ---------------------------------------------------------------------------
> ----------------------------------------------- Function Scan on
> distances_within  (cost=0.00..260.00 rows=1000
> width=16) (actual time=115.386..115.397 rows=10 loops=1)
>   Total runtime: 115.437 ms
> (2 rows)
> Much faster!
> 
> 
> 3) Next the function which doubles a starting search distance until the
> N features are found. I tested with a starting distance of 100 m:
> geodata=> explain analyze select * from distances_increment(10,500,100);
>                                                           QUERY PLAN
> ---------------------------------------------------------------------------
> -------------------------------------------------- Function Scan on
> distances_increment  (cost=0.00..260.00 rows=1000 width=16) (actual
> time=257.399..257.410 rows=10 loops=1)
>   Total runtime: 257.451 ms
> (2 rows)
> Not as fast as the straight-forward DWIthin, but still 1/3 the time of
> the "naive" ST_Distance only function!
> 

Hmm.... we may opt to make use of it. Thanks!

Regards,

	Shlomi Fish

> Best Regards,
> Micha
> 

-- 
-----------------------------------------------------------------
Shlomi Fish       http://www.shlomifish.org/
Best Introductory Programming Language - http://shlom.in/intro-lang

Chuck Norris can make the statement "This statement is false" a true one.

Please reply to list if it's a mailing list post - http://shlom.in/reply .



More information about the postgis-users mailing list