[postgis-users] update / closest polygon

Fri Mar 3 14:35:33 PST 2006

--- Paul Ramsey <pramsey at refractions.net> wrote:

> "Nearest neighbor" is a whole wonderful special case to optimize, and  
> we have never actually done the general optimization work in  
> PostGIS.  That's because the nearest-neighbor "hack" works most of  
> the time.

This approach worked OK for me a while back where <arbitrary distance> was
calculated on the fly from the data using a subquery. (I don't have access to
Postgis here, so I'm running from memory, but it was along these lines).

The search extent was set by getting the min dist of the first 1000 records, so
once that has been found, which was reasonably quick in my case, the &&
(spatial index) is filtering out around 99.9% of the records, so the distance
is only being calculated & ordered for 0.1%. I had a few million points, & 1000
seemed to give a large enough sample to get an effective filter. You can easily
play with this number to get the best tradeoff between to large & too small a
value.

 select a.id,
        a.distance 
 from a,
      b
 where a && expand(b, select min(select distance(a.geom,b.geom) 
                                 from a,b
                                 <?where?>
                                 limit 1000
                                 )
                   )
 <?where?>
 order by distance limit 1;

I was also going to try a pre-built lookup table :

insert into t_dist_lookup
select a.id, 
       b.id,
       distance(a,b)
from a,
     b;

Then index the columns, clustered on distance, & use a lookup on this table of
pre-calculated distances when needed. I never got around to trying it though.
The other one worked well enough at the time.

Brent Wood