[postgis-users] PostGIS 2.0 Wishlist

Tue Apr 24 10:36:25 PDT 2007

On 23 avr. 07, at 21:32, Paul Ramsey wrote:

> FYI, I'm working on the overall wishlist document at this URL:
>
> http://docs.google.com/Doc?id=dg99qr76_2dgt26j
> --  
>
>   Paul Ramsey
>   Refractions Research
>   http://www.refractions.net
>   pramsey at refractions.net
>   Phone: 250-383-3022
>   Cell: 250-885-0632

		Hello,

	I saw that you plan a nearest neighbor implementation for PostGIS  
2.0. Last year, two students worked for me on that subject, i've been  
thinking to give you the result (my company owns the copyright, and  
we don't plan to do any business with this stuff), but i wanted to do  
a cleanup, checking the code and do some improvement before posting  
it. But i didn't take the time, and be sure that i'm sorry.

	I've been talking about that last month on IRC, to see if someone  
were interested, someone answered positive, but i'm on a rush a work  
and i did not worried about that anymore.

	Anyway, the algorithm is the Hjaltason & Samet one [1], and as far a  
i tested it, it was pretty good. But the current implementation has  
many problems :
  - it's all commented in french, and even some parts of code are in  
french (damned students)
  - the implementation works only with BBOXes, and not the actual  
geometry. AFAIK it could be possible to use the distance() function,  
but it needs to plays with datatypes to find the accurate  
implementation.
  - the function is not integrated to postrgesql correctly, you  
cannot use it in query, only in one call, where you give tables  
names, etc as parameters and even some attributes name might be hard  
coded. Nothing critical, but dirty job to be done.
   - it has been written somewhat quickly, by novice programmers.

But, the good news are :
  - there has been a worry of quality, since the student took care to  
write it in VC1 by himself.
  - it has worked correctly without visible bug on a 100 000 rows  
table in a small amount of time (i don't remember the exact values of  
the benchmark).

For the ones who are interested, the algorithm used is a kNN one (k  
Nearest Neighbors), so that you can get one or several results,  
ordered by distance.
To sum up quickly, the general algorithm plays with the R-Tree, it  
explores it in depth (choosing the correct branches), and fill in an  
array while going back up. When the R-Tree structure has been  
explored enough to ensure that the returned list is effectively the  
kNN ones (obvious, but an R-Tree is a quite tricky structure, so, not  
so easy to ensure :) ) the algorithm finishes.

So, I don't know if anyone is really interested with that piece of  
code, but, just in case, because it may avoid some hours of hard  
work. Anyway if someone wants to use it, it's pretty sure that i'll  
give some help. I didn't do that by myself, but i'd be happy to work  
with someone.

Nicolas.

[1] not sure, but i think that it is this one : http:// 
citeseer.ist.psu.edu/hjaltason98incremental.html