<br>

<br><tt><font size=2>postgis-users-bounces@postgis.refractions.net wrote

on 07/02/2010 12:26:43 AM:<br>

<br>

> GeoHash doesn't cut it? It's space partitioned, just not as balanced<br>

> as a kd-tree, but for points... no? I'll look up my info on the GiST<br>

> work that was done, but note that it didn't make it into 9.0, but

into<br>

> 9.1, and as I recall Oleg and co did not seem to think the kd<br>

> implementation was going to be very effective.<br>

</font></tt>

<br><tt><font size=2>At this point I'm just considering solutions on the

conceptual level to identify promising candidates. GeoHash is promising.

However, my issue with GeoHash is that it linearizes a 2d problem, much

like space-filling curves. (i.e., you never know how far away a "close"

geohash-value is; conversely: identifying all the "near" geohashes

presents an issue). Besides: GeoHash is global and our study is regional.

2/3 or more of the hash values will be empty, while the others will be

chock full of points. Also, our region extends from 20-80 degrees N. GeoHashes

aren't a constant-distance metric, they're constant-angle. We're going

to have fat cells toward the south and skinny ones in the north.</font></tt>

<br>

<br><tt><font size=2>The most conceptually pleasing approach for this proximity-based

function would be to operate on the kd tree itself, minimizing the number

of searches. At most, a "proximity" search from a known point

in the tree may have to pop up a couple of levels (to check the neighbors),

and the _same_ proximity search could be re-used for all nodes under the

same subtree. Directly navigating a spatial index which retains the spatial

identity of the indexed points seems like the most efficient way to do

this. </font></tt>

<br>

<br><tt><font size=2>I guess in database terms, I'm talking about implementing

a new "cursor" which leverages a hierchical structure like the

kd-tree (to walk thru the points using the proximity information in the

tree). Simultaneously: implementing a caching proximity search on the tree

itself which can re-use previous searches from the same subtree. That way,

proximity searches from points under the same subtree needn't be repeated.</font></tt>

<br>

<br><tt><font size=2>This is perhaps not a topic for the users list. If

I flesh the idea out more, I'll use the wiki. We've been granted a reprieve

for the moment by selecting a "case study" which needs only 3

million out of 90 million points. However, we're eventually going to have

to suck it up and process everything. So we've got a bit of time to mull

it over; and/or review the literature for smart ways to use a kd-tree/other

scheme.</font></tt>

<br>

<br><tt><font size=2>Bryce </font></tt>