<br>
<br><tt><font size=2>postgis-users-bounces@postgis.refractions.net wrote
on 07/02/2010 12:26:43 AM:<br>
<br>
> GeoHash doesn't cut it? It's space partitioned, just not as balanced<br>
> as a kd-tree, but for points... no? I'll look up my info on the GiST<br>
> work that was done, but note that it didn't make it into 9.0, but
into<br>
> 9.1, and as I recall Oleg and co did not seem to think the kd<br>
> implementation was going to be very effective.<br>
</font></tt>
<br><tt><font size=2>At this point I'm just considering solutions on the
conceptual level to identify promising candidates. GeoHash is promising.
However, my issue with GeoHash is that it linearizes a 2d problem, much
like space-filling curves. (i.e., you never know how far away a "close"
geohash-value is; conversely: identifying all the "near" geohashes
presents an issue). Besides: GeoHash is global and our study is regional.
2/3 or more of the hash values will be empty, while the others will be
chock full of points. Also, our region extends from 20-80 degrees N. GeoHashes
aren't a constant-distance metric, they're constant-angle. We're going
to have fat cells toward the south and skinny ones in the north.</font></tt>
<br>
<br><tt><font size=2>The most conceptually pleasing approach for this proximity-based
function would be to operate on the kd tree itself, minimizing the number
of searches. At most, a "proximity" search from a known point
in the tree may have to pop up a couple of levels (to check the neighbors),
and the _same_ proximity search could be re-used for all nodes under the
same subtree. Directly navigating a spatial index which retains the spatial
identity of the indexed points seems like the most efficient way to do
this. </font></tt>
<br>
<br><tt><font size=2>I guess in database terms, I'm talking about implementing
a new "cursor" which leverages a hierchical structure like the
kd-tree (to walk thru the points using the proximity information in the
tree). Simultaneously: implementing a caching proximity search on the tree
itself which can re-use previous searches from the same subtree. That way,
proximity searches from points under the same subtree needn't be repeated.</font></tt>
<br>
<br><tt><font size=2>This is perhaps not a topic for the users list. If
I flesh the idea out more, I'll use the wiki. We've been granted a reprieve
for the moment by selecting a "case study" which needs only 3
million out of 90 million points. However, we're eventually going to have
to suck it up and process everything. So we've got a bit of time to mull
it over; and/or review the literature for smart ways to use a kd-tree/other
scheme.</font></tt>
<br>
<br><tt><font size=2>Bryce </font></tt>