[postgis-devel] Cube Indexes
Mark Cave-Ayland
mark.cave-ayland at ilande.co.uk
Mon Aug 14 04:12:50 PDT 2006
On Fri, 2006-08-11 at 11:38 -0700, Paul Ramsey wrote:
> What would it take to enable multi-dimensional indexes as an option
> on postgis geometries?
> contrib/cube already does it, can we?
> P
Hi Paul,
As far as I can tell looking at the contrib/cube code, the way it works
is that instead of having a fixed size structure like BOX2DFLOAT4, they
have introduced an equivalent variable length structure called NDBOX
(N-dimensional box) to store information in the index. While there are
fewer operators in the contrib/cube opclass, operators such as overlap
(&&) have been extended to use N dimensions with N-dimensional distance
being calculated using standard Pythagorean distances.
To add this functionality into PostGIS you'd have to do something along
the following lines:
* Create a new N-dimensional structure (BOXNDFLOAT4) to replace
BOX2DFLOAT4
* Alter existing operators to use N-dimensional maths; also
think about routines such as Distance() etc.
* Consider adding indexable operators for the 3rd dimension
operators of in front / behind. I'm not sure about
continually adding operators for more dimensions as the meaning
tends to be lost as the number of dimensions increases, however
contrib/cube supports the &&, @ and ~ operators for N
dimensions.
* Change the GiST index functions to use N-dimensional routines,
in particular for union/split; consider the B-Tree index class
for sorting order etc.
* Alter the statistics collector to collect information about
N-dimensions and the join/selectivity estimators to use this
new information
Using contrib/cube as a starting point would help, but it's still a lot
of work since just enhancing BOX2DFLOAT4 to larger dimensions would
touch most places in the source tree. Also moving to a variable length
structure will add an additional 8 bytes (two integers) to each index
entry so the size of a standard 2D index entry would increase by 50%.
In short: it could be done, but it would affect a large part of the code
tree and standard 2D indices would become 50% larger. As a consequence
of changing a large part of the code tree, the testing during coding is
likely to take up a large proportion of the time, although the new
regression tests should help with this somewhat.
Kind regards,
Mark.
More information about the postgis-devel
mailing list