[postgis-devel] Cube Indexes

Mon Aug 14 09:23:39 PDT 2006

Thanks for the analysis Mark.  I figured there would be an overhead  
relative to the 2D indexes.  I was actually hoping it would be  
possible to set up the N-D indexes as an addition to, rather than a  
replacement of, the 2D indexes (since 95% of use cases only need 2D  
indexing, it seems a shame to toss them in favor of something 50%  
heavier).  If we capped the n-dimensionality at 4 (since that is the  
maximum dimensionality of PostGIS) would that 8-bytes of overhead be  
smaller?

P

On 14-Aug-06, at 4:12 AM, Mark Cave-Ayland wrote:

> On Fri, 2006-08-11 at 11:38 -0700, Paul Ramsey wrote:
>> What would it take to enable multi-dimensional indexes as an option
>> on postgis geometries?
>> contrib/cube already does it, can we?
>> P
>
>
> Hi Paul,
>
> As far as I can tell looking at the contrib/cube code, the way it  
> works
> is that instead of having a fixed size structure like BOX2DFLOAT4,  
> they
> have introduced an equivalent variable length structure called NDBOX
> (N-dimensional box) to store information in the index. While there are
> fewer operators in the contrib/cube opclass, operators such as overlap
> (&&) have been extended to use N dimensions with N-dimensional  
> distance
> being calculated using standard Pythagorean distances.
>
> To add this functionality into PostGIS you'd have to do something  
> along
> the following lines:
>
> 	* Create a new N-dimensional structure (BOXNDFLOAT4) to replace
> 	BOX2DFLOAT4
>
> 	* Alter existing operators to use N-dimensional maths; also
> 	think about routines such as Distance() etc.
>
> 	* Consider adding indexable operators for the 3rd dimension
> 	operators of in front / behind. I'm not sure about
> 	continually adding operators for more dimensions as the meaning
> 	tends to be lost as the number of dimensions increases, however
> 	contrib/cube supports the &&, @ and ~ operators for N
> 	dimensions.
>
> 	* Change the GiST index functions to use N-dimensional routines,
> 	in particular for union/split; consider the B-Tree index class
> 	for sorting order etc.
>
> 	* Alter the statistics collector to collect information about
> 	N-dimensions and the join/selectivity estimators to use this
> 	new information
>
>
> Using contrib/cube as a starting point would help, but it's still a  
> lot
> of work since just enhancing BOX2DFLOAT4 to larger dimensions would
> touch most places in the source tree. Also moving to a variable length
> structure will add an additional 8 bytes (two integers) to each index
> entry so the size of a standard 2D index entry would increase by 50%.
>
> In short: it could be done, but it would affect a large part of the  
> code
> tree and standard 2D indices would become 50% larger. As a consequence
> of changing a large part of the code tree, the testing during  
> coding is
> likely to take up a large proportion of the time, although the new
> regression tests should help with this somewhat.
>
>
> Kind regards,
>
> Mark.
>
>
> _______________________________________________
> postgis-devel mailing list
> postgis-devel at postgis.refractions.net
> http://postgis.refractions.net/mailman/listinfo/postgis-devel