[postgis-devel] GiST -- making my index faster makes is slower

Fri Apr 16 11:57:08 PDT 2004

I just tried an experiment that I thought would improve the performance 
of the PostGIS spatial index.

The old way used a BOX (bounding box of 4 doubles) as the key in the index.

The new way uses a bounding box of 4 single-precision floats (its only 
16 bytes long - a BOX is 32).  I thought that it would significantly 
reduce the size of the index (it did) and that the indexing would be 
faster since there's less disk pages to fiddle through and these pages 
would be more likely to fit in the disk cache.

It turns out this is not the case - its significantly slower.  I think 
it to do with more keys fitting in the leaf tuples.

As far as I can tell, the GiST index looks through the internal nodes, 
then when it hits a leaf node, it search through each of the keys in the 
leaf.

I save time searching through the internal nodes (because there's less 
of them) - this is O(log n) saving.

I lose time searching through the leafs (because there's more keys in a 
leaf) - this is O(n) cost.

For a query that does a nested loop of 10,000 index scans (on the same 
table), the old method takes about 2 second, the new "faster" method 
takes about 20 seconds.

I'm still trying to verify that this is the actual reason why its slower 
- anyone have any other ideas?  Anyone know a good way to profile this?

dave