[postgis-users] High Concurrency R* in GiST?

Mon Dec 5 13:09:45 PST 2011

On Mon, Dec 5, 2011 at 11:05 AM, C. Mundi <cmundi at gmail.com> wrote:

> I get the impression that GiST hides a lot of
> implementation details.  So I am hungry for details which will help me
> assess postGIS/postgreSQL for my application.

That is the key point, and you have it right: the physical implementation
details are hidden behind the GiST API. As a result the R-Tree
implementation is a "standard" one, not an R* (though the split method
is Ang/Tan, not Guttman). And as a result you can't do things like
rebalance the tree as specified in the R* recipe. The GiST API really
is quite narrow: the consistent function controls reads, and the
compress and picksplit functions control writes.
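
To make the division of labour concrete, here is a rough sketch in
Python of what those three hooks do for 2D bounding boxes. It is not
the PostGIS code (the real functions are written in C against the
GiST interface). The Box class, the union helper, and the halving
split are assumptions made up for illustration, and the split here is
deliberately naive rather than the Ang/Tan method.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Box:
    """Hypothetical index key: an axis-aligned bounding box."""
    xmin: float
    ymin: float
    xmax: float
    ymax: float


def compress(points: List[Tuple[float, float]]) -> Box:
    """Write side: reduce a geometry (here, its vertices) to the key stored in the index."""
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    return Box(min(xs), min(ys), max(xs), max(ys))


def union(boxes: List[Box]) -> Box:
    """Helper: smallest box covering all inputs, i.e. the key a parent page would hold."""
    return Box(min(b.xmin for b in boxes), min(b.ymin for b in boxes),
               max(b.xmax for b in boxes), max(b.ymax for b in boxes))


def consistent(key: Box, query: Box) -> bool:
    """Read side: could anything under `key` overlap `query`? False prunes the subtree."""
    return (key.xmin <= query.xmax and query.xmin <= key.xmax and
            key.ymin <= query.ymax and query.ymin <= key.ymax)


def picksplit(entries: List[Box]) -> Tuple[List[Box], List[Box]]:
    """Write side: divide the entries of an overflowing page into two groups.

    Naive stand-in: order by centre along the wider axis and cut in half.
    """
    cover = union(entries)
    if cover.xmax - cover.xmin >= cover.ymax - cover.ymin:
        centre = lambda b: b.xmin + b.xmax
    else:
        centre = lambda b: b.ymin + b.ymax
    ordered = sorted(entries, key=centre)
    mid = len(ordered) // 2
    return ordered[:mid], ordered[mid:]


# Example: split a page of four boxes, then test a query against each half.
page = [Box(0, 0, 1, 1), Box(10, 0, 11, 1), Box(1, 1, 2, 2), Box(11, 1, 12, 2)]
left, right = picksplit(page)
query = Box(1.5, 1.5, 3, 3)
search_left = consistent(union(left), query)
search_right = consistent(union(right), query)
```

Everything else (page layout, locking, WAL, tree descent) belongs to
the GiST core and is out of reach of these functions, which is why an
R*-style rebalance has nowhere to live.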

So if you're looking for optimal tree construction, you've come to the
wrong place. The primary benefit of the PostGIS indexing system is not
its optimal nature but its existence: it's already here, you can
insert and query data with simple SQL, and it handles locking and
consistent operations thanks to the PostgreSQL infrastructure wrapped
around it.

As an architect my recommendation would be: since the development
overhead in building your system from scratch will be quite high,
investing the time into a load test on PostGIS first could save you a
lot of time if it turns out that even our imperfect system is actually
good enough to meet your needs.
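
A load test along those lines does not need much. As a starting
point, here is a small Python sketch that only generates SQL text for
a mixed insert/query workload; it does not connect to a database. The
table name load_test, the box size, the SRID, and the write ratio are
all placeholders I made up, so substitute values that resemble your
real data and access pattern.

```python
import random
from typing import List

# Hypothetical schema: one geometry column with a GiST index on it.
SETUP = [
    "CREATE TABLE load_test (id serial PRIMARY KEY, geom geometry);",
    "CREATE INDEX load_test_geom_idx ON load_test USING GIST (geom);",
]


def random_envelope(rng: random.Random, size: float = 0.01) -> str:
    """SQL expression for a small random box in lon/lat space (SRID 4326 assumed)."""
    x = rng.uniform(-180.0, 180.0 - size)
    y = rng.uniform(-90.0, 90.0 - size)
    return (f"ST_MakeEnvelope({x:.6f}, {y:.6f}, "
            f"{x + size:.6f}, {y + size:.6f}, 4326)")


def build_workload(n: int, write_ratio: float = 0.5, seed: int = 0) -> List[str]:
    """Return n statements, mixing index inserts with bounding-box searches."""
    rng = random.Random(seed)  # seeded so a run can be repeated exactly
    statements = []
    for _ in range(n):
        env = random_envelope(rng)
        if rng.random() < write_ratio:
            statements.append(f"INSERT INTO load_test (geom) VALUES ({env});")
        else:
            # && is the bounding-box overlap operator, which can use the GiST index
            statements.append(
                f"SELECT count(*) FROM load_test WHERE geom && {env};")
    return statements


# Example: one workload per simulated client, each with its own seed.
workloads = [build_workload(1000, write_ratio=0.5, seed=s) for s in range(4)]
```

Write each workload to its own file and replay the files from
separate concurrent sessions while timing them. That should show
fairly quickly whether the locking behaviour under your mix of reads
and writes is acceptable.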

Best,

P.