[postgis-devel] Re: [postgis-users] GIST index speed

Tom Lane tgl at sss.pgh.pa.us
Thu Jun 19 10:20:30 PDT 2008


Mark Cave-Ayland <mark.cave-ayland at siriusit.co.uk> writes:
> Okay. The reason I was thinking about Datum was because I was thinking 
> about trying to cache the results of the PostGIS -> GEOS transformation 
> for non-TOASTed as well as TOASTed objects. So effectively we'd need to 
> keep both the PostgreSQL geometry type and the GEOS-converted PostgreSQL 
> geometry type around where we can access them. With your suggestion, I'm 
> guessing that we would only be able to cache the PostGIS -> GEOS 
> transformation for TOASTed objects. Maybe that would be enough, although 
> it's one of those things that would be difficult to test without an 
> actual implementation :(

Yeah.  I don't really see any way to cache stuff for datums that aren't
toasted out-of-line, because there's no unique identifier assigned to
them that you could use for a cache lookup key.  On the other hand,
if the geometry isn't big enough to get toasted, it can't be all that
expensive to convert, so maybe such a solution is Good Enough.

Seems like you might also be able to solve it internally to PostGIS,
if you were to add some sort of reasonably unique identifier to the
internal representation of a geometry datum --- md5 of the rest of
the contents, say.  Then you could use that value as a cache key.

> Note that because geometries contain floating point numbers and are 
> multi-dimensional, the only real way to perform the join is using a 
> Nested Loop. Most of the slow queries I see are of this form; so it may 
> be that a cache simply for the duration of a single query would be 
> sufficient for the majority of cases.

Hmm.  What would happen with the TupleTableSlot implementation is that
a geometry coming from the outer side of the nestloop would get
detoasted only once, while those on the inner side would get detoasted
on each visit.  A backend-wide cache would do better, but only if it
were big enough to hold all the detoasted values from the inner relation
--- otherwise the values would fall out of cache before being visited
again, and you'd be no better off.

			regards, tom lane



More information about the postgis-devel mailing list