[postgis-devel] Prepared Geometry API

Mon Oct 6 10:29:46 PDT 2008

Here's the cache struct:

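typedef struct
{
        int32                         key1;           /* cache key for the first argument */
        int32                         key2;           /* cache key for the second argument */
        int32                         argnum;         /* which argument is cached: first or second */
        const GEOSPreparedGeometry*   prepared_geom;  /* the single cached prepared geometry */
        const GEOSGeometry*           geom;           /* original geometry, kept for cleanup */
        MemoryContext                 context;        /* PostgreSQL memory context tied to the cache */
} PrepGeomCache;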

Our "cache" actually consists of only one prepared geometry. We keep a
reference to the original geometry around too, so we can clean it up
properly with the prepared geometry.

We have two keys now, one for each argument, though everything except
Intersects uses just one key. The argnum just tells us which argument
we are caching, the first or second (this allows us to deal with the
intersects case where the first argument is rapidly changing and the
second is static).

Each time _ST_IntersectsPrepared is called, we pull out the cache
object and check cache key1 against function key1 and cache key2
against function key2. If key1 is the same as last time, we use the
prepared geometry we have handy. If not (and key2 is also different),
we tear down our existing cached geometries and reset the cache
information.
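
A rough sketch of that check, to make the flow concrete. This is not the
actual PostGIS code: SketchCache, sketch_cache_lookup, has_keys and builds
are hypothetical names, the GEOS pointers and MemoryContext are replaced by
a counter so it compiles on its own, and the rule "prepare an argument once
its key repeats" is an assumption about how argnum gets chosen.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for PrepGeomCache: no GEOS or PostgreSQL types, just the keys. */
typedef struct
{
    int32_t key1;      /* key seen for the first argument */
    int32_t key2;      /* key seen for the second argument */
    int32_t argnum;    /* 0 = nothing prepared, 1 or 2 = prepared argument */
    int     has_keys;  /* hypothetical: nonzero once keys were recorded */
    int     builds;    /* hypothetical: how many times we "prepared" */
} SketchCache;

/*
 * Returns 1 or 2 when the prepared geometry for that argument can be used,
 * or 0 when the cache was torn down and reset with the new keys.
 */
int sketch_cache_lookup(SketchCache *cache, int32_t key1, int32_t key2)
{
    int repeated = 0;

    assert(cache != NULL);

    /* Hit: the key of the prepared argument is the same as last time. */
    if (cache->argnum == 1 && cache->key1 == key1)
        return 1;
    if (cache->argnum == 2 && cache->key2 == key2)
        return 2;

    /* Assumption: nothing prepared yet, so prepare whichever key repeats. */
    if (cache->argnum == 0 && cache->has_keys)
    {
        if (cache->key1 == key1)
            repeated = 1;
        else if (cache->key2 == key2)
            repeated = 2;

        if (repeated != 0)
        {
            cache->key1 = key1;
            cache->key2 = key2;
            cache->argnum = repeated;
            cache->builds++;   /* the real code would build the prepared geometry here */
            return repeated;
        }
    }

    /* Miss: tear down (the real code would free the geometries) and reset. */
    cache->key1 = key1;
    cache->key2 = key2;
    cache->argnum = 0;
    cache->has_keys = 1;
    return 0;
}
```

Starting from a zeroed SketchCache, calling with (10, 20), (10, 21),
(10, 22) resets once and then reuses argument 1; a later run of (11, 22),
(12, 22), (13, 22) resets and then settles on argument 2, which is the
"first argument changing, second static" intersects case.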

> 1) So you have a cache already or is that if you were to go with this ID
> thing? I thought you already have prepared working so we have caching
> happening already no?
> 2) If you have a cache already, are we caching both the left argument and
> the right argument now or just one?
> 3) You pass the id in the first time it sees the id, you cache the geometry
> using the id as a key to pull it out later.
> 4) Second time you see the id, you just pull it from the cache? (I still
> think you need an index by the way Mark to make this efficient for large
> numbers of large geoms to efficiently pull out of the cache, but that's
> beside the point I guess)
> 5) You are making Intersects and contains and all that other stuff order
> dependent (e.g. now I have to pass in the big geometry first or second and
> the id of the big geometry)?

1) we have a cache. we are caching prepared GEOS geometries
2) we are caching the geometry we will need to handle the next
function invocation, in the case that the next invocation uses the
same geometry as the previous one
3) we only have one entry in our "cache". the id is not for retrieval,
it's to tell us if the geometry coming in is the same as the one we
are holding, without a more expensive test of equality
4) only one entry in the cache. the expense is in figuring out "is
this new geometry the same as the one I have cached" and Mark's
proposition is that memcmp will be fast enough, while my proposition
is that it will always be slower than an id check.
5) contains, within, covers, coveredby are all already order
dependent. intersects is not, and I have handled that case now. the
original code did have an assumption of order in intersects.
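
To make the two options in point 4 concrete, here is a small sketch of
both "is this the same geometry" tests. SketchGeom, same_by_id and
same_by_memcmp are hypothetical names, and the geometry is assumed to be a
plain serialized byte buffer; neither function is taken from PostGIS.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical serialized geometry: a byte buffer and its length. */
typedef struct
{
    const unsigned char *data;
    size_t               size;
} SketchGeom;

/* Id check: a single integer comparison, whatever the geometry size. */
int same_by_id(int32_t cached_key, int32_t incoming_key)
{
    return cached_key == incoming_key;
}

/*
 * memcmp check: sizes are compared first, then the bytes. When the two
 * geometries really are the same, memcmp has to read the whole buffer.
 */
int same_by_memcmp(const SketchGeom *cached, const SketchGeom *incoming)
{
    if (cached->size != incoming->size)
        return 0;
    return memcmp(cached->data, incoming->data, cached->size) == 0;
}
```

The id check costs the same for a point as for a huge polygon, while the
memcmp cost on a cache hit grows with the size of the geometry. Whether
that difference matters in practice is the open question between the two
propositions.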

Paul