[postgis-users] How/where does postgis hook a callback to free cached geos structures?

Mark Cave-Ayland mark.cave-ayland at ilande.co.uk
Sun Apr 21 13:37:13 PDT 2013


On 21/04/13 20:03, Stephen Woodbridge wrote:

> Hi Mark,
>
> I'm trying to rewrite the wrappers for the pagc address standardizer
> such that I can create and cache the standardizer obj in a per query
> cache. I think the following code modeled after GetGeomCache will do
> what I need. The problem I'm having is that I need to somehow hook the
> query shutdown code with a callback that will allow me to free the
> standardizer.
>
>
> void FreeStdCache(StdCache * cache)
> {
>     // free the cached objects
> }
>
> StdCache *GetStdCache(FunctionCallInfoData *fcinfo)
> {
>     MemoryContext old_context;
>     StdCache *cache = fcinfo->flinfo->fn_extra;
>     if (! cache) {
>         old_context = MemoryContextSwitchTo(fcinfo->flinfo->fn_mcxt);
>         cache = palloc(sizeof(StdCache));
>         MemoryContextSwitchTo(old_context);
>         cache->std = std_init();
>         fcinfo->flinfo->fn_extra = cache;
>
>         // ########## not sure how to do the following #############
>         ExprContext *econtext = ?????;
>         RegisterExprContextCallback(econtext, FreeStdCache, cache);
>     }
>     return cache;
> }
>
> So my function is not an SRF. I would get called like:
>
> select * from standardize_address(
> 'lexicon', 'gazeteer', 'rules',
> '123 main st', 'boston ma 02001');
>
> as a single request where we would construct the standardizer and then
> free it. But in a query like the following, we would construct it, cache
> it for each record, and free it when the query shuts down.
>
> select (std).* from (
> select standardize_address('lexicon', 'gazeteer', 'rules', micro, macro)
> as std from table_to_standardize) as foo;
>
>
> I'm not sure if I can use RegisterExprContextCallback() to do this or if
> there is a better way. I'm also not sure how to get econtext; I think
> that might only be available for SRF functions.
>
> I saw Mark's inquiry here:
>
> http://postgresql.1045698.n5.nabble.com/Any-advice-about-function-caching-td1936551.html
>
>
> but could not find the code that registers the callback in postgis.
>
> Here is a similar post:
>
> http://web.archiveorange.com/archive/v/alpsnw9p7b8CWMh7hBPj
>
> But neither has an example of how the issue was resolved. So a little
> help or a pointer would be appreciated.
>
> Thanks,
> -Steve

Hi Steve,

The way I solved this in the end for PROJ.4 was to create my own type of 
PostgreSQL MemoryContext - search for PROJ4SRSCacheContextMethods in 
libpgcommon/lwgeom_transform.c.

PostgreSQL has its own hierarchical memory allocator, much like Samba's 
talloc(). What this means is that all memory allocations are stored in a 
tree structure using a handle called a MemoryContext. When PostgreSQL 
destroys a MemoryContext, it first descends the tree and destroys all of 
the child MemoryContexts before destroying itself. The advantage of this 
is that destroying a top-level MemoryContext, such as a query-level 
MemoryContext, guarantees that all allocations made in its child 
MemoryContexts are freed as well, and hence the problem of leaking 
memory mostly disappears.

A MemoryContext has its own set of routines that are called upon 
creation and deletion. So what I did was create a custom memory context 
that doesn't really do anything, except that it contains code to release 
all its resources (see PROJ4SRSCacheDelete) as part of its 
destructor. This MemoryContext is then attached as a child of the 
current MemoryContext. Hence when the current MemoryContext is finally 
deleted by PostgreSQL, the destructor for the child MemoryContext is 
called *first*, which enables us to tidy up our outstanding external 
library references correctly before the cache information itself is 
destroyed.
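
To make the ordering concrete, here is a small toy model in plain C. It 
is not the PostgreSQL MemoryContext API and not the PostGIS code: 
ToyContext, toy_create, toy_delete, toy_log and release_external are 
hypothetical names made up for this sketch. It only illustrates the one 
property described above, namely that deleting a parent deletes its 
children first, so a child's delete hook can release an external 
resource before the parent goes away.

```c
/* Toy model of a hierarchical allocator. NOT the PostgreSQL API:
 * every name here is hypothetical and exists only for illustration. */
#include <assert.h>
#include <stdlib.h>
#include <string.h>

typedef struct ToyContext ToyContext;
typedef void (*ToyDeleteHook)(ToyContext *self);

struct ToyContext {
    const char   *name;
    ToyContext   *first_child;   /* head of this context's child list */
    ToyContext   *next_sibling;  /* next child of the same parent */
    ToyDeleteHook on_delete;     /* optional cleanup for external resources */
    void         *external;      /* e.g. a handle owned by another library */
};

/* Records the order in which contexts are deleted. */
char toy_log[256];

/* Create a context and, if a parent is given, link it in as a child. */
ToyContext *toy_create(ToyContext *parent, const char *name,
                       ToyDeleteHook hook, void *external)
{
    ToyContext *ctx = calloc(1, sizeof(ToyContext));
    if (ctx == NULL)
        abort();
    ctx->name = name;
    ctx->on_delete = hook;
    ctx->external = external;
    if (parent != NULL) {
        ctx->next_sibling = parent->first_child;
        parent->first_child = ctx;
    }
    return ctx;
}

/* Delete all children first, then run this context's hook, then free it.
 * Simplification: the context is not unlinked from its own parent, so
 * call this only on the top of the subtree being torn down. */
void toy_delete(ToyContext *ctx)
{
    ToyContext *child = ctx->first_child;
    while (child != NULL) {
        ToyContext *next = child->next_sibling;
        toy_delete(child);
        child = next;
    }
    if (ctx->on_delete != NULL)
        ctx->on_delete(ctx);
    strncat(toy_log, ctx->name, sizeof(toy_log) - strlen(toy_log) - 1);
    strncat(toy_log, ";", sizeof(toy_log) - strlen(toy_log) - 1);
    free(ctx);
}

/* Hook playing the role that PROJ4SRSCacheDelete plays in the text:
 * release whatever the external library handed us. */
void release_external(ToyContext *self)
{
    free(self->external);
    self->external = NULL;
}
```

Creating a "query" context with a "cache" child that uses 
release_external as its hook, and then calling toy_delete() on the 
parent, runs the child's hook and frees the child before the parent 
itself is freed.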

From memory, the PROJ.4 MemoryContext lives for the lifetime of a 
backend so you shouldn't see the destructor being called that often. If 
you want to use a similar trick for your standardizer, then take a look 
at the (disabled) GetPROJ4SRSCache code in the same file.

I believe that the fcinfo->flinfo->fn_mcxt MemoryContext for a 
PostgreSQL function has a lifetime for the duration of a single query 
(as it is used to store SRF-related state information). Therefore you 
should find that if you create your new MemoryContext as a child of that 
MemoryContext, you have something that not only lasts for the duration 
of a single query, but also behaves correctly in the case of error 
conditions such as aborting a query etc. Also note that the concept of 
the PostgreSQL SRF code (i.e. per-query state) is very similar to what 
you are trying to do here and so looking at that code is likely to 
provide a good source of inspiration.
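
As a rough illustration of the per-query pattern, the sketch below 
models it in plain C without any PostgreSQL headers. Everything in it 
is a hypothetical stand-in: ToyQuery plays the role of the per-query 
state (its fn_extra field mimics fcinfo->flinfo->fn_extra, and its 
cleanup list mimics child contexts hung off fn_mcxt), and ToyStdCache 
mimics the standardizer cache from the question. It is only meant to 
show the intended lifecycle: build once on first call, reuse on every 
later call, free exactly once when the query ends.

```c
/* Toy model of a per-query cache. NOT the PostgreSQL API:
 * every name here is hypothetical and exists only for illustration. */
#include <assert.h>
#include <stdlib.h>

typedef struct ToyCleanup {
    void (*fn)(void *arg);
    void *arg;
    struct ToyCleanup *next;
} ToyCleanup;

typedef struct ToyQuery {
    void       *fn_extra;  /* stands in for fcinfo->flinfo->fn_extra */
    ToyCleanup *cleanups;  /* stands in for child contexts of fn_mcxt */
} ToyQuery;

typedef struct ToyStdCache {
    int *std;              /* stands in for the external standardizer */
} ToyStdCache;

/* Counters so the lifecycle can be observed. */
int std_inits = 0;
int std_frees = 0;

/* Cleanup routine: release the external object, then the cache itself. */
void free_std_cache(void *arg)
{
    ToyStdCache *cache = arg;
    free(cache->std);
    free(cache);
    std_frees++;
}

/* Remember a cleanup routine to run when the query ends. */
void toy_register_cleanup(ToyQuery *q, void (*fn)(void *), void *arg)
{
    ToyCleanup *c = malloc(sizeof *c);
    if (c == NULL)
        abort();
    c->fn = fn;
    c->arg = arg;
    c->next = q->cleanups;
    q->cleanups = c;
}

/* Return the cached object, building and registering it on first use. */
ToyStdCache *get_std_cache(ToyQuery *q)
{
    assert(q != NULL);
    ToyStdCache *cache = q->fn_extra;
    if (cache == NULL) {
        cache = malloc(sizeof *cache);
        if (cache == NULL)
            abort();
        cache->std = malloc(sizeof *cache->std);
        if (cache->std == NULL)
            abort();
        std_inits++;
        toy_register_cleanup(q, free_std_cache, cache);
        q->fn_extra = cache;
    }
    return cache;
}

/* End of query: run every registered cleanup exactly once. */
void toy_query_end(ToyQuery *q)
{
    ToyCleanup *c = q->cleanups;
    while (c != NULL) {
        ToyCleanup *next = c->next;
        c->fn(c->arg);
        free(c);
        c = next;
    }
    q->cleanups = NULL;
    q->fn_extra = NULL;
}
```

Calling get_std_cache() once per row on the same ToyQuery returns the 
same pointer each time, and toy_query_end() is the only place where the 
cache is released, which is the behaviour a child MemoryContext of 
fn_mcxt is expected to give you in the real thing.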


HTH,

Mark.

P.S. If you are working on code which is dependent upon memory 
lifetimes, make sure that you build PostgreSQL with --enable-debug and 
--enable-cassert. This traps accidental accesses to already-freed memory 
and will save you a lot of time/head-scratching during development.

