[postgis-devel] Transform() PROJ4 cache patch

Mark Cave-Ayland m.cave-ayland at webbased.co.uk
Fri Oct 28 00:00:28 PDT 2005


Hi everyone,

After looking at the thread here
(http://postgis.refractions.net/pipermail/postgis-users/2005-October/009771.
html) regarding the slow performance of the transform() function, I've had a
go at implementing a caching version of transform() which attempts to
eliminate both the overhead of looking up entries in the spatial_ref_sys
table and creating/destroying reprojection objects with pj_init() and
pj_free().

The key point with maintaining a cache is knowing when to initialise and
destroy the cache to ensure that any entries are invalidated after the
current query. After reading about the hierarchical memory contexts used by
PostgreSQL in src/backend/utils/mmgr/README, it occurred to me that it could
be possible to create a child memory context for MessageContext that would
receive notifications upon the start and end of each query which could make
it possible to implement a cache suitable for an IMMUTABLE function.

While implementing an entire MemoryContext implementation seemed a lot of
work, I decided to simply use the Init() and Delete() functions as
notifications, and store the cached items within each individual backend. 

Whenever transform() is called, the patch checks to see if our child memory
context exists; if it doesn't then the cache memory context is added as a
child to MessageContext and the cache initialised for this query. Both the
input (geometry) and output (constant) proj4texts are looked up in the cache
to see if they exist; if they do then it is possible to return a PJ *
pointer ready for use immediately. Otherwise the code uses SPI to lookup the
proj4text from spatial_ref_sys, calls make_project(), and adds the PJ *
pointer to the cache using the srid as the key.

When the current query has finished processing, the Delete() method of the
cache memory context is automatically called which first pj_free()s any
existing PJ * pointers followed by the cache itself before the context is
deleted.


Performance here with PostgreSQL 8.0/PostGIS 1.0.3 is very good:

BEFORE

infomapper_dev=# \timing
Timing is on.
infomapper_dev=# select count(*) from osgb_point;
 count
--------
 425588
(1 row)

Time: 723.873 ms
infomapper_dev=# select count(transform(geom, 4326)) from osgb_point;
 count
--------
 425582
(1 row)

Time: 115331.752 ms
infomapper_dev=# select count(transform_geometry(geom, '+proj=tmerc
+lat_0=49 +lon_0=-2 +k=0.999601 +x_0=400000 +y_0=-100000 +ellps=airy
+units=m +no_defs'::text, '+proj=longlat +ellps=WGS84 +datum=WGS84
+no_defs'::text, 4326)) from osgb_point;
 count
--------
 425582
(1 row)

Time: 30944.905 ms


AFTER TRANSFORM CACHE PATCH APPLIED

infomapper_dev=# select count(transform(geom, 4326)) from osgb_point;
 count
--------
 425582
(1 row)

Time: 5564.839 ms


I haven't applied this to CVS as this is my first effort delving into
MemoryContexts and I would consider it highly experimental. I would
therefore be grateful if people experiencing the slow transform() problem
could test and post any results back to the list.


Kind regards,

Mark.

------------------------
WebBased Ltd
17 Research Way
Plymouth
PL6 8BT

T: +44 (0)1752 797131
F: +44 (0)1752 791023

http://www.webbased.co.uk   
http://www.infomapper.com 
http://www.swtc.co.uk  

This email and any attachments are confidential to the intended recipient
and may also be privileged. If you are not the intended recipient please
delete it from your system and notify the sender. You should not copy it or
use it for any purpose nor disclose or distribute its contents to any other
person.
-------------- next part --------------
A non-text attachment was scrubbed...
Name: transform_cache_postgis-1.0.4.patch
Type: application/octet-stream
Size: 13361 bytes
Desc: not available
URL: <http://lists.osgeo.org/pipermail/postgis-devel/attachments/20051028/3f3b3a06/attachment.obj>


More information about the postgis-devel mailing list