[postgis-devel] Alignment Revision

Mark Cave-Ayland mark.cave-ayland at siriusit.co.uk
Wed Jan 21 03:37:53 PST 2009


strk wrote:

>> In order to get the LWGEOM in/out of PostgreSQL we need to 
>> serialize/deserialize the geometry. So what we are handed by PostgreSQL 
>> is actually a *SERIALIZED* LWGEOM and so we must deserialize it into a 
>> LWGEOM structure that we can actually do something useful with.
> 
> Correct. We deserialize for convenience, but do not copy actual vertex
> data (doubles).

Excellent - so I understood this part ;)

>> The most 
>> interesting part about this is that during the opposite process of 
>> serialization, we have to iterate through any point arrays within a 
>> LWGEOM anyway in order to memcpy() them into serialized form ready to 
>> return to PostgreSQL.
> 
> We don't memcpy each value here (at least we do not need to) but
> rather memcpy the whole pointarray data. No copying is required
> if the function implements a read-only operation.

Yup. Only functions that return (modified) geometries require the 
overhead of serialization.

>> Copying the minimal amount of information from the head of a serialized 
>> LWGEOM to a LWGEOM will be extremely quick, and so I'm not worried about 
>> this. Therefore the key is to make the coordinate arrays double-aligned 
>> within the PostgreSQL Datum so that during the deserialization process, 
>> we can just point the POINTARRAY straight into the Datum - which is what 
>> already happens.
> 
> To rephrase, the key is to make the on-disk coordinate arrays
> double-aligned so that during inspection we can just cast every
> ordinate pointer to a double without incurring either a performance
> penalty (x86) or a misread (on architectures that do not allow reading
> a double from a non-double-aligned memory address).
> Correct?

Yes. Please see my original post and test harness here:
http://lists.refractions.net/pipermail/postgis-devel/2009-January/004473.html
Even with a memcpy() from unaligned to aligned space, the speed 
difference is quite impressive.
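
To make the two access paths concrete, here is a minimal sketch (not the 
testmem.c harness itself, and not liblwgeom code). The function names are 
made up for illustration, and it assumes a C11 compiler for _Alignof. One 
routine copies each ordinate out of a possibly unaligned buffer with 
memcpy(); the other reads the doubles in place, which is only safe when 
the buffer is double-aligned.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: non-zero if p is suitably aligned for a double. */
int is_double_aligned(const void *p)
{
    return ((uintptr_t)p % _Alignof(double)) == 0;
}

/* Safe on any architecture: copy each ordinate into an aligned local
 * before using it. buf may start at any byte offset. */
double sum_via_memcpy(const unsigned char *buf, size_t n)
{
    double total = 0.0, d;
    size_t i;

    for (i = 0; i < n; i++) {
        memcpy(&d, buf + i * sizeof(double), sizeof(double));
        total += d;
    }
    return total;
}

/* Only valid when buf is double-aligned: read the ordinates in place,
 * with no intermediate copy. */
double sum_in_place(const void *buf, size_t n)
{
    const double *d = buf;
    double total = 0.0;
    size_t i;

    assert(is_double_aligned(buf));
    for (i = 0; i < n; i++)
        total += d[i];
    return total;
}
```

Both routines return the same sum for the same ordinates; the difference 
is only in what they require of the buffer's address.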

>> Hence all we have to do is:
>>
>> 	- Ensure the double arrays are aligned within the Datum
>> 	- Rewrite any accessor methods to iterate through the
>> 	array pointer directly, rather than through
>> 	getPoint_internal
> 
> You mean rather than through getPoint_p here, as getPoint_internal
> is what returns a pointer into the POINTARRAY, which may or may not be
> aligned depending on where the POINTARRAY comes from.
> If you allocated the POINTARRAY yourself it will be aligned; if
> it came from the PostgreSQL Datum, it may or may not be, depending
> on how it was written.

Yes.
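
As a rough illustration of the two accessor styles, here is a sketch using 
a simplified, hypothetical point array struct (sketch_pointarray is not the 
real POINTARRAY, and min_x_copy/min_x_direct are invented names). The first 
function copies every point into a local buffer before reading it, in the 
spirit of getPoint_p; the second walks the coordinate array through a 
double pointer and so assumes the data is double-aligned.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a point array: raw bytes holding
 * npoints * ndims doubles. Not the liblwgeom definition. */
typedef struct {
    const unsigned char *data;
    size_t npoints;
    size_t ndims;
} sketch_pointarray;

/* Copy-based access: works whether or not data is aligned. */
double min_x_copy(const sketch_pointarray *pa)
{
    double pt[4], min;
    size_t i, ptsize = pa->ndims * sizeof(double);

    assert(pa->npoints > 0 && pa->ndims >= 2 && pa->ndims <= 4);
    memcpy(pt, pa->data, ptsize);
    min = pt[0];
    for (i = 1; i < pa->npoints; i++) {
        memcpy(pt, pa->data + i * ptsize, ptsize);
        if (pt[0] < min)
            min = pt[0];
    }
    return min;
}

/* Direct access: iterate through the array pointer itself.
 * Assumes pa->data is double-aligned. */
double min_x_direct(const sketch_pointarray *pa)
{
    const double *d = (const double *)(const void *)pa->data;
    double min;
    size_t i;

    assert(pa->npoints > 0 && pa->ndims >= 2);
    min = d[0];
    for (i = 1; i < pa->npoints; i++) {
        if (d[i * pa->ndims] < min)
            min = d[i * pa->ndims];
    }
    return min;
}
```

The direct version is only usable once the first item in the list above 
holds, i.e. the double arrays are known to be aligned within the Datum.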

> This is an interesting thread as I proceed with defining memory structures
> to use for holding rasters, as in that case we'll have different
> alignment constraints based on pixeltype (from 1 to 8 bytes per pixel,
> so requiring different padding).

It may be worth you playing with the testmem.c harness in the above 
email to see whether it has a similar performance effect when accessing 
rasters.
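
For the padding side of this, a small sketch of rounding an offset up to 
a pixel-size boundary. It assumes the alignment required equals the pixel 
size and that this is a power of two (1, 2, 4 or 8 bytes); the function 
name is invented, and nothing here reflects an actual raster layout.

```c
#include <assert.h>
#include <stddef.h>

/* Round offset up to the next multiple of align.
 * align must be a non-zero power of two. */
size_t pad_to_alignment(size_t offset, size_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return (offset + align - 1) & ~(align - 1);
}
```

So pixel data following a 5-byte header would start at offset 5 for 
1-byte pixels, 6 for 2-byte pixels, and 8 for 4- or 8-byte pixels.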


ATB,

Mark.

-- 
Mark Cave-Ayland
Sirius Corporation - The Open Source Experts
http://www.siriusit.co.uk
T: +44 870 608 0063


