[postgis-devel] Alignment Revision
Mark Cave-Ayland
mark.cave-ayland at siriusit.co.uk
Wed Jan 21 03:37:53 PST 2009
strk wrote:
>> In order to get the LWGEOM in/out of PostgreSQL we need to
>> serialize/deserialize the geometry. So what we are handed by PostgreSQL
>> is actually a *SERIALIZED* LWGEOM and so we must deserialize it into a
>> LWGEOM structure that we can actually do something useful with.
>
> Correct. We deserialize for convenience, but do not copy actual vertex
> data (doubles).
Excellent - so I understood this part ;)
>> The most
>> interesting part about this is that during the opposite process of
>> serialization, we have to iterate through any point arrays within a
>> LWGEOM anyway in order to memcpy() them into serialized form ready to
>> return to PostgreSQL.
>
> We don't memcpy each value here (at least we do not need to) but
> rather memcpy the whole pointarray data. No copying is required
> if the function implements a read-only operation.
Yup. Only functions that return (modified) geometries require the
overhead of serialization.
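
A minimal sketch of the two directions discussed above, for illustration
only. The point_view struct, the 8-byte header layout and the function
names are all hypothetical and are not the real LWGEOM/POINTARRAY
definitions. Deserializing copies only the small header and borrows the
coordinate bytes; serializing moves the whole coordinate block with a
single memcpy().

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for a POINTARRAY: it borrows the coordinate
 * bytes from the serialized buffer instead of owning a copy. */
typedef struct {
    const unsigned char *data; /* points into the serialized buffer */
    uint32_t npoints;
    uint32_t ndims;
} point_view;

/* "Deserialize": copy the small header, then point at the coordinates.
 * Assumed layout: [npoints:4][ndims:4][npoints * ndims doubles]. */
static point_view view_points(const unsigned char *serialized)
{
    point_view pv;
    assert(serialized != NULL);
    memcpy(&pv.npoints, serialized, sizeof pv.npoints);
    memcpy(&pv.ndims, serialized + 4, sizeof pv.ndims);
    pv.data = serialized + 8; /* no vertex data is copied */
    return pv;
}

/* "Serialize": write the header, then copy the whole coordinate block
 * in one memcpy() rather than value by value. Returns bytes written. */
static size_t write_points(unsigned char *out, const point_view *pv)
{
    size_t nbytes = (size_t)pv->npoints * pv->ndims * sizeof(double);
    memcpy(out, &pv->npoints, sizeof pv->npoints);
    memcpy(out + 4, &pv->ndims, sizeof pv->ndims);
    memcpy(out + 8, pv->data, nbytes);
    return 8 + nbytes;
}
```

A read-only function would only ever call view_points(); write_points()
is needed only when a new or modified geometry has to be returned.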
>> Copying the minimal amount of information from the head of a serialized
>> LWGEOM to a LWGEOM will be extremely quick, and so I'm not worried about
>> this. Therefore the key is to make the coordinate arrays double-aligned
>> within the PostgreSQL Datum so that during the deserialization process,
>> we can just point the POINTARRAY straight into the Datum - which is what
>> already happens.
>
> To rephrase, the key is to make the on-disk coordinate arrays
> double-aligned so that during inspection we can just cast every
> ordinate pointer to a double without incurring either a performance
> penalty (x86) or a misread (on architectures that do not allow reading
> a double from a non-double aligned memory address).
> Correct ?
Yes. Please see my original post and test harness here:
http://lists.refractions.net/pipermail/postgis-devel/2009-January/004473.html
Even with a memcpy() from unaligned to aligned space, the speed
difference is quite impressive.
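
For readers without the harness to hand, the following is a rough
sketch of the two building blocks involved. It is not the testmem.c
code; the helper names are made up for this example, and it assumes the
buffer itself starts on a double boundary. One helper reads a double
from an arbitrary byte address by copying it into aligned space, the
other rounds an offset up so that the coordinates that follow start
double-aligned.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Read a double from any byte address by copying it into an aligned
 * local first. This is safe even on architectures that fault on
 * misaligned double loads. */
static double read_double_unaligned(const unsigned char *p)
{
    double d;
    assert(p != NULL);
    memcpy(&d, p, sizeof d);
    return d;
}

/* Round a byte offset up to the next multiple of sizeof(double), so
 * that data placed at the returned offset is double-aligned relative
 * to a double-aligned buffer start. */
static size_t pad_to_double(size_t offset)
{
    size_t a = sizeof(double);
    return (offset + a - 1) / a * a;
}
```

With a header of, say, 5 bytes, pad_to_double() gives the offset at
which the first ordinate would have to be written for a plain pointer
cast to be valid later on.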
>> Hence all we have to do is:
>>
>> - Ensure the double arrays are aligned within the Datum
>> - Rewrite any accessor methods to iterate through the
>> array pointer directly, rather than through
>> getPoint_internal
>
> You mean rather than through getPoint_p here, as getPoint_internal
> is what returns a pointer into the POINTARRAY, which may or may not be
> aligned depending on where the POINTARRAY comes from.
> If you allocated the POINTARRAY yourself it will be aligned; if
> it came from the PostgreSQL Datum, it may or may not be, depending
> on how it was written.
Yes.
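
To illustrate the difference between the two accessor styles, here is a
small sketch. The pt2d struct and the function names are hypothetical
and only mimic the idea of a copying accessor versus direct iteration;
they are not the liblwgeom functions themselves.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

typedef struct { double x, y; } pt2d;

/* Copying accessor: memcpy() one point out of the byte array. Works
 * for any address, aligned or not, at the cost of a copy per point. */
static void get_point_copy(const unsigned char *data, size_t i, pt2d *out)
{
    memcpy(out, data + i * sizeof(pt2d), sizeof *out);
}

static double sum_x_copy(const unsigned char *data, size_t npoints)
{
    double sum = 0.0;
    pt2d p;
    for (size_t i = 0; i < npoints; i++) {
        get_point_copy(data, i, &p);
        sum += p.x;
    }
    return sum;
}

/* Direct iteration: walk the coordinates through a double pointer.
 * Only valid when the array is known to be double-aligned. */
static double sum_x_direct(const double *coords, size_t npoints)
{
    double sum = 0.0;
    assert((uintptr_t)coords % _Alignof(double) == 0);
    for (size_t i = 0; i < npoints; i++)
        sum += coords[2 * i]; /* x is the first ordinate of each point */
    return sum;
}
```

Both functions return the same result on aligned input; only the
copying variant remains valid when the data may be misaligned.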
> This is an interesting thread as I proceed defining memory structures
> to use for holding rasters as in that case we'll have different
> alignment constraints based on pixeltype (from 1 to 8 bytes per pixel,
> so requiring different padding).
It may be worth you playing with the testmem.c harness in the above
email to see whether it has a similar performance effect when accessing
rasters.
ATB,
Mark.
--
Mark Cave-Ayland
Sirius Corporation - The Open Source Experts
http://www.siriusit.co.uk
T: +44 870 608 0063