[postgis-devel] Re: [postgis-users] LWGEOM -- initial lwgeom.h file
Ralph Mason
ralph.mason at telogis.com
Wed Mar 3 17:22:21 PST 2004
David Blasby wrote:
> Ralph,
>
> Thanks for your thoughts and comments.
>
> I like the LWLINE, LWPOINT, and LWPOLY types because they make all the
> other functions much easier to read and write. The current postgis
> has POINT3D, LINE3D, and POLYGON3D. You have a point about
> re-allocation (which I'll address below) of the points.
>
> Unfortunately, you cannot just stick a pointer into the serialized
> form's points. The reason for this is memory alignment. Let's look at
> a simple serialized form example:
I see what you mean. Excuse my x86-centric thinking; I sometimes forget
about alignment.
> Using the serialized form's points directly means *all* the functions
> have to be aware of 2D and 3D points, leading to the functions being
> twice as complex as they need to be.
>
I think you should be able to get 3d points from 2d data, but you should
still be able to get 2d points as well, so you have the option of writing
a fast 2d-only function.
> There is another alternative. We can abstract the point list so it
> handles the 2d/3d distinction and alignment issues.
>
> typedef struct
> {
> char *serialized_pointlist; // probably misaligned. 2d or 3d
> char is3d; // true if these are 3d points
> int32 npoints;
> } POINTARRAY;
>
> We can form one of these by pointing directly into a portion of the
> serialized form. We can easily add functions like:
>
> // copies a point from the point array into the parameter point
> // will set point's z=0 (or NaN) if pa is 2d
> // NOTE: point is a real POINT3D *not* a pointer
> extern void getPoint(POINTARRAY pa, int n, POINT3D point);
>
> Doing this means we don't waste any memory and we abstract all our
> point lists behind a single interface.
>
This is really where I was trying to go - no allocations /
deallocations, less code, you just get a point copied onto the stack.
And you can get a 2d or 3d point. The point array can also be
allocated on the stack.
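
A minimal sketch of how such an accessor might look, assuming the
serialized form stores points as packed doubles (x, y and optionally z).
The names get_point3d and get_point2d and the exact struct layout are
illustrative assumptions, not the actual lwgeom.h API. Copying through
memcpy avoids reading a double from a misaligned address, and the result
lands in a caller-supplied struct, so nothing touches the heap:

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

typedef struct { double x, y; } POINT2D;
typedef struct { double x, y, z; } POINT3D;

/* Hypothetical layout: a view into the serialized form, no ownership. */
typedef struct {
    const char *serialized_pointlist; /* possibly misaligned; 2d or 3d */
    char is3d;                        /* true if the points carry z */
    int32_t npoints;
} POINTARRAY;

/* Copy point n into *out. A 2d array yields z = 0. */
void get_point3d(const POINTARRAY *pa, int n, POINT3D *out)
{
    double buf[3] = {0.0, 0.0, 0.0};
    size_t stride = (pa->is3d ? 3 : 2) * sizeof(double);

    assert(n >= 0 && n < pa->npoints);
    /* memcpy is safe for any source alignment */
    memcpy(buf, pa->serialized_pointlist + (size_t)n * stride, stride);
    out->x = buf[0];
    out->y = buf[1];
    out->z = buf[2];
}

/* Copy only x and y of point n, for fast 2d-only functions. */
void get_point2d(const POINTARRAY *pa, int n, POINT2D *out)
{
    double buf[2];
    size_t stride = (pa->is3d ? 3 : 2) * sizeof(double);

    assert(n >= 0 && n < pa->npoints);
    memcpy(buf, pa->serialized_pointlist + (size_t)n * stride, sizeof buf);
    out->x = buf[0];
    out->y = buf[1];
}
```

Both the POINTARRAY and the output point can live on the caller's stack.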
>
> I'm a little confused as to what you mean by having only one type and
> being able to use one bounding box function. Could you explain a
> little more?
> How is the bounding box finding function going to compute the bounding
> box of a multilinestring object and a polygon object without having
> functions that work on lines, point, and polygons?
If there is only one type (LW_GEOM) then there only needs to be one
bounding box function - internally it must know how to calculate the box
for the different geometries, but the programmer only has one function
to bother with.
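
Roughly what I have in mind, as a sketch only. The LW_GEOM layout here
(a type tag, per-part point counts, and one flat array of 2d points) and
the name lwgeom_bbox are made up for illustration; the real serialized
form would differ. The caller sees one function, and the per-type
knowledge sits inside the switch. The polygon case assumes a valid
polygon, where the holes lie inside the outer ring:

```c
#include <assert.h>
#include <stddef.h>

typedef struct { double x, y; } POINT2D;
typedef struct { double xmin, ymin, xmax, ymax; } BOX2D;

enum { GEOM_MULTILINE = 1, GEOM_POLYGON = 2 }; /* hypothetical type tags */

/* Hypothetical in-memory geometry: parts are lines or rings. */
typedef struct {
    int type;
    size_t nparts;              /* number of lines or rings */
    const size_t *part_npoints; /* points in each part */
    const POINT2D *points;      /* all parts, concatenated */
} LW_GEOM;

/* Returns 1 and fills *box on success, 0 for empty or unknown input. */
int lwgeom_bbox(const LW_GEOM *g, BOX2D *box)
{
    size_t n = 0, i;

    if (g->nparts == 0)
        return 0;

    switch (g->type) {
    case GEOM_MULTILINE: /* every line contributes */
        for (i = 0; i < g->nparts; i++)
            n += g->part_npoints[i];
        break;
    case GEOM_POLYGON:   /* the outer ring (first part) is enough */
        n = g->part_npoints[0];
        break;
    default:
        return 0;
    }
    if (n == 0)
        return 0;

    box->xmin = box->xmax = g->points[0].x;
    box->ymin = box->ymax = g->points[0].y;
    for (i = 1; i < n; i++) {
        const POINT2D *p = &g->points[i];
        if (p->x < box->xmin) box->xmin = p->x;
        if (p->x > box->xmax) box->xmax = p->x;
        if (p->y < box->ymin) box->ymin = p->y;
        if (p->y > box->ymax) box->ymax = p->y;
    }
    return 1;
}
```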
>> I am not sure I understand why bounding box can not be calculated and
>> stored when a geometry goes over a given size? Then the above
>> function can copy when one exists and calculate if not.
>
And the penny drops about the earlier email.
> I think putting the bounding box inside the geometry isn't all that
> helpful. It's only helpful for geometries with a large number of
> points. After 125 2d points (or 80 3d points), the geometry will be
> TOASTed anyway (see my message on TOASTing). This means that the time to
> take it off the disk is already quite high (pull placeholder from main
> table, look up TOAST info in the toast table, pull TOASTed tuple from
> disk), so the computation time is very low in comparison. The time
> it takes to compute the bounding box on a 120,000 point polygon is
> very very small - esp. in comparison to taking 1000 pages off the disk.
>
> I had originally thought the bounding box inside the geometry would be
> helpful, but I'm skeptical now. NOTE: the index will contain the
> bounding box.
The main ideas I was trying to convey:
1. Be able to do things without heap allocation and deallocation.
Discussed above.
2. It should be possible to use only one type, LW_GEOM. Perhaps
there is some 'context' that can be initialized for speed and
passed around. This could also store an error state.
For example:
double line_length2d(LW_GEOM_CONTEXT *line)
{
    int i, num_points;
    POINT2D frm, to;
    double dist = 0.0;

    // Some error thing here can say - Expected a 2d line.
    // VERIFY_DATATYPE is assumed to be true when the type matches.
    if ( !VERIFY_DATATYPE(line,LINE2D) )
        return 0.0;

    num_points = LINE_NUMPOINTS(line);
    if ( num_points < 2 )
        return 0.0; // must have >1 point to make sense

    LINE2D_GETPOINT(line,0,&frm);
    for (i=1; i<num_points; i++)
    {
        // the points are copied onto the stack, so use '.', not '->'
        LINE2D_GETPOINT(line,i,&to);
        dist += sqrt( ( (frm.x - to.x)*(frm.x - to.x) ) +
                      ( (frm.y - to.y)*(frm.y - to.y) ) );
        frm = to;
    }
    return dist;
}
Finally before the return to postgres there is a macro called something
like RETURN_ERROR which returns from the function with an error message
if one is set in the context.
e.g.
Datum some_func(PG_FUNCTION_ARGS)
{
    LW_GEOM *geom = (LW_GEOM *) PG_DETOAST_DATUM(PG_GETARG_DATUM(0));
    LW_GEOM_CONTEXT context;
    double retval;

    LW_INIT_CONTEXT(geom, &context);
    // Do some processing
    retval = line_length2d(&context);
    // Returns only if there is an error, and returns the message
    RETURN_ERROR(context);
    PG_RETURN_FLOAT8(retval);
}
The context and macros can change and the code should keep working.
They can also have parts that are conditional on architecture if necessary.
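
To show the error-state idea on its own, here is a small sketch that
compiles without PostgreSQL. Everything in it is a stand-in: the context
layout, verify_datatype (a function version of the VERIFY_DATATYPE macro),
and ring_signed_area2d, which I picked only because it needs no math
library. The worker records a message in the context and returns a dummy
value; the caller checks the context once, at the point where
RETURN_ERROR would sit:

```c
#include <assert.h>
#include <stddef.h>

typedef struct { double x, y; } POINT2D;

enum { LINE2D = 1, POLYGON2D = 2 }; /* hypothetical type tags */

/* Hypothetical context: a view of the geometry plus an error slot. */
typedef struct {
    int type;
    size_t npoints;
    const POINT2D *points;
    const char *error; /* NULL while no error has been recorded */
} LW_GEOM_CONTEXT;

/* Returns 1 if the type matches; otherwise records msg and returns 0. */
int verify_datatype(LW_GEOM_CONTEXT *ctx, int expected, const char *msg)
{
    if (ctx->type == expected)
        return 1;
    ctx->error = msg;
    return 0;
}

/* Signed area of a closed ring (shoelace formula); positive if the
 * points run counter-clockwise. Returns 0.0 and sets ctx->error when
 * the geometry is not a polygon ring. */
double ring_signed_area2d(LW_GEOM_CONTEXT *ctx)
{
    double sum = 0.0;
    size_t i;

    if (!verify_datatype(ctx, POLYGON2D, "expected a 2d polygon"))
        return 0.0;
    if (ctx->npoints < 4) /* a closed ring needs at least 4 points */
        return 0.0;

    for (i = 0; i + 1 < ctx->npoints; i++) {
        const POINT2D *a = &ctx->points[i];
        const POINT2D *b = &ctx->points[i + 1];
        sum += a->x * b->y - b->x * a->y;
    }
    return sum / 2.0;
}
```

Because the worker functions only ever see the context, its layout can
change without touching them.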
Anyway - just another view on it all.
Ralph