[postgis-users] LWGEOM -- initial lwgeom.h file
David Blasby
dblasby at refractions.net
Tue Mar 2 10:22:39 PST 2004
Ralph,
Thanks for your thoughts and comments.
I like the LWLINE, LWPOINT, and LWPOLY types because they make all the
other functions much easier to read and write. The current postgis has
POINT3D, LINE3D, and POLYGON3D. You have a point about re-allocation
(which I'll address below) of the points.
Unfortunately, you cannot just stick a pointer into the serialized
form's points. The reason for this is memory alignment. Let's look at a
simple serialized form example:
2D line String
<int32> size = ...
<char> type: S=0,D=0, tttt= 2
<uint32> npoints (3)
<double> X0
<double> Y0
<double> X1
<double> Y1
<double> X2
<double> Y2
And the equivalent C struct:
typedef struct
{
    int32 size;
    char type;
    uint32 npoints;
    POINT2D points[3];
} three_point_line;
You'd think you could just cast the serialized form into the
three_point_line type. Unfortunately, you cannot. The actual
three_point_line type looks more like this: (note - intel machines
are 4-byte aligned and solaris is 8-byte aligned)
typedef struct
{
    int32 size;
    char type;
    byte junk1;        // intel and solaris
    byte junk2;        // intel and solaris
    byte junk3;        // intel and solaris
    uint32 npoints;    // properly aligned
    byte junk4;        // solaris only
    byte junk5;        // solaris only
    byte junk6;        // solaris only
    byte junk7;        // solaris only
    POINT2D points[3]; // properly aligned
} three_point_line;
In the serialized form, X0 is 9 bytes into the structure. If you try
something like this in solaris, you'll immediately segfault due to
misalignment:
*((double *) &serialized_form[9])
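
The gap between the packed layout and the compiler's layout can be
checked with offsetof. This is only a sketch: it assumes int32_t and
uint32_t stand in for the int32/uint32 types above, and that POINT2D is
a pair of doubles. SERIALIZED_X0_OFFSET and struct_x0_offset are names
made up for the illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed definition: a 2D point is two doubles. */
typedef struct { double x, y; } POINT2D;

typedef struct
{
    int32_t  size;
    char     type;
    uint32_t npoints;
    POINT2D  points[3];
} three_point_line;

/* Offset of X0 in the packed serialized form: size + type + npoints. */
enum { SERIALIZED_X0_OFFSET =
           sizeof(int32_t) + sizeof(char) + sizeof(uint32_t) };

/* Offset of points[0] as the compiler lays the struct out (padding included). */
static size_t struct_x0_offset(void)
{
    return offsetof(three_point_line, points);
}

/* The compiler always places points[] on a boundary suitable for POINT2D. */
static_assert(offsetof(three_point_line, points) % _Alignof(POINT2D) == 0,
              "points[] must be aligned for POINT2D");
```

The packed offset is 9 on every platform, while the struct offset
depends on the ABI's padding, so the two layouts cannot be overlaid.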
The solution to this is either to force the structure to be memory
aligned (this is what postgis currently does) - but then you're wasting
space in the database - or to copy the points to a new structure that is
properly aligned (which is what I proposed for the lwgeom) - but then
you're wasting time copying.
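
As a rough sketch of the copying option, a double can be pulled out of a
misaligned position with memcpy instead of a pointer cast. The helper
name read_double_at is hypothetical, and the sketch assumes the buffer
holds doubles in the machine's native byte order.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: copy one double out of a byte buffer at any
 * offset. memcpy has no alignment requirement, so this is safe even
 * when buf + offset is not a multiple of the double alignment.
 */
static double read_double_at(const unsigned char *buf, size_t offset)
{
    double value;

    assert(buf != NULL);
    memcpy(&value, buf + offset, sizeof value);
    return value;
}
```

With this, read_double_at(serialized_form, 9) fetches X0 without the
misaligned dereference, at the cost of an 8-byte copy.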
Using the serialized form's points directly means *all* the functions
have to be aware of 2D and 3D points, leading to the functions being
twice as complex as they need to be.
There is another alternative. We can abstract the point list so it
handles the 2d/3d distinction and alignment issues.
typedef struct
{
    char *serialized_pointlist; // probably misaligned. 2d or 3d
    char is3d;                  // true if these are 3d points
    int32 npoints;
} POINTARRAY;
We can form one of these by pointing directly into a portion of the
serialized form. We can easily add functions like:
// copies a point from the point array into the parameter point
// will set point's z=0 (or NaN) if pa is 2d
// NOTE: the result is a real (aligned) POINT3D copy, *not* a pointer
//       into the serialized data
extern void getPoint(POINTARRAY pa, int n, POINT3D *point);
Doing this means we don't waste any memory and we abstract all our point
lists behind a single interface.
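
One possible shape for getPoint, as a sketch rather than a settled
design. It assumes POINT3D is three doubles, uses int32_t for int32,
takes the output through a pointer so the caller sees the copy, and
picks z=0 for 2d input; all of these are assumptions made for the
illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Assumed definition: a 3D point is three doubles. */
typedef struct { double x, y, z; } POINT3D;

typedef struct
{
    char   *serialized_pointlist; /* probably misaligned, 2d or 3d */
    char    is3d;                 /* true if these are 3d points */
    int32_t npoints;
} POINTARRAY;

/* Copy point n out of the (possibly misaligned) list into *point. */
static void getPoint(POINTARRAY pa, int n, POINT3D *point)
{
    size_t stride = (pa.is3d ? 3 : 2) * sizeof(double);
    const char *src;

    assert(point != NULL && n >= 0 && n < pa.npoints);
    src = pa.serialized_pointlist + (size_t) n * stride;

    /* memcpy avoids dereferencing a misaligned double pointer. */
    memcpy(&point->x, src, sizeof(double));
    memcpy(&point->y, src + sizeof(double), sizeof(double));
    if (pa.is3d)
        memcpy(&point->z, src + 2 * sizeof(double), sizeof(double));
    else
        point->z = 0.0; /* assumption: 2d points get z=0 rather than NaN */
}
```

Callers only ever see aligned POINT3D values, so the 2d/3d branch and
the alignment handling live in this one function.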
I'm a little confused as to what you mean by having only one type and
being able to use one bounding box function. Could you explain a little
more?
How is the bounding box finding function going to compute the bounding
box of a multilinestring object and a polygon object without having
functions that work on lines, point, and polygons?
> I am not sure I understand why bounding box can not be calculated and
> stored when a geometry goes over a given size? Then the above function
> can copy when one exists and calculate if not.
I think putting the bounding box inside the geometry isn't all that
helpful. It's only helpful for geometries with a large number of points.
After 125 2d points (or 80 3d points), the geometry will be TOASTed
anyway (see my message on TOASTing). This means that the time to take it
off the disk is already quite high (pull placeholder from main table,
look up TOAST info in the toast table, pull TOASTed tuple from disk), so
the computation time is very low in comparison. The time it takes to
compute the bounding box on a 120,000 point polygon is very very small -
esp. in comparison to taking 1000 pages off the disk.
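
For a sense of why the computation is cheap: the bounding box is a
single pass with four comparisons per point. A minimal sketch, assuming
aligned POINT2D values; BOX2D_SKETCH and compute_bbox are names invented
for the illustration, not existing postgis symbols.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed definitions for the illustration. */
typedef struct { double x, y; } POINT2D;
typedef struct { double xmin, ymin, xmax, ymax; } BOX2D_SKETCH;

/*
 * Compute the bounding box of n points in one pass.
 * Returns 1 on success, 0 if there are no points (box is left untouched).
 */
static int compute_bbox(const POINT2D *pts, size_t n, BOX2D_SKETCH *box)
{
    size_t i;

    assert(box != NULL);
    if (n == 0)
        return 0;

    box->xmin = box->xmax = pts[0].x;
    box->ymin = box->ymax = pts[0].y;
    for (i = 1; i < n; i++)
    {
        if (pts[i].x < box->xmin) box->xmin = pts[i].x;
        if (pts[i].x > box->xmax) box->xmax = pts[i].x;
        if (pts[i].y < box->ymin) box->ymin = pts[i].y;
        if (pts[i].y > box->ymax) box->ymax = pts[i].y;
    }
    return 1;
}
```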
I had originally thought the bounding box inside the geometry would be
helpful, but I'm skeptical now. NOTE: the index will contain the
bounding box.
> While we are talking about this, I suggest a standard flex bison/parser
> for WKT, the parser can pretty easily output a LW_GEOM with a bounding
> box when it exceeds a given threshold. I would be happy to put this
> together.
WOOT! This would be great! WOOT!
Have to run,
dave