[postgis-users] Light Weight Geometry (LWGEOM) Proposal

Sun Feb 22 14:11:41 PST 2004

Hi David,

This looks very nice to me. Remind me why anyone would want to use the 
'full' geometry when this one is in place?

Anyhow, a couple of suggestions:

Use the spare bit to say that the bounding box is stored as floats.  I 
think this would almost always be a good optimization.
Add a way to specify that the coordinate data is in floats rather than 
doubles, perhaps using the spare space in the geometry type field.
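
One thing to watch with a float bounding box: the float box has to be 
rounded outward, or it may no longer contain the geometry.  A minimal 
sketch in C, assuming 32-bit IEEE 754 floats and finite coordinates; 
the struct and function names are made up for illustration and are not 
PostGIS code:

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Step a float to the adjacent representable value: dir < 0 moves toward
 * -infinity, dir > 0 toward +infinity.  Assumes 32-bit IEEE 754 floats. */
static float step_float(float f, int dir)
{
    uint32_t bits;
    memcpy(&bits, &f, sizeof bits);
    if (f == 0.0f)
        bits = (dir < 0) ? 0x80000001u : 0x00000001u; /* smallest subnormal */
    else if ((f > 0.0f) == (dir > 0))
        bits++;                                       /* away from zero */
    else
        bits--;                                       /* toward zero */
    memcpy(&f, &bits, sizeof bits);
    return f;
}

/* Largest float that is <= d. */
static float float_floor(double d)
{
    float f = (float)d;
    return ((double)f > d) ? step_float(f, -1) : f;
}

/* Smallest float that is >= d. */
static float float_ceil(double d)
{
    float f = (float)d;
    return ((double)f < d) ? step_float(f, +1) : f;
}

/* Hypothetical box types, for illustration only. */
typedef struct { double xmin, ymin, xmax, ymax; } box2d;
typedef struct { float  xmin, ymin, xmax, ymax; } box2d_f32;

/* Convert a double box to a float box that still encloses it:
 * minimums round down, maximums round up. */
static box2d_f32 box2d_to_f32(box2d b)
{
    box2d_f32 r;
    assert(b.xmin <= b.xmax && b.ymin <= b.ymax);
    r.xmin = float_floor(b.xmin);
    r.ymin = float_floor(b.ymin);
    r.xmax = float_ceil(b.xmax);
    r.ymax = float_ceil(b.ymax);
    return r;
}
```

The float box can be slightly larger than the exact one, which is fine 
for an index test since it only produces a few extra candidates.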

Another idea: use the spare bit to indicate that an extended type byte 
follows the type byte.  If present, that byte could carry extra flags, 
for example:

Bounding box data type
float bounding box
int16 bounding box
int32 bounding box

Data type
float data
int16 data
int32 data

Also, while I'm here: in the beginning there will be lots of 
LWGEOM->WKB->PostGIS GEOMETRY conversions, so a direct 
LWGEOM->PostGIS GEOMETRY conversion would be a nice little optimization.

Ralph

David Blasby wrote:

> As per our current discussions, I'm proposing a new Light-Weight 
> Geometry.  This will be 'as-well-as' the current PostGIS so no one 
> will lose anything.  I'm not going to back-port this, so we'll only 
> support it in PostgreSQL 7.4+.
>
> Disk Representation (serialized form)
>
> int32 size; //postgresql variable-length requirement
> char  type; // this type (see below)
> <data>
>
> Where the 8-bit 'type' is defined bit-wise as:
>
> xSBDtttt
>
> WHERE
>     x = unused
>     S = 4 byte SRID attached (0= not attached (-1), 1= attached)
>     B = bounding box attached (0=no, 1=yes) (32 bytes)
>     D = dimensionality (0=2d, 1=3d)
>     tttt = actual type (as per the WKB type):
>     
>     enum wkbGeometryType {
>         wkbPoint = 1,
>         wkbLineString = 2,
>         wkbPolygon = 3,
>         wkbMultiPoint = 4,
>         wkbMultiLineString = 5,
>         wkbMultiPolygon = 6,
>         wkbGeometryCollection = 7
>     };
>
>
> In general, data will be exactly like the 3d-extended WKB 
> representation (except there's no endian flag and the WKB type is 
> defined as above).
>
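
For what it's worth, the type byte above could be packed and unpacked 
along these lines.  This is only a sketch: it assumes "xSBDtttt" is 
written most-significant bit first, and the macro and function names 
are hypothetical, not anything that exists in PostGIS.

```c
#include <assert.h>
#include <stdint.h>

/* Bit masks for the proposed type byte "xSBDtttt", assuming the string is
 * written most-significant bit first.  Names are hypothetical. */
#define LW_HAS_SRID  0x40u  /* S: 4-byte SRID attached */
#define LW_HAS_BBOX  0x20u  /* B: 32-byte bounding box attached */
#define LW_IS_3D     0x10u  /* D: 0 = 2d, 1 = 3d */
#define LW_TYPE_MASK 0x0Fu  /* tttt: WKB geometry type, 1..7 */

/* Build a type byte from its flags and a WKB geometry type. */
static uint8_t lw_make_type(int has_srid, int has_bbox, int is_3d,
                            unsigned wkb_type)
{
    assert(wkb_type >= 1 && wkb_type <= 7);
    return (uint8_t)((has_srid ? LW_HAS_SRID : 0u) |
                     (has_bbox ? LW_HAS_BBOX : 0u) |
                     (is_3d    ? LW_IS_3D    : 0u) |
                     (wkb_type & LW_TYPE_MASK));
}

/* Accessors for the individual fields. */
static unsigned lw_wkb_type(uint8_t t) { return t & LW_TYPE_MASK; }
static int lw_has_srid(uint8_t t) { return (t & LW_HAS_SRID) != 0; }
static int lw_has_bbox(uint8_t t) { return (t & LW_HAS_BBOX) != 0; }
static int lw_is_3d(uint8_t t)    { return (t & LW_IS_3D) != 0; }
```

Under that reading, a 2D point with no SRID and no bounding box is the 
single byte 0x01, and a 2D linestring with both attached is 0x62.
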
> The bounding box is an optional component intended for large 
> geometries - for small geometries (<1000 points) it will not be 
> present.  This keeps small geometries compact (their bounding boxes 
> can be quickly calculated on-the-fly) while enhancing performance for 
> large geometries.  This will probably be a compile-time option.
>
>
> Examples (c.f. OGC SF SQL definition of WKB (section 3.3.2.6))
> -------------------------------------------------------------
>
> A. 2D point w/o bounding box
>
> <int32> size  = 21 bytes
> <char> type:  S=0,B=0,D=0, tttt= 1
> <double> X
> <double> Y
>
> B. 3D point w/o bounding box
>
> <int32> size = 29 bytes
> <char> type:  S=0,B=0,D=1, tttt= 1
> <double> X
> <double> Y
> <double> Z
>
>
> C. 2D point WITH bounding box
>     (you would never put one on a point, but this is just an example)
>
> <int32> size = 53 bytes
> <char> type:  S=0,B=1,D=0, tttt= 1
> <double> xmin
> <double> ymin
> <double> xmax
> <double> ymax
> <double> X
> <double> Y
>
>
> D. 2D line String w/o bounding box
>
> <int32> size = npoints*16 + 9
> <char> type:  S=0,B=0,D=0, tttt= 2
> <uint32> npoints
> <double> X1
> <double> Y1
> <double> X2
> <double> Y2
> ...
>
> E. 2D line String with bounding box
>
> <int32> size = npoints*16 + 9 + 32
> <char> type:  S=0,B=1,D=0, tttt= 2
> <double> xmin
> <double> ymin
> <double> xmax
> <double> ymax
> <uint32> npoints
> <double> X1
> <double> Y1
> <double> X2
> <double> Y2
> ...
>
> F. 2D polygon w/o bounding box
>    NOTE: I haven't explicitly put in the ogcLinearRing
>
> <int32> size =
> <char> type:  S=0,B=0,D=0, tttt= 3
> <uint32> nrings
> <uint32> npoints in ring1
> <double> X1
> <double> Y1
> <double> X2
> <double> Y2
> ...
> <uint32> npoints in ring2
> <double> X1
> <double> Y1
> <double> X2
> <double> Y2
> ...
> ...
>
>
> G. 2D multilinestring w/o bounding boxes
>
>     NOTE: this is like the OGC spec - we duplicate type info
>           in the sub-geometries.  This is arguably not a good idea,
>           but it does allow us to treat all the multi* and
>           geometrycollection types equivalently.
>           It also allows us to represent GeometryCollections of
>           GeometryCollections (which PostGIS doesn't support).
>
> <int32> size =
> <char> type:  S=0,B=0,D=0, tttt= 5
> <uint32> nlines
> <char> type:  S=0,B=0,D=0, tttt= 2
> <uint32> npoints in line 1
> <double> X1
> <double> Y1
> <double> X2
> <double> Y2
> ....
> <char> type:  S=0,B=0,D=0, tttt= 2
> <uint32> npoints in line 2
> <double> X1
> <double> Y1
> <double> X2
> <double> Y2
> ....
>
>
> G. 2D multilinestring with main bounding box
> <int32> size =
> <char> type:  S=0,B=1,D=0, tttt= 5
> <double> xmin
> <double> ymin
> <double> xmax
> <double> ymax
> <uint32> nlines
> <char> type:  S=0,B=0,D=0, tttt= 2
> <uint32> npoints in line 1
> <double> X1
> <double> Y1
> <double> X2
> <double> Y2
> ....
> <char> type:  S=0,B=0,D=0, tttt= 2
> <uint32> npoints in line 2
> <double> X1
> <double> Y1
> <double> X2
> <double> Y2
> ....
>
> G. 2D multilinestring with main and sub bounding boxes
>     NOTE: since our types are defined in a recursive manner, this
>           type is possible.  I don't think we should construct them in
>           general.
>
> <int32> size =
> <char> type:  S=0,B=1,D=0, tttt= 5
> <double> xmin
> <double> ymin
> <double> xmax
> <double> ymax
> <uint32> nlines
> <char> type:  S=0,B=1,D=0, tttt= 2
> <double> xmin
> <double> ymin
> <double> xmax
> <double> ymax
> <uint32> npoints in line 1
> <double> X1
> <double> Y1
> <double> X2
> <double> Y2
> ....
> <char> type:  S=0,B=1,D=0, tttt= 2
> <double> xmin
> <double> ymin
> <double> xmax
> <double> ymax
> <uint32> npoints in line 2
> <double> X1
> <double> Y1
> <double> X2
> <double> Y2
> ....
>
> H. 2D point w/o bounding box (with SRID)
>
> <int32> size  = 25 bytes
> <char> type:  S=1,B=0,D=0, tttt= 1
> <int32> SRID
> <double> X
> <double> Y
>
> I. 3D point w/o bounding box (with SRID)
>
> <int32> size = 33 bytes
> <char> type:  S=1,B=0,D=1, tttt= 1
> <int32> SRID
> <double> X
> <double> Y
> <double> Z
>
>
> J. 2D point WITH bounding box (with SRID)
>     (you would never put one on a point, but this is just an example)
>     (note: SRID comes before bounding box)
>
> <int32> size = 57 bytes
> <char> type:  S=1,B=1,D=0, tttt= 1
> <int32> SRID
> <double> xmin
> <double> ymin
> <double> xmax
> <double> ymax
> <double> X
> <double> Y
>
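
The sizes in the point and linestring examples all follow one rule: 
4 bytes of length, 1 type byte, an optional 4-byte SRID, an optional 
32-byte bounding box, then the coordinates.  A small sketch that 
reproduces those numbers (the function names are hypothetical):

```c
#include <assert.h>
#include <stddef.h>

/* Bytes before the geometry data: int32 size + type byte, plus the
 * optional 4-byte SRID and optional 32-byte bounding box. */
static size_t lw_header_size(int has_srid, int has_bbox)
{
    return 4 + 1 + (has_srid ? 4 : 0) + (has_bbox ? 32 : 0);
}

/* Total serialized size of a point: header plus 2 or 3 doubles. */
static size_t lw_point_size(int has_srid, int has_bbox, int is_3d)
{
    size_t ndims = is_3d ? 3 : 2;
    return lw_header_size(has_srid, has_bbox) + ndims * 8;
}

/* Total serialized size of a linestring: header, uint32 npoints,
 * then npoints coordinates of 2 or 3 doubles each. */
static size_t lw_line_size(int has_srid, int has_bbox, int is_3d,
                           size_t npoints)
{
    size_t ndims = is_3d ? 3 : 2;
    assert(npoints <= ((size_t)-1 - 64) / (ndims * 8)); /* no overflow */
    return lw_header_size(has_srid, has_bbox) + 4 + npoints * ndims * 8;
}
```

For instance, lw_point_size(1, 1, 0) gives the 57 bytes of example J, 
and lw_line_size(0, 0, 0, n) gives the n*16 + 9 of example D.
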
>
>
>
> Other notes:
>
> Canonical form will be very much like the current WKB type (i.e. looks 
> like '000000FF001A..').  This means your pg_dumps will look strange, 
> but you'll find it faster and there will not be any numeric drift as 
> you move to and from WKT.
>
> We'll need to write a WKB to LWGEOM and LWGEOM to WKB.  Since LWGEOM 
> is very close to WKB, this should be simple.
>
> To get to WKT, we can do a LWGEOM->WKB->PostGIS GEOMETRY->WKT.  For 
> parsing WKT, we can WKT->PostGIS GEOMETRY->WKB->LWGEOM.  The PostGIS 
> conversion functions already exist, so this will be very easy.
>
> We'll also need to write something to convert a LWGEOM to a bounding 
> box plus all the indexing support functions (based on BOX2DFLOAT4s).
>
> One of the design issues I have with PostGIS is that all the analysis 
> functions deal directly with the serialized GEOMETRY form.  This makes 
> them more complex and difficult to maintain.  I suggest we have 
> souped-up versions of the current PostGIS geometry types (i.e. 
> POLYGON3D, LINESTRING3D, POINT3D) for LWGEOM (i.e. LW_POLYGON, 
> LW_LINE, LW_POINT) which would hide things like 2d vs 3d and make 
> construction easier.
>
> What think?
>
> dave
>