[postgis-devel] OGC WKB/T - PGIS EWKB/T

strk at refractions.net
Wed Dec 22 03:37:27 PST 2004


With the help and support of Mark Cave-Ayland and Markus Schaber
I've (conceptually) separated the OGC-strict WKB and WKT formats
from the Postgis extended versions, EWKB and EWKT.

Here is a clarification of the new status, in a format suitable
for the postgis manual; below it you'll find implementation notes.

---8<----------------------------------------------------------

OGC formats only support 2d geometries, and the associated SRID
is *never* embedded in the input/output representations.

Postgis extended formats are currently a superset of the OGC ones
(every valid WKB/WKT is a valid EWKB/EWKT) but this might change in
the future, specifically if OGC comes out with a new format conflicting
with our extensions. Thus you SHOULD NOT rely on this feature!

Postgis EWKB/EWKT add support for 3dm, 3dz and 4d coordinates and
for embedded SRID information.

Input/Output of these formats is available using the following
interfaces:

	- OGC -
	bytea WKB = asBinary(geometry);
	text WKT = asText(geometry);
	geometry = GeomFromWKB(bytea WKB, [SRID]); // will WARN on EWKB
	geometry = GeometryFromText(text WKT, [SRID]); // will WARN on EWKT

	- PGIS -
	bytea EWKB = asEWKB(geometry);
	text EWKT = asEWKT(geometry);
	geometry = GeomFromEWKB(bytea EWKB);
	geometry = GeomFromEWKT(text EWKT);

The "canonical forms" of a PostgreSQL type are the representations
you get with a simple query (without any function call) and the ones
which are guaranteed to be accepted with a simple insert, update or
copy. For the postgis 'geometry' type these are:

	- Output -
	binary: EWKB 
	 ascii: HEXEWKB (EWKB in hex form)

	- Input -
	binary: EWKB
	 ascii: HEXEWKB|EWKT
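
To make the "EWKB in hex form" relationship concrete, here is a minimal
Python sketch (Python is used only for illustration, it is not part of
postgis). It assumes the usual WKB layout for a 2d point: one byte-order
byte, a 32-bit type code, then two doubles. The helper names are
hypothetical.

```python
import struct

def point_wkb(x, y):
    """Little-endian WKB for a 2d point (byte order 1, geometry type 1)."""
    return struct.pack("<BIdd", 1, 1, x, y)

def to_hex(wkb):
    """Hex form: two uppercase hex digits per byte."""
    return wkb.hex().upper()

def from_hex(text):
    """Inverse of to_hex(): recover the binary form."""
    return bytes.fromhex(text)

hexwkb = to_hex(point_wkb(1.0, 2.0))
assert from_hex(hexwkb) == point_wkb(1.0, 2.0)  # the two forms round-trip
```

The hex form is exactly twice as long as the binary one, which is the
cost of having an ascii-safe canonical representation.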


---8<----------------------------------------------------------

A few notes about the implementation.

Since our WKT/WKB parsers are really *just* EWKT/EWKB parsers
the implementation RELIES on the feature that users SHOULD NOT
RELY ON: the superset nature of EWKT/EWKB with respect to WKT/WKB.

This means that the representations are always parsed as EWKT/EWKB
and a second-step check is made for the presence of an embedded
SRID or higher dimensions in the resulting geometry.
When these are found the OGC-strict routines currently WARN the
user about the fact that the corresponding extended input functions
should be used. We could make that an ERROR instead, completely
*forbidding* EWKB/EWKT in OGC constructor functions.
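
The parse-then-check flow described above can be sketched as follows.
This is an illustrative Python model, not the actual C code: the flag
values are the ones the PostGIS documentation gives for the EWKB type
word (they are not stated in this post), the function names are
hypothetical, and only the header is inspected.

```python
import struct
import warnings

# Assumed EWKB type-word flags (from the PostGIS docs, not from this post).
EWKB_Z = 0x80000000
EWKB_M = 0x40000000
EWKB_SRID = 0x20000000

def read_header(buf):
    """Step 1: read the header as EWKB -> (base_type, has_z, has_m, srid)."""
    endian = "<" if buf[0] == 1 else ">"
    (type_word,) = struct.unpack_from(endian + "I", buf, 1)
    srid = None
    if type_word & EWKB_SRID:
        # When the SRID flag is set, a 32-bit SRID follows the type word.
        (srid,) = struct.unpack_from(endian + "i", buf, 5)
    base_type = type_word & 0x1FFFFFFF  # strip the three flag bits
    return base_type, bool(type_word & EWKB_Z), bool(type_word & EWKB_M), srid

def geom_from_wkb_strict(buf):
    """Step 2: OGC-strict wrapper that warns on extended features."""
    header = read_header(buf)
    _, has_z, has_m, srid = header
    if has_z or has_m or srid is not None:
        warnings.warn("EWKB given to an OGC constructor; use the extended one")
    return header
```

Turning the warning into an error would only mean raising an exception
at the same point in the wrapper; the parser itself stays untouched.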

Similarly, output functions are just LWGEOM to EWKT/EWKB converters,
so the OGC-strict output procedures just *drop* higher dimensions and
SRID before feeding the LWGEOM to them, obtaining the subset of
EWKT/EWKB that is WKT/WKB.
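
As a rough illustration of the "drop, then reuse the extended
converter" idea, here is a small Python model limited to points. The
Point class and both function names are hypothetical stand-ins for the
LWGEOM code, and the number formatting is simplified.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Point:
    x: float
    y: float
    z: Optional[float] = None
    m: Optional[float] = None
    srid: Optional[int] = None

def as_ewkt(p):
    """Extended output: keeps Z, M and the SRID when present."""
    coords = [p.x, p.y]
    if p.z is not None:
        coords.append(p.z)
    if p.m is not None:
        coords.append(p.m)
    # A 3dm point needs the POINTM tag to tell it apart from 3dz.
    tag = "POINTM" if (p.m is not None and p.z is None) else "POINT"
    body = f"{tag}({' '.join(format(c, 'g') for c in coords)})"
    return f"SRID={p.srid};{body}" if p.srid is not None else body

def as_text(p):
    """OGC-strict output: drop Z, M and SRID, then reuse as_ewkt()."""
    return as_ewkt(Point(p.x, p.y))
```

With this model, a point carrying a Z value and an SRID prints with both
through as_ewkt(), while as_text() yields only the 2d part.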

As long as WKB/WKT remains a subset of EWKT/EWKB we don't have to
worry about this, but in case they lose this nature we'll have
to implement OGC-strict-only parsers/unparsers.

Finally, it has to be noted that all these layers of processing
reduce the performance of input/output operations. Here is a
table of input and output functions ordered by estimated speed
(number of layers of processing, fewer being faster).

	- Input -

	1 canon. ascii HEXEWKB
	1 canon. ascii EWKT
	1 GeomFromEWKT() - calls canon. ascii EWKT

	2 canon. binary EWKB - converts to HEXEWKB first (can be improved)
	2 GeomFromEWKB()  - calls canon. binary EWKB

	3 GeomFromWKB()   - calls canon. binary EWKB, checks OGC conformance
	3 GeometryFromText() - calls canon. ascii EWKT, checks OGC conformance
	

	- Output -

	1 canon. ascii HEXEWKB
	1 asEWKT()

	2 canon. binary EWKB - converts HEXEWKB to binary (can be improved)
	2 asEWKB() - calls canon. binary EWKB

	3 asBinary() - drops SRID and ZM, calls canon. binary EWKB
	3 asText() - drops SRID and ZM, calls asEWKT() 


There are also other issues that might be worth discussing, but I think
we already have enough meat on the fire (as we say in Italy).

Merry Christmas !

--strk;


