[postgis-devel] OGC WKB/T - PGIS EWKB/T
strk at refractions.net
Wed Dec 22 03:37:27 PST 2004
With the help and support of Mark Cave-Ayland and Markus Schaber
I've (conceptually) separated the OGC-strict WKB and WKT formats
from the Postgis extended versions, EWKB and EWKT.
Here is a clarification of the new status, in a format suitable
for the postgis manual. Below it you'll find implementation notes.
---8<----------------------------------------------------------
OGC formats only support 2d geometries, and the associated SRID
is *never* embedded in the input/output representations.
Postgis extended formats are currently a superset of the OGC ones
(every valid WKB/WKT is a valid EWKB/EWKT) but this might change in
the future, specifically if OGC comes out with a new format
conflicting with our extensions. Thus you SHOULD NOT rely on this
feature!
Postgis EWKB/EWKT add support for 3dm, 3dz and 4d coordinates and
for embedded SRID information.
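To make "embedded SRID" and the extra dimensions concrete, here is a
minimal Python sketch that builds an EWKB point. The flag values
(0x80000000 for Z, 0x40000000 for M, 0x20000000 for SRID) follow the
layout documented for later Postgis releases and are an assumption
here; point_ewkb() is a hypothetical helper, not the Postgis encoder.

```python
import struct

# Flag bits OR-ed into the WKB geometry type word (assumed layout, see above).
EWKB_Z = 0x80000000
EWKB_M = 0x40000000
EWKB_SRID = 0x20000000
WKB_POINT = 1


def point_ewkb(coords, srid=None, measured=False):
    """Build a little-endian EWKB point (hypothetical helper).

    coords holds 2 (xy), 3 (xyz, or xym when measured=True) or 4 (xyzm) values.
    """
    type_word = WKB_POINT
    if len(coords) == 4:
        type_word |= EWKB_Z | EWKB_M
    elif len(coords) == 3:
        type_word |= EWKB_M if measured else EWKB_Z
    elif len(coords) != 2:
        raise ValueError("a point needs 2, 3 or 4 coordinates")
    if srid is not None:
        type_word |= EWKB_SRID
    out = struct.pack("<BI", 1, type_word)  # leading 1 = little-endian marker
    if srid is not None:
        out += struct.pack("<i", srid)  # SRID follows the type word
    return out + struct.pack("<%dd" % len(coords), *coords)
```

With two coordinates and no SRID the result is plain OGC WKB (21
bytes); passing srid=4326 sets the SRID flag and adds four bytes.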
Input and output of these formats is available through the following
interfaces:
- OGC -
bytea WKB = asBinary(geometry);
text WKT = asText(geometry);
geometry = GeomFromWKB(bytea WKB, [SRID]); // will WARN on EWKB
geometry = GeometryFromText(text WKT, [SRID]); // will WARN on EWKT
- PGIS -
bytea EWKB = asEWKB(geometry);
text EWKT = asEWKT(geometry);
geometry = GeomFromEWKB(bytea EWKB);
geometry = GeomFromEWKT(text EWKT);
The "canonical forms" of a PostgreSQL type are the representations
you get with a simple query (without any function call) and the ones
that are guaranteed to be accepted by a simple insert, update or
copy. For the postgis 'geometry' type these are:
- Output -
binary: EWKB
ascii: HEXEWKB (EWKB in hex form)
- Input -
binary: EWKB
ascii: HEXEWKB|EWKT
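Since HEXEWKB is just EWKB written as hex digits, converting between
the two canonical forms is a plain hex encode/decode. A minimal Python
sketch follows; the function names are hypothetical and the choice of
uppercase digits on output is an assumption.

```python
def ewkb_to_hexewkb(ewkb: bytes) -> str:
    """Render binary EWKB as HEXEWKB text (uppercase digits assumed)."""
    return ewkb.hex().upper()


def hexewkb_to_ewkb(hexewkb: str) -> bytes:
    """Parse HEXEWKB text back into binary EWKB; accepts either letter case."""
    return bytes.fromhex(hexewkb)
```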
---8<----------------------------------------------------------
A few notes about the implementation.
Since our WKT/WKB parsers are really *just* EWKT/EWKB parsers
the implementation RELIES on the feature that users SHOULD NOT
RELY ON: the superset nature of EWKT/EWKB with respect to WKT/WKB.
This means that the representations are always parsed as EWKT/EWKB
and a second-step check is made for the presence of an embedded
SRID or higher dimensions in the resulting geometry.
When these are found, the OGC-strict routines currently WARN the
user that the corresponding extended input functions should be used
instead. We could make that an ERROR, completely *forbidding*
EWKB/EWKT in the OGC constructor functions.
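The second-step check can be sketched in Python as below. This is not
the Postgis code: it only inspects the top-level EWKB header (nested
geometries in a collection carry their own type words), the flag
values are the ones documented for later Postgis releases and are
assumed here, and both function names are hypothetical.

```python
import struct

# Assumed EWKB flag bits in the geometry type word.
EWKB_Z = 0x80000000
EWKB_M = 0x40000000
EWKB_SRID = 0x20000000


def ewkb_extensions(ewkb: bytes) -> list:
    """Return the extended features flagged in the top-level header."""
    if len(ewkb) < 5:
        raise ValueError("input too short for a WKB header")
    if ewkb[0] == 1:
        fmt = "<I"  # little-endian (NDR)
    elif ewkb[0] == 0:
        fmt = ">I"  # big-endian (XDR)
    else:
        raise ValueError("unknown byte-order marker")
    (type_word,) = struct.unpack_from(fmt, ewkb, 1)
    found = []
    if type_word & EWKB_SRID:
        found.append("SRID")
    if type_word & EWKB_Z:
        found.append("Z")
    if type_word & EWKB_M:
        found.append("M")
    return found


def ogc_strict_notice(ewkb: bytes):
    """Return a warning message for non-OGC input, or None if it is clean."""
    found = ewkb_extensions(ewkb)
    if not found:
        return None
    return "input carries " + ", ".join(found) + "; use the extended input function"
```

Turning the WARN into an ERROR would amount to raising an exception
where this sketch returns the message.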
Similarly, the output functions are just LWGEOM to EWKT/EWKB
converters, so the OGC-strict output procedures simply *drop* higher
dimensions and SRID before feeding the LWGEOM to them, obtaining the
subset of EWKT/EWKB that is WKT/WKB.
As long as WKB/WKT remain a subset of EWKT/EWKB we don't have to
worry about this, but should they lose this property we'll have
to implement OGC-strict-only parsers/unparsers.
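For the output side, the "drop SRID and higher dimensions" step can be
illustrated on serialized data. The Python sketch below handles only
little-endian points and assumes the EWKB flag layout documented for
later Postgis releases; the real routines work on the LWGEOM before
serializing, and point_ewkb_to_wkb() is a hypothetical name.

```python
import struct

# Assumed EWKB flag bits in the geometry type word.
EWKB_Z = 0x80000000
EWKB_M = 0x40000000
EWKB_SRID = 0x20000000
WKB_POINT = 1


def point_ewkb_to_wkb(ewkb: bytes) -> bytes:
    """Reduce a little-endian EWKB point to a 2d, SRID-less WKB point."""
    byte_order, type_word = struct.unpack_from("<BI", ewkb, 0)
    if byte_order != 1:
        raise ValueError("this sketch handles little-endian input only")
    if type_word & ~(EWKB_Z | EWKB_M | EWKB_SRID) != WKB_POINT:
        raise ValueError("this sketch handles points only")
    offset = 5
    if type_word & EWKB_SRID:
        offset += 4  # skip the embedded SRID
    # x and y always come first; any Z/M values after them are ignored.
    x, y = struct.unpack_from("<dd", ewkb, offset)
    return struct.pack("<BIdd", 1, WKB_POINT, x, y)
```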
Finally, it has to be noted that all these layers of processing
reduce the performance of input/output operations. Here is a
table of input and output functions ordered by estimated speed,
expressed as the number of layers of processing (fewer is faster).
- Input -
1 canon. ascii HEXEWKB
1 canon. ascii EWKT
1 GeomFromEWKT() - calls canon. ascii EWKT
2 canon. binary EWKB - converts to HEXEWKB first (can be improved)
2 GeomFromEWKB() - calls canon. binary EWKB
3 GeomFromWKB() - calls canon. binary EWKB, checks OGC conformance
3 GeometryFromText() - calls canon. ascii EWKT, checks OGC conformance
- Output -
1 canon. ascii HEXEWKB
1 asEWKT()
2 canon. binary EWKB - converts HEXEWKB to binary (can be improved)
2 asEWKB() - calls canon. binary EWKB
3 asBinary() - drops SRID and ZM, calls canon. binary EWKB
3 asText() - drops SRID and ZM, calls asEWKT()
There are also other issues that might be worth discussing, but I think
we already have enough meat on the fire (as we say in Italy).
Merry Christmas !
--strk;