[postgis-users] EMPTY geometries

Thu Apr 10 14:37:39 PDT 2003

Paul and I have been looking at PostGIS conformance, and I've been 
looking at empty geometries.

1. The OGC SF SQL spec doesn't seem to deal with empty geometries - it 
calls them "the empty set".  It appears that it would be fine with a 
single "NULL" geometry.

2. WKT is really loosey-goosey about EMPTY geometries.  All the base 
types can be empty (POINT, LINESTRING, MULTIPOLYGON, etc...), but also 
components of the types can be EMPTY.  For example:
MULTILINESTRING( EMPTY, (EMPTY, EMPTY, EMPTY) )
represents two linestrings - an empty one, and one made up of 3 empty 
points.

3. WKB isn't as loosey-goosey as WKT, but it is fairly free-form.  You cannot 
represent an EMPTY point in WKB.  You can have empty objects like 
linestrings with 0 points, or multilinestrings with 0 linestrings, or 
multilinestrings with empty linestrings inside.
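To make the WKB side concrete, here is a minimal Python sketch of what those empty objects look like as bytes.  It assumes the standard WKB layout (a 1-byte byte-order flag, a 32-bit type code, then a 32-bit element count) with type codes 2 for LineString and 5 for MultiLineString.  The function names are made up for illustration and little-endian is an arbitrary choice.

```python
import struct

WKB_LINESTRING = 2        # WKB type code for LineString
WKB_MULTILINESTRING = 5   # WKB type code for MultiLineString

def empty_linestring_wkb():
    # byte-order flag 1 (little-endian), type code, numPoints = 0
    return struct.pack("<BII", 1, WKB_LINESTRING, 0)

def multilinestring_wkb(linestrings):
    # byte-order flag, type code, numLineStrings, then each child's WKB
    header = struct.pack("<BII", 1, WKB_MULTILINESTRING, len(linestrings))
    return header + b"".join(linestrings)

# A multilinestring with 0 linestrings, and one holding an empty linestring
no_children = multilinestring_wkb([])
one_empty_child = multilinestring_wkb([empty_linestring_wkb()])
```

A WKB point has no count field (just the flag, the type code and the two coordinates), which is why there is no slot in which to say "zero points".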

Currently PostGIS does not allow empty geometries at all.  It uses the 
SQL-NULL to represent them.

----

I don't really see any advantage to supporting the nasty empty 
geometries.  They don't really add anything, and the specification says
that you cannot make empty geometries:

"The instantiable subclasses of Geometry defined in this specification 
are restricted to 0, 1 and two-dimensional geometric objects that exist 
in two-dimensional coordinate space."

Empty geometries do not have a dimension.

Also, the diagram (Figure 2.1) says that objects like LINESTRING have 2+ 
points (not 0, or 2+). And MULTILINESTRING has 1+ linestrings in it.

But, in keeping with the spirit of the spec, I see three ways of dealing 
with this:

A. Single empty geometry

Have a single empty geometry type.  Its WKT representation would be 
"GEOMETRYCOLLECTION( EMPTY )" and its WKB representation would also
be an empty geometrycollection.  Any WKT with an empty in it would be
converted to this type:

LINESTRING( EMPTY) -> GEOMETRYCOLLECTION( EMPTY )
LINESTRING(0 0, EMPTY, 1 1) -> GEOMETRYCOLLECTION( EMPTY )
MULTILINESTRING( (0 0, 1 1), EMPTY) -> GEOMETRYCOLLECTION( EMPTY )

* NB: a valid geometry was thrown away in the last case.  We might
       want this to return a MULTILINESTRING( (0 0, 1 1)), but that
       would take a bit of time to fix with the parser.
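The conversion rule above is simple enough to sketch in a few lines of Python.  This is only an illustration of the rule, not the PostGIS parser: the function name is hypothetical, and it just looks for the EMPTY keyword anywhere in the WKT text.

```python
import re

EMPTY_GC = "GEOMETRYCOLLECTION( EMPTY )"

def normalize_wkt(wkt):
    # Any EMPTY keyword, at any nesting level, collapses the whole
    # geometry to the single empty type (case-insensitive, whole word).
    if re.search(r"\bEMPTY\b", wkt, flags=re.IGNORECASE):
        return EMPTY_GC
    return wkt

examples = [
    "LINESTRING( EMPTY)",
    "LINESTRING(0 0, EMPTY, 1 1)",
    "MULTILINESTRING( (0 0, 1 1), EMPTY)",
]
results = [normalize_wkt(w) for w in examples]
```

All three examples come back as GEOMETRYCOLLECTION( EMPTY ), while WKT with no EMPTY in it passes through unchanged.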

Any GEOS function that returns an empty set would have its results
represented with this.

difference(g,g) -> GEOMETRYCOLLECTION( EMPTY )
intersection( disjoint geoms) -> GEOMETRYCOLLECTION( EMPTY )

The main problem with this is the loss of type information:

	LINESTRING( EMPTY) -> GEOMETRYCOLLECTION( EMPTY )
     The type of the geometry has changed from LINESTRING to a GC.
     This could cause problems with tables that have constraints
     on them to ensure they have a homogeneous type.
     This is fixable by having the constraint on the table be:
	geom IS NULL OR geometryType(geom) = 'LINESTRING' OR
         isEmpty(geom)

     GeometryType( <empty geometry>) -> GEOMETRYCOLLECTION
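The logic of that relaxed constraint, sketched in Python.  The Geom tuple is a hypothetical stand-in for a stored geometry (just a type name and an emptiness flag), and None plays the role of SQL NULL.

```python
from collections import namedtuple

# Hypothetical stand-in for a stored geometry value
Geom = namedtuple("Geom", ["type", "is_empty"])

def passes_linestring_constraint(geom):
    # Three-way check: NULL, or a LINESTRING, or the single empty geometry
    return geom is None or geom.type == "LINESTRING" or geom.is_empty

rows = [
    None,                                # NULL geometry
    Geom("LINESTRING", False),           # ordinary linestring
    Geom("GEOMETRYCOLLECTION", True),    # the single empty geometry
    Geom("POLYGON", False),              # wrong type, should be rejected
]
accepted = [passes_linestring_constraint(g) for g in rows]
```

Only the polygon row is rejected.  The empty geometry gets in even though its reported type is GEOMETRYCOLLECTION.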

A second problem is that the location of sub-geometries in MULTI* 
geometries will change:

GeometryN(MULTILINESTRING( (0 0, 1 1), EMPTY, (2 2, 3 3)), 2)
	should be LINESTRING EMPTY, but in our implementation
         it would either be an error, or LINESTRING (2 2, 3 3).
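A small Python sketch of the index shift, assuming the parser simply drops empty members (the variant mentioned in the NB above).  Linestrings are modelled as plain lists of (x, y) tuples and geometry_n is a hypothetical 1-based lookup, numbered the same way as the GeometryN example.

```python
def geometry_n(parts, n):
    # 1-based lookup, like GeometryN in the example above
    return parts[n - 1]

# MULTILINESTRING( (0 0, 1 1), EMPTY, (2 2, 3 3) ) as the user wrote it
original = [[(0, 0), (1, 1)], [], [(2, 2), (3, 3)]]

# What would be stored if empty members are dropped at parse time
stored = [part for part in original if part]

second_as_written = geometry_n(original, 2)   # the empty linestring
second_as_stored = geometry_n(stored, 2)      # now (2 2, 3 3)
```

Asking for element 3 of the stored version raises an IndexError, which corresponds to the "error" outcome.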

B. Typed empty geometries

This involves adding 7 new types (cf. postgis.h)
EMPTY_POINT
EMPTY_MULTIPOINT
EMPTY_LINESTRING
EMPTY_MULTILINESTRING
EMPTY_POLYGON
EMPTY_MULTIPOLYGON
EMPTY_GEOMETRYCOLLECTION

NB: EMPTY_POINT is not representable in WKB, so it would have
     to be converted to an EMPTY_MULTIPOINT.

Internal GEOS functions would only create EMPTY_GEOMETRYCOLLECTION.

The only real advantage of this is that users can create empty 
geometries of a known type (i.e. POINT EMPTY is different from LINESTRING 
EMPTY).
This isn't too much more work than the single-type version.
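Roughly what the typed version amounts to, as a Python sketch.  The enum below is purely illustrative: the member names follow the list above, but the values are just the WKT type names, not whatever constants would actually go into postgis.h.

```python
from enum import Enum

class EmptyType(Enum):
    # Values are WKT type names, chosen for illustration only
    EMPTY_POINT = "POINT"
    EMPTY_MULTIPOINT = "MULTIPOINT"
    EMPTY_LINESTRING = "LINESTRING"
    EMPTY_MULTILINESTRING = "MULTILINESTRING"
    EMPTY_POLYGON = "POLYGON"
    EMPTY_MULTIPOLYGON = "MULTIPOLYGON"
    EMPTY_GEOMETRYCOLLECTION = "GEOMETRYCOLLECTION"

def as_wkt(empty_type):
    # WKT keeps the type, e.g. "POINT EMPTY"
    return empty_type.value + " EMPTY"

def type_for_wkb(empty_type):
    # An empty point cannot be written as WKB, so it is promoted
    # to an empty multipoint on output.
    if empty_type is EmptyType.EMPTY_POINT:
        return EmptyType.EMPTY_MULTIPOINT
    return empty_type
```

So POINT EMPTY survives a WKT round trip, but comes back from a WKB round trip as MULTIPOINT EMPTY.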

C. Typed empty geometries with actual empty sub-types

This is very much like the WKB system.

Make geometries like LINESTRING valid with 0 points (representing 
"LINESTRING EMPTY").  Make MULTI* valid with 0 sub-objects.
Make an EMPTYPOINT type.

This means we're fully able to represent all the geometries except 
screwy things like "LINESTRING( EMPTY, 1 2, EMPTY, 3 3)".

In terms of work, "A" is fairly simple, "B" is about 3 times as much 
work, and "C" could take quite a while to ensure everything is 0-point 
and 0-object aware.
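To give a feel for what "0-point aware" means in practice, here is a hypothetical bounding-box routine in Python.  It is not PostGIS code; the point is only that every function that walks the points needs a guard like the first two lines, and has to decide what to return for the empty case.

```python
def linestring_bbox(points):
    # A 0-point linestring has no extent, so there is no box to return
    if not points:
        return None
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    return (min(xs), min(ys), max(xs), max(ys))

normal_box = linestring_bbox([(0, 0), (1, 1)])
empty_box = linestring_bbox([])
```

Without the guard, min() on an empty list raises a ValueError.  Multiply that by every function that touches points or sub-objects and you get the cost of "C".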

In terms of conformance, "A" seems like it's all we need.  The spec looks 
like it only requires some type of "empty set" geometry.  The other two 
methods are just there to be nice for the WKT and WKB specification. 
I'm loath to mess up the code to allow for empty objects that people will 
probably never use.

What do you think?  Anyone have any different impressions of the 
specification?  Anyone have any use for the funnier empty geometries?

dave