[postgis-devel] Call for 1.4.2 and 1.5.1

Mark Cave-Ayland mark.cave-ayland at siriusit.co.uk
Wed Feb 17 09:22:44 PST 2010


strk wrote:

> I guess we can debate this one.
> What options would you have to clean unclosed rings if you have them
> in a shapefile ? That's what you bought (let's assume). Maybe it is
> a single geometry out of a whole lot. You go importing and the whole
> set bails out due to an exception thrown by importer.
> Not helpful, is it ?
> 
> After all isn't that invalidity comparable with other kind of
> invalidities like self-intersections and such ? Shall we refuse
> to import anything not valid ?
> 
> I think it makes more sense to let invalid geoms in and expose methods
> to find invalids and clean them up.

Currently there are 4 checks in the parser: minimum number of points, 
odd number of points (only for certain objects), polygon closure and 
continuity. I feel that these are reasonable checks for getting 
information into the database for the following reasons:


i) They ensure that polygon rings are closed (i.e. a valid area can be 
calculated for any polygon within the database without modification) and 
that all linestrings are continuous (i.e. a valid length can be 
calculated for any linestring)

ii) They catch simple beginner errors, e.g. polygons whose rings are 
not closed and discontinuous curves such as those below:

	POLYGON((0 0, 0 1, 1 1, 1 0))
	COMPOUNDCURVE(CIRCULARSTRING(0 0, 1 1, 1 0),(1 1, 0 1))

iii) By ensuring that odd numbers of points are used where required, it 
guarantees that all curves containing arcs can be generated correctly.
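
As a rough illustration only, here is a small Python sketch of what the 
closure, odd-point and continuity checks amount to. This is not the 
PostGIS parser code; the function names and the list-of-tuples 
representation are invented for the example.

```python
from typing import List, Tuple

# Hypothetical representation: a point is an (x, y) tuple.
Point = Tuple[float, float]


def ring_is_closed(ring: List[Point]) -> bool:
    """Polygon closure: the first and last points of a ring coincide."""
    return len(ring) > 0 and ring[0] == ring[-1]


def has_odd_points(points: List[Point]) -> bool:
    """Each arc needs 3 points and consecutive arcs share an endpoint,
    so a circular string has 3, 5, 7, ... points."""
    return len(points) % 2 == 1


def is_continuous(segments: List[List[Point]]) -> bool:
    """Continuity: every non-empty segment starts where the previous ended."""
    return all(prev[-1] == nxt[0] for prev, nxt in zip(segments, segments[1:]))


# The two examples from point ii) above.
unclosed_ring = [(0, 0), (0, 1), (1, 1), (1, 0)]
compound = [[(0, 0), (1, 1), (1, 0)], [(1, 1), (0, 1)]]

print(ring_is_closed(unclosed_ring))             # ring lacks the closing 0 0
print(ring_is_closed(unclosed_ring + [(0, 0)]))  # same ring, closed
print(is_continuous(compound))                   # arc ends at 1 0, line starts at 1 1
```

The polygon fails because its ring never returns to 0 0, and the 
compound curve fails because the line segment starts at 1 1 while the 
preceding arc ends at 1 0.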


I don't think that these checks are unreasonable at all. If I had paid 
money for data that didn't meet these criteria then I would feel within 
my rights to complain that it wasn't fit for purpose. Guaranteeing that 
we can always calculate areas for rings and lengths for linestrings 
will by itself help prevent strange errors in client applications.

In a way, I see it as similar to the UTF-8 validation in PostgreSQL as 
it stands: it can be a pain when you find validation errors in your 
data during import, but fixing them at that point will save you a heck 
of a lot of trouble further down the line.

As for loosening the parser so that invalid geometries already stored 
in the database can still be read back out, it's a shame we have to do 
this, but it was discussed at the time the parser was tidied up that 
this might be required. In particular, some versions of GEOS were found 
to calculate incorrect Z coordinates for some predicates.

So in summary, I feel that the current balance is about right: we catch 
simple errors/typos that would cause the basic PostGIS calculation 
functions to return incorrect results, but allow everything else to be 
imported. Topological problems that prevent use with the proper spatial 
predicates can then be detected with ST_IsValidReason() and friends, 
and potentially corrected as PostGIS improves in this area.


ATB,

Mark.

-- 
Mark Cave-Ayland - Senior Technical Architect
PostgreSQL - PostGIS
Sirius Corporation plc - control through freedom
http://www.siriusit.co.uk
t: +44 870 608 0063

Sirius Labs: http://www.siriusit.co.uk/labs


