[postgis-devel] Call for 1.4.2 and 1.5.1
Mark Cave-Ayland
mark.cave-ayland at siriusit.co.uk
Wed Feb 17 09:22:44 PST 2010
strk wrote:
> I guess we can debate this one.
> What options would you have to clean unclosed rings if you have them
> in a shapefile ? That's what you bought (let's assume). Maybe it is
> a single geometry out of a whole lot. You go importing and the whole
> set bails out due to an exception thrown by importer.
> Not helpful, is it ?
>
> After all isn't that invalidity comparable with other kind of
> invalidities like self-intersections and such ? Shall we refuse
> to import anything not valid ?
>
> I think it makes more sense to let invalid geoms in and expose methods
> to find invalids and clean them up.
Currently there are 4 checks in the parser: minimum number of points,
odd number of points (only for certain objects), polygon closure and
continuity. I feel that these are reasonable checks for getting
information into the database for the following reasons:
i) They ensure that polygon rings are closed (i.e. a valid area can be
calculated for any polygon within the database without modification) and
that all linestrings are continuous (i.e. a valid length can be
calculated for any linestring)
ii) They capture simple errors from beginners, e.g. polygons without
closed rings and discontinuous curves such as those below:
POLYGON((0 0, 0 1, 1 1, 1 0))
COMPOUNDCURVE(CIRCULARSTRING(0 0, 1 1, 1 0),(1 1, 0 1))
iii) By ensuring that odd numbers of points are used where required, it
guarantees that all curves containing arcs can be generated correctly.
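To make the four checks concrete, here is a minimal Python sketch applied
to the two examples from the list above. This is not the actual parser
code (which is written in C); the function names, the ParseCheckError
class, the ("arc" | "line", points) representation and the minimum point
counts are all assumptions made for illustration only.

```python
from typing import Optional, Sequence, Tuple

Point = Tuple[float, float]


class ParseCheckError(ValueError):
    """Hypothetical error type for a failed parser check."""


def check_min_points(points: Sequence[Point], minimum: int, what: str) -> None:
    # Check 1: minimum number of points.
    if len(points) < minimum:
        raise ParseCheckError(f"{what} requires at least {minimum} points")


def check_odd_points(points: Sequence[Point]) -> None:
    # Check 2: an arc is defined by 3 points and consecutive arcs share an
    # endpoint, so a circular string must have an odd number of points.
    if len(points) % 2 == 0:
        raise ParseCheckError("circular string requires an odd number of points")


def check_closed(ring: Sequence[Point]) -> None:
    # Check 3: the first and last point of a ring must coincide.
    if ring[0] != ring[-1]:
        raise ParseCheckError("polygon ring is not closed")


def check_polygon(rings: Sequence[Sequence[Point]]) -> None:
    for ring in rings:
        check_min_points(ring, 4, "polygon ring")  # assumed minimum
        check_closed(ring)


def check_compound_curve(parts: Sequence[Tuple[str, Sequence[Point]]]) -> None:
    # parts is a list of ("arc" | "line", points) pairs (assumed layout).
    previous_end: Optional[Point] = None
    for kind, points in parts:
        if kind == "arc":
            check_min_points(points, 3, "circular string")  # assumed minimum
            check_odd_points(points)
        else:
            check_min_points(points, 2, "linestring")  # assumed minimum
        # Check 4: each part must start where the previous one ended.
        if previous_end is not None and points[0] != previous_end:
            raise ParseCheckError("compound curve is not continuous")
        previous_end = points[-1]


def first_error(check, geometry) -> Optional[str]:
    """Return the message of the first failed check, or None if all pass."""
    try:
        check(geometry)
    except ParseCheckError as exc:
        return str(exc)
    return None


# POLYGON((0 0, 0 1, 1 1, 1 0))
open_polygon = [[(0, 0), (0, 1), (1, 1), (1, 0)]]
# COMPOUNDCURVE(CIRCULARSTRING(0 0, 1 1, 1 0),(1 1, 0 1))
broken_curve = [("arc", [(0, 0), (1, 1), (1, 0)]), ("line", [(1, 1), (0, 1)])]

print(first_error(check_polygon, open_polygon))
print(first_error(check_compound_curve, broken_curve))
```

The polygon fails the closure check because its ring ends at (1 0) rather
than back at (0 0), and the compound curve fails the continuity check
because the arc ends at (1 0) while the following line starts at (1 1).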
I don't think that these checks are unreasonable at all. If I
had paid money for data that didn't meet these criteria then I would
feel within my rights to complain that it wasn't fit for purpose. By
guaranteeing that we can always calculate area for rings and lengths for
linestrings, then that alone will help prevent triggering strange errors
in client applications.
In a way, I see it as similar to the UTF-8 validation in
PostgreSQL as it stands - at the end of the day, it can sometimes be a
pain when you find validation errors in your data during import but
fixing it during import will save you a heck of a lot of trouble further
down the line.
In terms of loosening the parser to allow already invalid geometries out
of the database, it's a shame we have to do this but it was discussed at
the time the parser was tidied up that this may be required. In
particular, some versions of GEOS were found to calculate incorrect Z
coordinates for some predicates.
So in summary, I feel that the current balance is about right: we catch
simple errors/typos that would cause the basic PostGIS calculation
functions to return incorrect results but allow everything else to
be imported. Hence topological problems that prevent use with the proper
spatial predicates can be detected with ST_IsValidReason() and friends,
and potentially corrected as PostGIS improves in this area.
ATB,
Mark.
--
Mark Cave-Ayland - Senior Technical Architect
PostgreSQL - PostGIS
Sirius Corporation plc - control through freedom
http://www.siriusit.co.uk
t: +44 870 608 0063
Sirius Labs: http://www.siriusit.co.uk/labs