[postgis-devel] Call for 1.4.2 and 1.5.1 (Handling of Invalid Geometries)
Chris Hodgson
chodgson at refractions.net
Wed Feb 17 10:30:06 PST 2010
strk wrote:
> I guess we can debate this one.
I believe this is the crux of the problem:
1) Spatial data with various levels of invalidity exists. Whether it
comes from shapefiles or other formats, whether people paid for it or
got it free, people have it and they want to use PostGIS with it.
2) It is a barrier to entry for those users who have invalid data and
want to move to using PostGIS, if we cannot accept their invalid data.
It may be that the data has been working just fine for them in other
systems and they don't understand or immediately care that anything is
wrong with it.
3) Some people would like to reduce that barrier to entry by providing a
way to load the invalid data. There are two ways to do this: clean it
before it gets in the database, or allow the invalid data in and clean
it after it gets in.
4) There is some minimum level of validity required in order for data to
be properly stored in the database, so there will always be some
invalid cases that cannot be loaded into the database.
I am personally of the belief that letting invalid stuff into the
database is a good thing. Even if we can't actually do anything with it
in the database because it is so invalid, at least it can be stored.
Given that it is possible for functions to create invalid output, we
already know that it is possible for invalid stuff to be given to other
functions, so we have to handle those cases anyway. So we will already
be able to output these invalid geometries; why not allow them to be input?
If we agree that PostGIS should provide tools to help clean invalid
geometries, it seems to make sense for a spatial database to provide
these tools inside the database, not as external loader-helper tools.
This makes even more sense given that we may need to clean up invalid
geometries that are actually created within PostGIS.
This does mean we have to accept that even basic calculations such
as length() and area() may fail. However, a user with invalid data
should not expect to be able to use these functions, since the problem
is inherent in their data. I'd rather tell them "invalid geometry;
can't calculate area" than "invalid geometry; can't load into
database".
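
To make the trade-off concrete, here is a minimal Python sketch of the
"accept on load, fail on use" behaviour. It is not PostGIS code:
GeometryStore, is_simple_ring and InvalidGeometryError are hypothetical
names made up for illustration, and the check only looks for crossing
edges in a single ring, which is far less than real validity rules cover.

```python
from typing import List, Tuple

Point = Tuple[float, float]


class InvalidGeometryError(ValueError):
    """Raised when an operation needs a valid ring but did not get one."""


def _orient(a: Point, b: Point, c: Point) -> float:
    # Cross product of (b - a) and (c - a): > 0 left turn, < 0 right turn.
    return (b[0] - a[0]) * (c[1] - a[1]) - (b[1] - a[1]) * (c[0] - a[0])


def _properly_cross(p1: Point, p2: Point, p3: Point, p4: Point) -> bool:
    # True when segments p1-p2 and p3-p4 cross at a single interior point.
    d1, d2 = _orient(p3, p4, p1), _orient(p3, p4, p2)
    d3, d4 = _orient(p1, p2, p3), _orient(p1, p2, p4)
    return d1 * d2 < 0 and d3 * d4 < 0


def is_simple_ring(ring: List[Point]) -> bool:
    # Simplified check (an assumption, not the full validity rules):
    # the ring is closed, has at least 4 points, and no two
    # non-adjacent edges properly cross.
    if len(ring) < 4 or ring[0] != ring[-1]:
        return False
    edges = list(zip(ring, ring[1:]))
    n = len(edges)
    for i in range(n):
        for j in range(i + 2, n):
            if i == 0 and j == n - 1:
                continue  # first and last edge share the closing vertex
            if _properly_cross(*edges[i], *edges[j]):
                return False
    return True


class GeometryStore:
    """Hypothetical stand-in for a table of polygon rings."""

    def __init__(self) -> None:
        self._rows: List[List[Point]] = []

    def load(self, ring: List[Point]) -> int:
        # Accept the ring as-is: no validity check at load time.
        self._rows.append(list(ring))
        return len(self._rows) - 1

    def area(self, row_id: int) -> float:
        ring = self._rows[row_id]
        if not is_simple_ring(ring):
            # Fail at calculation time, not at load time.
            raise InvalidGeometryError("invalid geometry; can't calculate area")
        # Shoelace formula over consecutive vertex pairs.
        twice = sum(x1 * y2 - x2 * y1 for (x1, y1), (x2, y2) in zip(ring, ring[1:]))
        return abs(twice) / 2.0


store = GeometryStore()
square = store.load([(0, 0), (2, 0), (2, 2), (0, 2), (0, 0)])
bowtie = store.load([(0, 0), (2, 2), (2, 0), (0, 2), (0, 0)])  # self-intersecting
print(store.area(square))
try:
    store.area(bowtie)
except InvalidGeometryError as err:
    print(err)
```

Both rings load without complaint; only the area request on the
self-intersecting one is refused, and the stored row is still there to
be cleaned up later.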
My 2 cents worth.
Cheers,
Chris