[postgis-devel] Call for 1.4.2 and 1.5.1 (Handling of Invalid Geometries)

Martin Davis mbdavis at refractions.net
Wed Feb 17 11:36:05 PST 2010


I'm a bit late to the party here, so I might be missing some context.

I think it's useful to distinguish between structural invalidity and 
topological invalidity. 

Unclosed rings are structurally invalid.  This is trivial and fast to 
detect, and can be fixed automatically.
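
To illustrate why the structural check is cheap, here is a minimal Python sketch. It assumes a ring is simply a list of (x, y) tuples; `is_closed` and `close_ring` are hypothetical helper names used for illustration only, not PostGIS functions.

```python
def is_closed(ring):
    """A ring counts as closed when its first and last coordinates are identical."""
    return len(ring) > 0 and ring[0] == ring[-1]


def close_ring(ring):
    """Return a closed copy of the ring, appending the start point if needed."""
    if not ring or is_closed(ring):
        return list(ring)
    return list(ring) + [ring[0]]


# Hypothetical sample data: a square whose closing point was left off.
open_ring = [(0, 0), (4, 0), (4, 4), (0, 4)]
closed = close_ring(open_ring)
```

The check compares two coordinates and the fix appends one, regardless of how many vertices the ring has.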

Topological invalidity is expensive to detect and complex or impossible 
to fix automatically.
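
For a rough sense of the detection cost, the Python sketch below looks for one kind of topological problem only: two edges of a ring crossing each other. It is a naive pairwise comparison under stated assumptions (a ring is a closed list of (x, y) tuples; all function names are hypothetical), not how any particular library does it, and it ignores other cases such as touching or overlapping edges.

```python
def orient(a, b, c):
    """Sign of the cross product (b - a) x (c - a): 1 left turn, -1 right turn, 0 collinear."""
    v = (b[0] - a[0]) * (c[1] - a[1]) - (b[1] - a[1]) * (c[0] - a[0])
    return (v > 0) - (v < 0)


def segments_cross(p1, p2, q1, q2):
    """True when the two segments cross at a single point interior to both."""
    return (orient(p1, p2, q1) * orient(p1, p2, q2) < 0
            and orient(q1, q2, p1) * orient(q1, q2, p2) < 0)


def has_self_crossing(ring):
    """Naive O(n^2) test: compare every pair of edges of a closed ring."""
    edges = list(zip(ring, ring[1:]))
    for i in range(len(edges)):
        for j in range(i + 1, len(edges)):
            if segments_cross(*edges[i], *edges[j]):
                return True
    return False


# Hypothetical sample data: a valid square and a self-crossing "bowtie".
square = [(0, 0), (4, 0), (4, 4), (0, 4), (0, 0)]
bowtie = [(0, 0), (4, 4), (4, 0), (0, 4), (0, 0)]
```

Even this partial check has to look at pairs of edges rather than two endpoints, and once a crossing is found it is not obvious which repaired shape the data owner intended.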

I agree that users should be allowed to load topologically invalid data, 
but structurally invalid data seems like it's more trouble to handle 
than it's worth.  Is it not an option to simply always close rings on 
input to the database? 

If you allow structurally invalid rings into the database, I think you 
wind up with a situation where users never know if their data is OGC SFS 
compliant or not - which I think would do more harm than good to the 
reputation of PostGIS.

As a comparison, SDE is very heavy-handed and always enforces 
topological validity.  This is going too far, IMO.  Oracle doesn't 
enforce topological validity - I'm not sure what it does in the case 
of structural invalidity.




Chris Hodgson wrote:
> strk wrote:
> > I guess we can debate this one.
>
> I believe this is the crux of the problem:
>
> 1) Spatial data with various levels of invalidity exists. Whether it 
> comes from shapefiles or other formats, whether people paid for it or 
> got it free, people have it and they want to use PostGIS with it.
>
> 2) It is a barrier to entry for those users who have invalid data and 
> want to move to using PostGIS, if we cannot accept their invalid data. 
> It may be that the data has been working just fine for them in other 
> systems and they don't understand or immediately care that anything is 
> wrong with it.
>
> 3) Some people would like to reduce that barrier to entry by providing 
> a way to load the invalid data. There are two ways to do this: clean 
> it before it gets in the database, or allow the invalid data in and 
> clean it after it gets in.
>
> 4) There is some minimum level of validity required in order for data 
> to be properly stored in the database, so there will always be 
> some invalid cases which will not be able to be loaded into the database.
>
> I am personally of the belief that letting invalid stuff into the 
> database is a good thing. Even if we can't actually do anything with 
> it in the database because it is so invalid, at least it can be 
> stored. Given that it is possible for functions to create invalid 
> output, we already know that it is possible for invalid stuff to be 
> given to other functions, so we have to handle those cases anyway. So 
> we will already be able to output these invalid geometries; why not 
> allow them to be input?
>
> If we agree that PostGIS should provide tools to help clean invalid 
> geometries, it seems to make sense for a spatial database to provide 
> these tools inside the database, not as external loader-helper tools. 
> This makes even more sense given that we may need to clean up invalid 
> geometries that are actually created within PostGIS.
>
> This does mean that we have to accept that even basic calculations 
> such as length() and area() will potentially fail - however, the user 
> with invalid data must not have the expectation of being able to use 
> these functions, as this is a problem inherent with their data. I'd 
> rather tell them "invalid geometry; can't calculate area" than 
> "invalid geometry; can't load into database".
>
> My 2 cents worth.
>
> Cheers,
> Chris

-- 
Martin Davis
Senior Technical Architect
Refractions Research, Inc.
(250) 383-3022



