[postgis-devel] AutoFix Again

Darafei "Komяpa" Praliaskouski me at komzpa.net
Tue Jul 24 14:21:19 PDT 2018


First things first: invalidity is mostly about different algorithms, with
different optimizations, being allowed to interpret differently which set
of points a geometry covers, given its representation. All the parameters
should be about choosing which interpretation is preferred in case of
ambiguity.

As I understand it, a whole new concept is needed to handle this nicely:
dataset properties.

The closest thing we currently have is the SRID: a number in every geometry
that says how you should interpret it. Right now it is limited to how
coordinates relate to space, and is stored as a foreign key encoded in some
bits of the geometry field. It could also hold a tolerance, and an ordered
list of fixups that should be applied to invalid geometries first.

Different handling of invalid geometries may be needed depending on the
history of the dataset:

 - did it come from directly reading role= tags in relations of
OpenStreetMap data? Then a hole outside the shell may mean a non-closed
shell in a partial dataset, and you need to drop all such holes rather than
convert them all into new shells, as sometimes happens with big
border-crossing forests.

 - did it come from some stupid constructor that just dumps all the rings
as shells? Then you may need to perform shell-hole mapping once again.

 - did it come from some old hole-less software? Then overlapping edges
should be removed and converted into holes, or multi-geometries.

 - was it pen-digitized? Then you likely want all self-intersections simply
cleaned out of your dataset, not converted into the super-small holes that
appear at sharp angles. What was the resolution of the digitizing pad?
Maybe it should go into the tolerance of each geometry upon creation.
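
A minimal sketch of the "dataset properties" idea, in plain Python with no
PostGIS involved. Everything in it is hypothetical: DatasetProfile, its
tolerance and fixups fields, and the two fixup functions are illustrative
names, not an existing API. A ring is just a list of (x, y) tuples, and the
profile carries an ordered list of fixups chosen from the dataset's history.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple

Ring = List[Tuple[float, float]]
Fixup = Callable[[Ring], Ring]


def close_ring(ring: Ring) -> Ring:
    """Append the first point if the ring is not closed."""
    if ring and ring[0] != ring[-1]:
        return ring + [ring[0]]
    return ring


def drop_repeated_points(ring: Ring) -> Ring:
    """Remove consecutive duplicate points."""
    out: Ring = []
    for pt in ring:
        if not out or out[-1] != pt:
            out.append(pt)
    return out


@dataclass
class DatasetProfile:
    """Hypothetical per-dataset record, looked up by id the way an SRID is."""
    srid: int
    tolerance: float = 0.0
    fixups: List[Fixup] = field(default_factory=list)

    def repair(self, ring: Ring) -> Ring:
        # Apply the fixups in the order the dataset's history calls for.
        for fixup in self.fixups:
            ring = fixup(ring)
        return ring


# Assumed example: a pen-digitized dataset with a 0.5-unit pad resolution.
pen_digitized = DatasetProfile(
    srid=4326,
    tolerance=0.5,
    fixups=[drop_repeated_points, close_ring],
)

raw = [(0.0, 0.0), (4.0, 0.0), (4.0, 0.0), (4.0, 3.0)]
fixed = pen_digitized.repair(raw)
```

The order of the list matters: a profile for a different source could put
other fixups first, which is the whole point of keeping it per dataset.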

I have not yet seen a user story where someone needs an invalid geometry to
become NULL. It sounds like C-bound thinking to me, so I would like to
learn about such a case.

But these are all edge cases. There are a lot of cases of "trivial
invalidity" in the wild:

 - ST_Collect used instead of ST_Union. Convert the pixels of a raster into
boxes, collect the boxes, try Intersects on the result, whoops -
TopologyException. Touching buildings, slightly overlapping landcover
polygons - and you can't check spatial relations anymore. Cases like this
are quite simple, and I can't yet think of a way to interpret them
incorrectly for things like Intersects. IMO, these obvious cases shouldn't
fail.

 - ST_Simplify, ST_SnapToGrid, ST_QuantizeCoordinates: all of these are
about losing some data up to a tolerance. You have a valid geometry, you
apply one of them - poof - TopologyException.

 - Non-closed rings, repeated points, touching rings - all of these go into
the "trivial" category too.

For trivial things, I would expect robust software to "just work". How we
achieve it, whether via autofix, or by making GEOS more robust to such
cases, or by inventing complex things like ST_SimplifyPreserveTopology that
silently replace ST_Simplify, doesn't matter much from the user's
perspective.
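
To make the ST_SnapToGrid case above concrete, here is a rough sketch in
plain Python, without GEOS or PostGIS. The helpers snap_to_grid and
signed_area are illustrative stand-ins, not the real functions, and the
1.0 grid size is an assumed value. A thin but valid triangle is snapped and
its vertices become collinear, so the ring no longer encloses any area.

```python
import math
from typing import List, Tuple

Ring = List[Tuple[float, float]]


def snap_to_grid(ring: Ring, size: float) -> Ring:
    """Round every coordinate to the nearest multiple of `size`."""
    return [
        (math.floor(x / size + 0.5) * size, math.floor(y / size + 0.5) * size)
        for x, y in ring
    ]


def signed_area(ring: Ring) -> float:
    """Shoelace formula for a closed ring (first point == last point)."""
    return 0.5 * sum(
        x1 * y2 - x2 * y1 for (x1, y1), (x2, y2) in zip(ring, ring[1:])
    )


# A thin but perfectly valid triangle.
sliver = [(0.0, 0.0), (10.0, 0.0), (5.0, 0.4), (0.0, 0.0)]
snapped = snap_to_grid(sliver, 1.0)

area_before = signed_area(sliver)
area_after = signed_area(snapped)

# After snapping, the apex (5.0, 0.4) lands on (5.0, 0.0): the ring is
# degenerate even though nothing "wrong" was done to the input.
collapsed = area_before > 0 and area_after == 0
```

Nothing about the input was ambiguous here, which is why this kind of
result belongs in the "trivial" bucket.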

On Tue, 24 Jul 2018 at 16:55, Paul Ramsey <pramsey at cleverelephant.ca> wrote:

> Just trolling tickets and reading/thinking, was thinking that the
> autofix question falls into two parts:
>
> - having a whole query interrupted by a single row problem is
> sometimes not desired
> - what *is* desired in that case is quite variable
>
> This is something for discussion perhaps at the sprint, but the
> problem of "policy" is going to keep coming up...
>
> - is a given behaviour a policy or a parameter?
>
> For example, does it make more sense to have
>
> set postgis.geos_exception_action = return_null;
> set postgis.geos_exception_action = attempt_repair;
> set postgis.geos_exception_action = fatal_error;
>
> Or, should that kind of thing be a parameter to each affected function?
> The same question will obtain for things like tolerance. If we add a
> tolerance module, is it a global setting, or is it a function
> parameter? Or, heaven help us, is it both?
> The whole question is further complicated by the issues we have had in
> the past with interactions between GUC and upgrades.
>
> Anyways, I feel like this stuff needs to be written out a little bit
> more, but maybe I'm just ignorant and it's already written out.
>
> P.

-- 
Darafei Praliaskouski
Support me: http://patreon.com/komzpa