[postgis-devel] Autofix and retry for GEOS overlay ops

Darafei "Komяpa" Praliaskouski me at komzpa.net
Sat Feb 10 01:21:45 PST 2018


Hi,

I was thinking about "mysql doing the right thing with disastrous results"
thesis for a couple of days.
I think geometry datatype is already different in this to all the other
builtin ones.

Let's think about date. In mysql you can store '0000-00-00 00:00:00' as
timestamp, and that just won't be parsed by postgres. That's defensive, and
makes sure you won't get an invalid tuple to database.
With geometry validity, it's not held: you can already insert a
butterfly-shaped polygon into database without questions, so it's already
not that level of defense.

PostGIS already does some of the overlay operations on invalid geometries
"disastrously invalidly" silently. Here's example:

select ST_Intersects('POLYGON((0 0, 1 1, 1 0, 0 1, 0 0), (2 2, 3 3, 4 0, 2
2))'::geometry, 'POINT(3 3)');

It's a hole outside shell, that could be another new shell after fixing
whatever way, but a point lying on that shell is not a part of geometry, as
polygon's box is only calculated from outer shell.
Many people would expect a point mentioned in the geometry definition to be
a part of that geometry anyway :)

So, we're already at the place where an invalid geometry can get not a
desired answer. I'm pretty sure there are a lot of cases where PostGIS
gives an answer for an operation on invalids, mostly on all the
short-circuits. Indexes are already wrong for some invaild cases.

So if we'd like to really do something about invalidity, a "strict mode"
can help - check validity on output of any geometry-returning function,
including input parsers. Such a change would make people who want to be
sure they get an exception in case something is doubtful happy. It will
also break some clever hacks like ST_Buffer(ST_Collect(),0) and make sure
you will be debugging GEOS every day for at least a year. :)

JSON path operations, according to this year's pgconf.ru talks, have two
options: `lax` relaxed mode, that autoconverts all the arrays of one
elements to element to object to whatever to match your description, and
vice versa, and `strict` mode that takes the exact element by the path you
provided. `lax` is default, which sets precedent of context-guessing
operations being OK in SQL.

чт, 8 февр. 2018 г. в 16:24, Sandro Santilli <strk at kbt.io>:

> On Sun, Feb 04, 2018 at 03:00:37PM +0000, Darafei "Komяpa" Praliaskouski
> wrote:
>
> > which boils down to "try intersecting, if failed, MakeValid the inputs,
> try
> > intersecting, fail if failed".
>
> I'd say "if failed *and* input is invalid".
> GEOS already checks the input upon hitting a topology exception,
> and reports that with something like this:
>
>     ERROR:  Error performing intersection: TopologyException:
>     Input geom 0 is invalid:
>     Self-intersection at or near point 5 5 at 5 5
>
> So maybe you could parse that exception message to tell if
> it's worth running a MakeValid.
>
> You still want to emit a WARNING because the caller might prefer
> to fix the inputs earlier in the process, for speed benefits
> (for example I think ClipByBox is documented for possibly returning
> invalid geometries, so a user would need to MakeValid those outputs)
>
> A GUC to control whether or not to activate some behavior would
> be useful (although painful to use/maintain). I'd also like to be
> able to request a NULL in preference to an exception, for example,
> as that's an easiest way to find out _which_ inputs are problematic
> when dealing with big tables with lots of data...
>
>
> --strk;
> _______________________________________________
> postgis-devel mailing list
> postgis-devel at lists.osgeo.org
> https://lists.osgeo.org/mailman/listinfo/postgis-devel
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.osgeo.org/pipermail/postgis-devel/attachments/20180210/cd05d2cc/attachment.html>


More information about the postgis-devel mailing list