[postgis-devel] Postgis "Split" with tolerance

Rémi Cura remi.cura at gmail.com
Mon Feb 24 08:13:55 PST 2014


Hey list, Sandro, Paul,
any news on this?


Short summary: *allow splitting a line by a point*!
Proposed change:
In /postgis/liblwgeom/lwgeom_geos_split.c:190, replace
"if ( dist > 0)   /* TODO: accept a tolerance ? */"
with
//@TOTEST @CHECK change by Remi-C : adding a small tolerance
vstol = ptarray_length_2d(lwline_in->points) / 1e14; //note : 1e12 would be better
if ( dist > vstol )   /* TODO: accept a tolerance ? */

This modification would allow a line to be split by a point in most cases
(see the test SQL files in my previous mail).

It works around a known numerical issue: a point computed to lie on a line
is rarely at a distance of exactly 0 from that line in floating point.

Cheers,

Rémi-C





2013-12-02 11:50 GMT+01:00 Rémi Cura <remi.cura at gmail.com>:

> Hey Paul,
> thanks for the insight!
>
> I've thought about this, and it is true that this could be a parameter.
>
> A double gives us a fixed number of significant digits (15 to 17). Some are
> spent on the coordinates (I'd say 10 at most), so the rest remain for
> precise computation.
> The problem is that the error we can allow (the epsilon) should depend on
> this. For example, if I use 10 digits for coordinates, only 5 remain, and
> the error has to fit in those 5 digits.
> But if I use 5 digits for coordinates, I could be much more precise and
> leave 10 digits for the error.
> Maybe the (advanced) user could change this.
>
> Here is what I propose, and what I'll enforce for my own work (the analogy
> is just an example):
> any point/line/poly should be precise up to 10 digits (2 points could be
> separated by 1000 km + 1 mm).
> Any function working on geometries should give a good answer on 10-digit
> geometries.
>
> Beyond those 10 digits, geometries are considered equal, except during
> computation (so we can use all the digits while computing).
>
> The remaining digits should be used for the epsilon and the like, so as to
> give the best possible answers.
>
>
> Please find the required test:
> the SQL query should return (true,true).
> 1.) The first true means that for 100 000 random lines, each with a random
> point on the line, every line has been split in 2.
> 2.) The second true means that for these random lines, none was split by a
> point farther than epsilon from the line.
> All tests are done on points/lines having at most 10 digits in their
> coordinates.
>
> Please note that for 1.) I didn't check that the split occurred at the
> right place, because I'm not testing how the split is computed but when it
> is computed.
>
> This is base code where you can choose the number of digits in the
> coordinates, the epsilon, and the number of tests to run.
> The random generator is seeded so the test can be reproduced.
> This could be reused to test other functions more thoroughly, and could be
> improved (lines are always segments, orientation is not random).
>
> I also have a function which, for a given line A and point P, gives the
> closest point to P on A with a given number of digits. A bit like
> ST_SnapToGrid, but for a point on a line (in concept).
>
> Cheers,
> Rémi-C
>
>
>
> 2013/11/30 Sandro Santilli <strk at keybit.net>
>
>> On Fri, Nov 29, 2013 at 09:26:07AM -0800, Paul Ramsey wrote:
>> > Just to be a loose cannon in general: we have hard-wired tolerances
>> > all over the code base, just grep for EPSILON and see what pops out.
>> > We've hardly been ascetics about abusing tolerances where it suits our
>> > purposes. The proposal for a "split that 'works'" fits in with our
>> > general historical approach to tolerances that make things "work".
>>
>> Yeah, and I'll confess: I've injected some of that myself too,
>> recently. See r10973 ...
>>
>> What can I say Remi, give me a testcase and let's add to the mud :)
>>
>> --strk(I wanna be Anarchy);
>>
>> >
>> > P.
>> >
>> > On Fri, Nov 29, 2013 at 5:18 AM, Sandro Santilli <strk at keybit.net> wrote:
>> > > On Fri, Nov 29, 2013 at 10:48:54AM +0100, Rémi Cura wrote:
>> > >> ST_Split(line,point) is currently broken.
>> > >> This is quite a neutral statement, because given a random point on a
>> > >> random line, split won't do anything (p_correct_answer<1/10k )!
>> > >>
>> > >> The only workarounds I can think of are:
>> > >> _ snap the line to the point: it implies moving the line, which is not
>> > >> an option if you do it several times with different points, and is more
>> > >> abstractly a very bad idea
>> > >
>> > > This is what you'd be doing anyway by implementing a tolerance.
>> > > That is, the line would be moved anyway.
>> > >
>> > >> _ change the point to a very small line in the perpendicular direction
>> > >> of the line we want to split, then use split(line,line). Again not a
>> > >> very good option because you would possibly split the line in more than
>> > >> one point (and anyway split(line,line) also has precision issues).
>> > >
>> > > Still, this is a legit option.
>> > >
>> > >> I understand PostGIS is a big project and we have to be conservative,
>> > >> that is "we don't break a fine function with changes bringing unknown
>> > >> consequences for unproven benefits",
>> > >> but this function is broken, and we could change it to a function
>> > >> working most of the time!
>> > >
>> > > Yes, we could change it by accepting a tolerance value.
>> > > What's wrong with that ?
>> > >
>> > >> *Pros of this change :*
>> > >>
>> > >> _a line will be split by a point correctly (p_error < 10^-6)
>> > >> _no heavy changes in code or methods
>> > >>
>> > >> *Cons of this change :*
>> > >>
>> > >> _the line would be split by any point close enough (lengthofline/10^12),
>> > >> that is, all points less than 1 micrometer away from a line 1000 km long!
>> > >
>> > > PostGIS doesn't deal with units, so you don't know if those are
>> > > micro meters or micro peta meters.
>> > > You might be splitting a milli meter line by a point...
>> > >
>> > > All I'm saying is I'd like precision/tolerance to be _EXPLICIT_.
>> > > Particularly important when you're about to run a _set_ of operations
>> > > and you want them to be _consistent_ as per precision.
>> > >
>> > >> *I would also like to add ST_Split(geom,geom,tolerance).*
>> > >> In fact it could be an easy wrapper around ST_Split(geom,geom) if this
>> > >> function was working. (something like: if the point is DWithin, split
>> > >> the closest part of the line)
>> > >
>> > > Please go ahead. Such a function would be accepted.
>> > >
>> > > NOTE: TopoGeo_addPoint does take a tolerance, and does use ST_Split after
>> > >       snapping the line to a DWithin point to implement it.
>> > >       Having it available natively in C would speed that up.
>> > >
>> > > --strk;
>> > >
>> > >  ()  ASCII ribbon campaign        - against html e-mail
>> > >  /\  http://www.asciiribbon.org   - against proprietary attachments
>> > > _______________________________________________
>> > > postgis-devel mailing list
>> > > postgis-devel at lists.osgeo.org
>> > > http://lists.osgeo.org/cgi-bin/mailman/listinfo/postgis-devel