[postgis-devel] Inconsistencies in text output

Martin Davis mtnclimb at gmail.com
Fri Apr 10 15:10:09 PDT 2020


+1 to correctly respecting the limit on Max DECIMAL Digits

-2 to changing the semantics to Max TOTAL Digits

Reasons:
- The current documentation and design intent is to provide the ability to
specify Max DECIMAL Digits, so changing this would be a breaking change
- It is not useful to specify the Max TOTAL Digits, since without knowing
the magnitude of the numbers this leads to unknown and possibly varying
precision of output numbers
- All text formats are designed to easily accommodate varying-length
numbers, so there is no reason to specify the length of numeric strings
(Max TOTAL Digits).
- By contrast, there is clear utility in specifying Max DECIMAL Digits.
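
To make the second point concrete, here is a minimal Python sketch. It is
not PostGIS code; both function names are hypothetical and exist only for
this illustration, and the "total digits" rule is a simplified reading that
always prints the integer part in full. It shows that a fixed Max TOTAL
Digits yields a different number of decimals depending on magnitude, while
Max DECIMAL Digits does not.

```python
def format_max_decimal(value: float, max_decimal_digits: int) -> str:
    """Fixed-point text with at most max_decimal_digits after the point.

    Hypothetical helper: trailing zeros (and a dangling point) are removed.
    """
    text = f"{value:.{max_decimal_digits}f}"
    if "." in text:
        text = text.rstrip("0").rstrip(".")
    # Values that round to zero should not keep a minus sign.
    return "0" if text == "-0" else text


def format_max_total(value: float, max_total_digits: int) -> str:
    """Text limited to max_total_digits (integer + decimal) digits.

    Hypothetical helper: the integer part is always printed in full, so the
    decimals get whatever budget is left over.
    """
    integer_digits = len(str(abs(int(value))))
    decimals = max(max_total_digits - integer_digits, 0)
    return format_max_decimal(value, decimals)


for v in (1.23456789, 123456.23456789):
    # Same settings, two magnitudes: compare how many decimals survive.
    print(format_max_decimal(v, 4), format_max_total(v, 8))
```

With 8 total digits the small value keeps 7 decimals and the large one only
2, whereas 4 decimal digits gives 4 decimals in both cases.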

It's unfortunate if implementing true Max DECIMAL Digits requires more code
& test changes than the alternative, but the most important thing is to
provide stable and useful semantics for users going forward.

Martin



On Fri, Apr 10, 2020 at 9:33 AM <rmrodriguez at carto.com> wrote:

> PRs with the changes (there are also some changes in the cunit
> structure to ease the diff showing):
>
> - Enforcing that we output the amount of DECIMAL digits that the user
> has requested as precision:
> https://github.com/postgis/postgis/pull/554
> - Consider the precision as number of TOTAL (integer + decimal)
> digits: https://github.com/postgis/postgis/pull/555
>
> The first one does exactly what's documented in the functions
> themselves, that is if you request precision you get
> 1.111111100000000018 (before you got 1.1111111), but the diff is
> really massive (2435 changes just in regress/). The second one is
> closer to the current behaviour (35 changes in regress/).
>
> In light of these tests, I propose changing the documentation to
> declare the precision as the number of total digits (before and after
> the point) and enforce that for all functions for 3.1+, i.e. this PR
> (https://github.com/postgis/postgis/pull/555) plus documentation
> changes.
>
> On Fri, Apr 10, 2020 at 4:22 PM <rmrodriguez at carto.com> wrote:
> >
> > Also note that it would be possible to limit the total number of digits
> > (integer plus decimal) instead, which could be preferable, but that's a
> > breaking change to all the extension functions (since they are
> > requesting 15 decimal digits, not 15 total digits).
> >
> > On Fri, Apr 10, 2020 at 4:07 PM <rmrodriguez at carto.com> wrote:
> > >
> > > Hi everyone,
> > >
> > > Lately I've been trying to improve the performance of output functions
> > > and one of the areas where I got a massive (5x) improvement was
> > > anything that output text (ST_AsText, ST_AsGeoJSON and so on) but the
> > > implementation that I introduced for 3.1 has multiple hacks to keep
> > > the output almost exactly the same as it was for 3.0.
> > >
> > > Although most output functions have a `maxdecimaldigits` parameter
> > > that represents `maximum number of decimal digits after floating
> > > point` there are multiple cases where this isn't respected. Some
> > > examples:
> > >
> > > ```
> > > Select ST_AsText(ST_MakePoint(123456789012.1234567890123, 0), 4);
> > > POINT(123456789012.123459 0)
> > > ```
> > > The number should have at most 4 decimal digits, but it has 6.
> > >
> > > ```
> > > Select ST_AsText(ST_MakePoint(0, 92114.013999999996), 15);
> > > POINT(0 92114.014)
> > >
> > > Select ST_AsText(ST_MakePoint(0, 92114.013999999996), 20);
> > > POINT(0 92114.013999999995576)
> > > ```
> > >
> > > This number has a significant digit at the 12th decimal place, but
> > > it's not shown if you request 15 decimal digits. It is shown, however,
> > > if you request 20 decimal digits.
> > >
> > > On the other hand, requesting 20 decimal digits doesn't work with
> > > ST_AsGeoJSON because the precision is capped internally:
> > > ```
> > > Select ST_AsGeoJSON(ST_MakePoint(0, 92114.013999999996), 20);
> > > {"type":"Point","coordinates":[0,92114.014]}
> > > ```
> > >
> > > So I want to propose some changes to unify the behaviour of all
> > > functions that output coordinates. They aren't big changes, but the
> > > output of multiple regression tests will change, so 3.0 and 3.1
> > > output will differ in some cases too. I think it's worth it, as we
> > > get rid of bugs and make the output of all functions uniform.
> > >
> > > The rules would be like this:
> > > - For (absolute) values under FP_TOLERANCE, we keep returning '0'.
> > > - For big numbers (absolute value bigger than 1E15), we keep the
> > > current behaviour and maintain the decimal digits to 8 - length of the
> > > decimal part (it's odd but I don't see the need to change it).
> > > - For smaller numbers:
> > >   - The integer part is always printed in full (no changes).
> > >   - The decimal part is always printed up to precision digits
> > > (removing trailing zeros).
> > >   - The decimal digit precision is capped between 0 and 20. This is
> > > more than enough to guarantee round trip (double -> text -> same
> > > double) and it allows us to know the max size of a double (so we can
> > > keep the static allocations) as sign + 15 integer digits + "." + 20
> > > decimal digits + the terminating NUL.
> > >
> > > I'm going to draft a PR with the changes, but any comment is more than
> > > welcome before I push a change like this.
> > >
> > > --
> > > Raúl Marín Rodríguez
> > > carto.com