[postgis-devel] Inconsistencies in text output

rmrodriguez at carto.com rmrodriguez at carto.com
Fri Apr 17 09:15:26 PDT 2020


On Fri, Apr 17, 2020 at 5:36 PM Martin Davis <mtnclimb at gmail.com> wrote:
> Took a quick look at the first PR (with the large number of changes).  There seem to be a lot of changes which change the name of the test function, but which do not change the expected output.

Yes. I'm sorry, but to make my life easier I replaced the string
comparison macro with the one that outputs both strings when they
differ, so there are many changes under cunit/ that didn't change any
output. I'll try to break the commits down into something that makes
more sense.

> Is that right?  If so, does that mean this change is not as large as mentioned above?

No. I only mentioned regress tests (those under regress/) in my
comment above, so the change is as large and impactful as mentioned.


On Fri, Apr 17, 2020 at 5:43 PM Paul Ramsey <pramsey at cleverelephant.ca> wrote:
>
> It's not so much that I'm not worried, as that I feel like it's a wide enough change to be a little beyond my comprehension... I know we recently changed default JSON output precision to use fewer digits to save space on the wire. The idea being that the primary use case for JSON was read-only shipping to web clients. Would this change push the number of JSON digits in the default back up?

No, but also yes. This change enforces that you get as many decimal
digits as you request (dropping trailing zeros). Since ST_AsGeoJSON
has a default of 9, it will output 9; other functions default to 15,
so they will output 15 decimal digits.
Why do I say yes, then? Because before, the number of digits output
was a mix of the digits before the decimal point and the digits after
it, so most of the time you weren't getting the requested number of
decimal digits.

> The example shows that it's possible that harmonizing all formats might be counter-productive to previously agreed optimizations that made sense on a use-case basis.

It harmonizes the behaviour, but the defaults can still be different
for different functions. One of the differences between formats was
that some had a hard limit on how many digits they would output, even
if you manually requested more, for no good reason. For example, you
couldn't get 20 decimal digits out of ST_AsGeoJSON even if you wanted
that many to guarantee full round-trip capability.


> This is a change that alters the outputs from what was previously expected. Are there usages you can envision where the change might surprise/break the end user application? The fact that the new behaviour is better/more consistent will not make any existing user happier if their app breaks. Is it possible we'll end up with those cases?

I don't expect to break anything apart from integration tests, but,
and this is important, I expect the size of the output to grow by
quite a bit, as more digits are printed now that weren't before. The
problem is that I don't think anybody really needs that many digits,
at least not enough to make it the default.

On Fri, Apr 17, 2020 at 5:46 PM Martin Davis <mtnclimb at gmail.com> wrote:

> One thing that occurred to me is that the apparent semantic of the default output precision is not that useful.  Right now, at least according to the manual, the default output precision is explicitly specified as 15 decimal digits.  This is not useful, since it takes no account of the actual significant digits in the number being output.
>
> More useful IMO is that the default (if the precision is not provided) is to output a number with say 15 *significant* digits (i.e. including all digits in the number).  This means of course that the number of *decimal places* can vary between numbers, depending on their magnitude.  But if the user wanted a fixed number of decimal places they could have specified it explicitly.
>
> This means that the default does not drop (or add) any digits.  Also, I believe this matches the semantics of most programming language floating point output routines.
>
> And is it the case that this is how the routines work right now?


Not exactly. What we used to have (and more or less still do) is a
fixed maximum number of digits (decimal or not) plus the requested
number of decimal digits. Depending on whether there is enough room
(which depends on how big the number is), we would reduce the number
of decimal digits dynamically.

You can see the old implementation here:
https://github.com/postgis/postgis/blob/72fc4cd802b0b4a69c2e0e6a0ac37f01315ef83c/liblwgeom/lwprint.c#L502

The other proposal, which limits the total number of digits via the
precision parameter, is much closer to this behaviour (and basically
changes only the cases where we had extra artificial limits, like
ST_AsGeoJSON, ST_AsKML and so on).

On Fri, Apr 17, 2020 at 5:57 PM Martin Davis <mtnclimb at gmail.com> wrote:

> I don't understand this observation.  What number of decimal digits did you request?  There are 18 decimals in the first number, which seems like a lot, unless you explicitly requested that?
>
> If you are using the default output precision, then the second number makes it look like the code is outputting "full FP precision", as opposed to the stated 15 as per the manual.  This is reasonable and desirable IMO.

That wasn't a good example. I'll take one from the changes in the PR:
- Before: POINT(128.10466 130.94133)
- After: POINT(128.104659999999996 130.941329999999994)

Since ST_AsText requests 15 decimal digits, you now get 10 extra
characters for the same number.

A similar example:
`1656318.45` -> `1656318.449999999953434`, where you now get an extra
13 decimal digits.


-- 
Raúl Marín Rodríguez
carto.com

