[GRASS5] potential problems in d.vect.thematic when using a column of type real in postgresql

Daniel Calvelo Aros dcalvelo at minag.gob.pe
Thu Sep 15 13:23:57 EDT 2005


From: Moritz Lennert <mlennert at club.worldonline.be>
Sent: Wed, 14 Sep 2005 20:29:03 +0200
> 
> Hello,
> 
> I just discovered a slight problem in d.vect.thematic, but I'm not
> sure whether d.vect.thematic should/could be altered for this, or
> whether a simple warning should be inserted in the man page.
> 
> When using a column that is defined as 'real' in postgresql, the
> results of the comparisons are more or less arbitrary, since, as the
> postgresql manual puts it:
> 
> "The data types real and double precision are inexact,
> variable-precision numeric types", meaning that "comparing two
> floating-point values for equality may or may not work as expected".

That's a general problem with floating-point arithmetic. 

The common viable solution is to define "equality" as "very small 
difference", e.g. instead of a=b compare abs(a-b) to something small (say 1e-
10) or a/b to something close to one. 

If the values are stored as floating point but are supposed to be fixed-
point (e.g. currency) then 1) you may run into heavy trouble and 2) you may 
compare int(a*r) to int(b*r) where r is 10^precision. 

In GRASS, there are currently no functions in expressions for every DBMI 
driver, thus the only possible rewrite for 'a=b' would be:

'a/b < 1+1e-10 and b/a < 1+1e-10'

which might run into division by zero errors and so on.

> Now there seem to be two possible solutions for that in postgresql:
> 
> 1) quote the constant you are comparing your real value with, i.e.
> (within psql):
> 
> grass=> select * from espvie where e_=81.58;
>  grands_sec | pop | icm | e_ | cat | e2
> - ------------+-----+-----+----+-----+----
> (0 lignes)
> 
> grass=> select * from espvie where e_='81.58';
>  grands_sec |  pop  |   icm    |  e_   | cat |  e2
> - ------------+-------+----------+-------+-----+-------
>  23103      | 13600 | 0.722814 | 81.58 |  51 | 81.58
> (1 ligne)
> 
> Using this solution would mean changing the lines constructing the
> sqlwhere variable from
> 
> sqlwhere="$GIS_OPT_column $mincomparison $rangemin and
> $GIS_OPT_column<=$rangemax"
> 
> to
> 
> sqlwhere="$GIS_OPT_column $mincomparison '$rangemin' and
> $GIS_OPT_column<='$rangemax'"
> 
> But I don't know what the effect of this would be on other drivers
> (such as dbf). And don't have the time to test this just now.

Try it. DBF driver can be adjusted to deal with that, since expression 
calculations perform type conversion. I'd say it works currently. The 
problem is that there is no guarantee that C's conversion will produce the 
proper value giving the expected result. We might try to be extremely smart 
and detect an equality testing for reals and then use an approximate 
equality. Seems dubious, but some packages actually do that, knowinf what 
the machine epsilon is.

> 2) use the 'numeric' data type:
> 
> grass=> \d espvie
>                Table «public.espvie»
>   Colonne   |         Type         | Modificateurs
> - ------------+----------------------+---------------
>  grands_sec | character varying(8) |
>  pop        | real                 |
>  icm        | real                 |
>  e_         | real                 |
>  cat        | integer              |
>  e2         | numeric              |
> 
> grass=> select * from espvie where e2=81.58;
>  grands_sec |  pop  |   icm    |  e_   | cat |  e2
> - ------------+-------+----------+-------+-----+-------
>  23103      | 13600 | 0.722814 | 81.58 |  51 | 81.58
> (1 ligne)
> 
> This solution would imply a simple warning in the man page.

I´d say a warning is in order. In the sql notes, preferably.

> This problem is obviously wider than just d.vect.thematic. d.vect is
> affected by it as well, and I imagine any command using a where
> clause...
> 
> Any suggestions ?
> 
> Moritz





More information about the grass-dev mailing list