[GRASS-dev] Re: [GRASS GIS] #1161: g.region and r.info decimal issue when using grass python libs

Sun Oct 3 16:15:08 EDT 2010

#1161: g.region and r.info decimal issue when using grass python libs
-------------------------+--------------------------------------------------
  Reporter:  isaacullah  |       Owner:  grass-dev@…              
      Type:  defect      |      Status:  closed                   
  Priority:  normal      |   Milestone:  6.4.1                    
 Component:  Python      |     Version:  6.4.0                    
Resolution:  invalid     |    Keywords:                           
  Platform:  All         |         Cpu:  All                      
-------------------------+--------------------------------------------------

Comment(by glynn):

 Replying to [comment:5 cmbarton]:

 > Thanks for the explanation. The difference between the g.region
 representation to the shell and the information in the python dictionary
 does seem a result of lossy conversion (e.g., 4.9998993900000004).

 How did you determine what is really in the Python dictionary? Bear in
 mind that displaying the value in decimal typically performs another lossy
 conversion, and Python's decimal formatting behaviour is known to be
 suboptimal (i.e. it will typically use too many decimal places). E.g.:
 {{{
 >>> x = 5.1
 >>> x
 5.0999999999999996
 >>> str(x)
 '5.1'
 >>> repr(x)
 '5.0999999999999996'
 }}}

 Note that the longer version is probably closer to the exact decimal
 representation of the binary floating-point value, while the shorter
 version is probably the shortest decimal value which would convert to the
 given binary floating-point value.
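
 As a rough illustration of the difference between the two forms, the
 sketch below prints the exact decimal expansion of the stored binary
 value next to a 17-digit and a default rendering. It assumes Python 2.7
 or 3.x, where Decimal() accepts a float directly; on those versions
 repr() already returns the shortest round-tripping string, so the
 transcript above will look different there.

```python
from decimal import Decimal

x = 5.1

# Exact decimal expansion of the binary double that actually stores 5.1.
exact = Decimal(x)

# 17 significant digits are always enough to identify a double uniquely.
seventeen = '%.17g' % x

# Default rendering; on Python 2.7 / 3.1+ this is the shortest string
# that converts back to the same double.
short = repr(x)

print(exact)
print(seventeen)
print(short)

# Both decimal strings convert back to the very same binary value.
assert float(seventeen) == x
assert float(short) == x
```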

 Or, more relevant to the original report:
 {{{
 >>> north = 4293588.60267
 >>> north
 4293588.6026699999
 >>> str(north)
 '4293588.60267'
 >>> repr(north)
 '4293588.6026699999'
 >>> '%f' % north
 '4293588.602670'
 >>> '%.5f' % north
 '4293588.60267'
 }}}

 > Maybe there is no solution to this, but it seems that g.region and
 grass.region() *ought* to be returning exactly the same values. At least,
 without knowing what you have explained, the normal assumption is that
 these would match.

 g.region returns strings; grass.script.region() returns numbers
 (specifically, Python "float"s, which on any modern system are IEEE
 double-precision floating-point values). If you're going to use those
 values for anything, they will be converted to floating-point at some
 point, and it's a safe bet that you will get exactly the same result
 regardless of what performs the conversion.
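
 A minimal sketch of that point: parsing shell-style key=value text with
 float() gives the same double as writing the number as a Python literal.
 The sample text and the parse_region() helper are hypothetical stand-ins
 for illustration; they are not the GRASS library's implementation.

```python
# Hypothetical sample of shell-style region output (values are made up,
# apart from the northing quoted in the ticket).
sample_output = "n=4293588.60267\ns=4285000.5\ne=600000.25\nw=590000.125"


def parse_region(text):
    """Turn key=value lines into a dict of floats (illustrative helper)."""
    region = {}
    for line in text.splitlines():
        key, _, value = line.partition('=')
        region[key.strip()] = float(value)
    return region


region = parse_region(sample_output)

# The string and the literal denote the same decimal number, so a
# correctly rounded conversion yields the same double for both.
print(region['n'] == 4293588.60267)
```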

 > If we have Java (or perhaps our Python scripts) round grass.region()
 values to the maximum number of significant digits output to the shell, do
 you think we could be assured that the values would match exactly? If so,
 what about making this an optional argument (i.e, match g.region shell
 output) for grass.region(), grass.raster_info(), and other Python library
 functions that return region boundaries?

 Given that coordinates are limited by the circumference of the earth,
 values formatted with either %.15g or %.8f should convert to
 floating-point without loss (i.e. adjacent decimal values should have
 distinct floating-point representations), so if you format as decimal
 with the correct number of digits, you should get the original value
 back. This has to be done when you convert to decimal, though; you can't
 "tag" floating-point values with formatting options.

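 The round trip described above can be checked with a short sketch. It
 assumes coordinates in metres no larger than roughly the equatorial
 circumference (about 40075017 m); the helper name round_trips() is
 hypothetical.

```python
def round_trips(text, fmt):
    """True if text -> float -> fmt reproduces the same decimal number."""
    value = float(text)
    return float(fmt % value) == float(text) and (fmt % value) == text


# The northing from the original report, as the shell would print it.
north_text = '4293588.60267'
north = float(north_text)

# %.15g drops trailing zeros, so the original text comes back unchanged.
print('%.15g' % north)

# %.8f pads with zeros but still denotes the same decimal number.
print('%.8f' % north)

# Near the assumed upper bound, adjacent %.8f values are still distinct
# doubles, and formatting recovers the text that was parsed.
a = '40075017.00000001'
b = '40075017.00000002'
print(float(a) != float(b), round_trips(a, '%.8f'), round_trips(b, '%.8f'))
```
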
 If you're comparing floating-point values, you should be comparing to
 within some tolerance. I'd suggest ten microns to allow for %.6f (which is
 the default for %f if no precision is given).
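
 A sketch of such a comparison, assuming map units are metres so that ten
 microns is 1e-5; nearly_equal() and the sample value are hypothetical:

```python
TOLERANCE = 1e-5  # ten microns, assuming the map units are metres


def nearly_equal(a, b, tol=TOLERANCE):
    """Compare two coordinates to within an absolute tolerance."""
    return abs(a - b) <= tol


# A value as held in Python, and the same value after a trip through the
# default '%f' formatting (six decimal places).
from_python = 4293588.6026712345
from_shell = float('%f' % from_python)

print(from_shell == from_python)              # exact comparison fails
print(nearly_equal(from_shell, from_python))  # tolerant comparison passes
```

 On Python 3.5 and later, math.isclose(a, b, rel_tol=0.0, abs_tol=1e-5)
 expresses the same test.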

 On x86, it's impractical to perform exact comparisons for most floating
 point values, as the x87 FPU uses 80 bits ("long double") in its
 registers but only 64 bits once the value is stored to memory as a
 "double", which means that "x = y; if (x == y) ..." can be false.

-- 
Ticket URL: <http://trac.osgeo.org/grass/ticket/1161#comment:6>
GRASS GIS <http://grass.osgeo.org>