[Gdal-dev] OGR returns wrong floating values for shapefiles (and integer as real, in error)

Tue Oct 24 05:43:20 EDT 2006

On Tue, 24 Oct 2006, Maciej Sieczka wrote:

> Hi,
> 
> ogrinfo (and other apps based on OGR, e.g. OpenEV, QGIS) returns wrong
> floating point values when querying my shapefiles. E.g.:
> 
> $ ogrinfo -al -q streams.shp
> 
> Layer name: streams
> OGRFeature(streams):523
>   CAT (Real) =         484
>   LCAT (Real) =          73
>   Z (Real) =      101.583309999999997
>   Z_BREACH (Real) =      101.583309999999997
>   Z_BREACH1 (Real) =      100.583309999999997
>   LENGTH (Real) =        2.036246000000000
>   LINESTRING (598549.144524969975464 5677309.376777020283043,598550.0
> 5677311.224603090435266)
> 
> 
> After opening the dbf in oocalc 2.03, I can see the values should
> respectively be:
> 
> CAT	  484
> LCAT	  73
> Z	  101.58331
> Z_BREACH  101.58331
> Z_BREACH1 100.58331
> LENGTH	  2.036246
> 
> Why the spurious "09999999997" in case of Z, Z_BREACH, Z_BREACH1 in
> OGR? Note that, interestingly, LENGTH is OK though.

Not spurious, just two different decimal "views" of the same underlying 
floating-point value; see e.g. David Goldberg (1991), What Every Computer 
Scientist Should Know About Floating-Point Arithmetic, ACM Computing 
Surveys, 23/1, 5-48, also available via 
http://docs.sun.com/source/806-3568/ncg_goldberg.html.
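
To make the two "views" concrete, here is a minimal Python sketch (not 
what OGR does internally). It assumes the DBF text values 101.58331 and 
2.036246 are parsed into IEEE doubles, and that the listing above prints 
15 decimal places, which is inferred from the output rather than checked 
in the OGR source. The helper name fixed15 is made up for illustration.

```python
from decimal import Decimal

def fixed15(x):
    """Format x with 15 decimal places (assumed to match the listing above)."""
    return format(x, ".15f")

z = 101.58331       # Z as written in the DBF text
length = 2.036246   # LENGTH as written in the DBF text

# Exact decimal expansion of the binary double nearest to 101.58331.
print(Decimal(z))

# Fixed 15-decimal view: the stored value lies slightly below 101.58331,
# so the tail shows up as ...09999999997.
print(fixed15(z))        # 101.583309999999997

# LENGTH also lies slightly below 2.036246, but the difference is beyond
# the 15th decimal, so rounding hides it.
print(fixed15(length))   # 2.036246000000000

# Shortest string that reads back to the same double.
print(repr(z))           # 101.58331
```

So LENGTH only looks "OK" because its error falls beyond the digits 
printed, not because it is stored any more exactly than Z.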

> 
> Moreover, CAT and LCAT are not Real numbers. They are integers. Why
> are they reported as real?

That will depend on the functions reading the underlying DBF; in several 
shapefiles I see either all reals or a mixture. It may be that when the 
precision of an "integer" field is non-zero, it is taken as real?

In R:

library(rgdal)
ogrInfo("streams", "streams")$iteminfo

says:

$name
[1] "CAT"       "LCAT"      "Z"         "Z_BREACH"  "Z_BREACH1" "LENGTH"   

$precision
[1] 2 2 2 2 2 2

$length
[1] 11 11 24 24 24 24

> t1 <- readOGR("streams", "streams")
OGR data source with driver: ESRI Shapefile 
Source: "streams", layer: "streams"
with  1  rows and  6  columns
> str(as(t1, "data.frame"))
'data.frame':   1 obs. of  6 variables:
 $ CAT      : num 484
 $ LCAT     : num 73
 $ Z        : num 102
 $ Z_BREACH : num 102
 $ Z_BREACH1: num 101
 $ LENGTH   : num 2.04

R will not print 24 significant digits, but with 22 digits:

> print(as(t1, "data.frame"), digits=22)
  CAT LCAT                       Z                Z_BREACH
1 484   73 101.5833099999999973306 101.5833099999999973306
                Z_BREACH1                  LENGTH
1 100.5833099999999973306 2.036245999999999778396

So your first issue is simply that binary floating-point numbers cannot 
represent most decimal fractions exactly, and the application may or may 
not shave off the resulting fuzz when printing.
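
As an illustration of how an application might shave the fuzz, the 
following Python sketch formats a double with at most 15 significant 
digits, which is about what a double can carry reliably. The function name 
shave and the choice of 15 digits are assumptions for the example, not 
anything taken from OGR or oocalc.

```python
def shave(x, digits=15):
    """Format x with at most `digits` significant digits, dropping trailing zeros."""
    return format(x, f".{digits}g")

for value in (484.0, 101.58331, 100.58331, 2.036246):
    text = shave(value)
    # Reading the shaved text back gives the identical double, so nothing
    # is lost by printing the shorter form for these values.
    print(text, float(text) == value)
```

For the values in this thread the shaved strings are the same as those 
shown by the spreadsheet.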

On the second issue, have a look for example at 
http://shapelib.maptools.org/dbf_api.html, which suggests that integer 
fields should have zero precision (no decimal places).
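
To check what width and decimal count the DBF actually declares for each 
field, one can read the field descriptors straight from the file header. 
The sketch below assumes the plain dBase III layout (32-byte file header, 
32-byte descriptors ending with a 0x0D byte). The function names are 
hypothetical, and the demo header is synthetic: the widths 11 and 24 are 
taken from the R output above, while the decimal counts are made up.

```python
import struct

def read_dbf_fields(data):
    """Return (name, type, width, decimals) for each field in a dBase III header."""
    # Bytes 8-9 of the file header hold the header length (little-endian).
    (header_len,) = struct.unpack_from("<H", data, 8)
    fields = []
    offset = 32  # descriptors start right after the 32-byte file header
    # Each descriptor is 32 bytes; a 0x0D byte terminates the array.
    while offset + 32 <= header_len and data[offset] != 0x0D:
        name = data[offset:offset + 11].split(b"\x00", 1)[0].decode("ascii")
        ftype = chr(data[offset + 11])   # 'N' = numeric, 'C' = character, ...
        width = data[offset + 16]        # field width in bytes
        decimals = data[offset + 17]     # declared decimal places
        fields.append((name, ftype, width, decimals))
        offset += 32
    return fields

def make_demo_header(specs):
    """Build a synthetic header (illustration only) from (name, type, width, decimals)."""
    header = bytearray(32)
    struct.pack_into("<H", header, 8, 32 + 32 * len(specs) + 1)
    body = bytearray()
    for name, ftype, width, decimals in specs:
        desc = bytearray(32)
        desc[0:len(name)] = name.encode("ascii")
        desc[11] = ord(ftype)
        desc[16] = width
        desc[17] = decimals
        body += desc
    return bytes(header + body + b"\r")

demo = make_demo_header([("CAT", "N", 11, 0), ("Z", "N", 24, 15)])
for field in read_dbf_fields(demo):
    print(field)
```

On a real file, something like 
read_dbf_fields(open("streams.dbf", "rb").read()) would show whether CAT 
and LCAT are declared with zero decimals.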

Roger

> 
> Is any of the issues fixed in newer GDAL? I've been using 1.3.2 CVS
> 2006-07-24.
> 
> The shapefile is attached if needed.
> 
> Maciek
> 
> 

-- 
Roger Bivand
Economic Geography Section, Department of Economics, Norwegian School of
Economics and Business Administration, Helleveien 30, N-5045 Bergen,
Norway. voice: +47 55 95 93 55; fax +47 55 95 95 43
e-mail: Roger.Bivand at nhh.no