[Gdal-dev] Content length field mismatch in shapefiles
Roger Bivand
Roger.Bivand at nhh.no
Sat Apr 29 14:23:30 EDT 2006
I have a question about shapefiles. Geolytics seems to provide US
subscribers with shapefiles in which the content length stored in the
*.shx is 6 (decimal) above the content length in the *.shp, and 4 (decimal)
above what it should be (checked by creating a new *.shp and *.shx in
shapelib). This throws shapelib, usually on the final geometry. I am
generalising from a small sample, but the user who contacted me reported
needing special treatment for the Geolytics files found at his
university when he tried to read them using shapelib-based software.
Has anyone ever heard of this? The files will read into ArcGIS. In R,
read.shp() and read.shx() in the shapefiles package use only native R
binary reads, and can read the files sequentially because they don't try
to do random access on the *.shp. ArcGIS seems to spend more time than
usual for files of that complexity, but gets round the problem. v.in.ogr
in GRASS says that no geometry is available for one DBF record, but
processes all but the last geometry.
The Geolytics problem seems to be that the length values in the *.shx file
don't agree with those in the *.shp. ESRI say "The content length stored
in the index record is the same as the value stored in the main file
record header", but for a sample file:
> library(shapefiles)
> geolytics <- read.shp("jw_wacounty.shp")
> geolytics_content.length <- sapply(geolytics$shp, function(x) x$content.length)
> geolytics_content.length
[1] 382 542 726 3574 750 846 398 806 1550 1438 878 646 902 590 960
[16] 710 2190 534 2374 582 982 854 438 2446 750 158 2390 414 1422 430
[31] 998 342 1782 1094 254 574 1096 1182 1558
> geoshx <- read.shx("jw_wacounty.shx")
> geoshx$index[,2]
[1] 388 548 732 3580 756 852 404 812 1556 1444 884 652 908 596 966
[16] 716 2196 540 2380 588 988 860 444 2452 756 164 2396 420 1428 436
[31] 1004 348 1788 1100 260 580 1102 1188 1564
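For anyone wanting to check a file without the shapefiles package, here is
a minimal sketch in base R. It assumes the layout given in the ESRI
technical description: a 100-byte file header in both files, 8-byte *.shx
index records (offset, content length) and 8-byte *.shp record headers
(record number, content length), all big-endian 32-bit integers counted in
16-bit words. It also assumes the *.shx offsets themselves are correct.
compare_content_lengths() is a hypothetical helper name, not part of any
package.

```r
# Compare the content length in each *.shx index record with the
# content length in the matching *.shp record header.
compare_content_lengths <- function(shp_path, shx_path) {
  shx <- file(shx_path, "rb")
  on.exit(close(shx), add = TRUE)
  shp <- file(shp_path, "rb")
  on.exit(close(shp), add = TRUE)

  # Number of index records: (file size - 100-byte header) / 8 bytes each
  n <- (file.size(shx_path) - 100) %/% 8
  seek(shx, where = 100)
  idx <- matrix(readBin(shx, "integer", n = 2 * n, size = 4, endian = "big"),
                ncol = 2, byrow = TRUE)  # columns: offset, content length

  shp_len <- integer(n)
  for (i in seq_len(n)) {
    # Offsets are in 16-bit words, so multiply by 2 to get bytes
    seek(shp, where = idx[i, 1] * 2)
    hdr <- readBin(shp, "integer", n = 2, size = 4, endian = "big")
    shp_len[i] <- hdr[2]  # hdr[1] is the record number
  }

  data.frame(record = seq_len(n),
             offset = idx[, 1],
             shx_length = idx[, 2],
             shp_length = shp_len,
             difference = idx[, 2] - shp_len)
}
```

Calling compare_content_lengths("jw_wacounty.shp", "jw_wacounty.shx") should
return one row per record. In a file that follows the ESRI description, the
difference column would be zero throughout.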
I was sent the sample file by a user unable to read it into R using the
shapelib-based packages, but because it is Geolytics, I can't post it. I
can ask for permission to email a copy.
Roger
--
Roger Bivand
Economic Geography Section, Department of Economics, Norwegian School of
Economics and Business Administration, Helleveien 30, N-5045 Bergen,
Norway. voice: +47 55 95 93 55; fax +47 55 95 95 43
e-mail: Roger.Bivand at nhh.no