[postgis-users] shp2pgsql: Handling numeric field containing scientific notation
hustvedt
hustvedt at gmail.com
Mon Jun 16 23:04:41 PDT 2008
Hello,
There is what appears to be a slight issue with the .dbf part of a
shapefile generated by ArcEditor 9.0 or 9.1 that has a value in an
integer field of 10 billion or more if that shapefile is then used with
shp2pgsql. Attempting to load the resulting sql file results in a
casting error. One possible solution is to cast the field to numeric in
the sql output, before pgsql attempts to store it in a int8 field.
How common is this issue?
If a shapefile has an integer column with a value of 10 billion, some
versions of ERSI ArcEditor will store that into the .dbf as "1.00e+010".
As far as I can tell, that is an invalid string to store in a dBase III
file, since the only characters allowed in a Numeric field are
{+-.0123456789}.
shp2pgsql doesn't do any coversions on what is stored in the .dbf file,
so that gets put directly into the .sql file that is generated. It does
set the types of the fields in the sql file, of course, so that affects
the castings that pgsql is going to do. "1.00e+010" is a valid numeric
type, and that can be casted to an int8, with any necessary rounding
done. It is not a valid int8 value, even if it will fit into an int8
field. shp2pgsql deduces correctly that an int8 will be needed to store
the resulting number, but I think that is only based on the field length
from the header of the .dbf file.
One solution is that for any field that has a type of an integer, to add
an explicit cast to numeric in the sql file, then psql will be able to
handle this case. This is a simple solution, since it is just adding a
"::numeric" after any integer field value. Patching shp2pgsql and adding
a 'printf("::numeric);' is the easiest way to do that I can see.
Here is what the current code will tell psql to do, and the resulting error:
# select '1.00e+010'::integer;
ERROR: invalid input syntax for integer: "1.00e+010"
However, if that value is cast to ::numeric first, then it will work as
an int8:
# select '1.00e+010'::numeric;
numeric
-------------
10000000000
(1 row)
# select '1.00e+010'::numeric::int8;
int8
-------------
10000000000
(1 row)
The command line we use is "shp2pgsql -g the_geom -W UTF8 -s 4326 -c -N
skip SAPEFILE TABLE > SQL"
Running this sql code that shp2pgsql created generates the error,
'PGError: ERROR: invalid input syntax for integer: "1.00e+010" ':
CREATE TABLE "data_staging"."shp_121336819166268_20867" (gid serial
PRIMARY KEY,
"flag" int2,
"v1" varchar(2),
"v2" varchar(1),
"v3" varchar(3),
"v4" varchar(3),
"v5" varchar(2),
"v6" varchar(5),
"v6a" varchar(16),
"v7" varchar(50),
"v7a" int4,
"v8" varchar(45),
"v9" varchar(5),
"v10" varchar(6),
"v11" int8,
"v12" varchar(32),
"v13" varchar(2),
"v17" varchar(50),
"v18" varchar(50),
"address" varchar(45),
"city" varchar(32),
"state" varchar(2),
"zipcode" varchar(5),
"zip_p4" varchar(4),
"v25" varchar(9),
"zip_p4_a" varchar(10),
"v33" varchar(10),
"v34" int2,
"v38" int2,
"v39" int2,
"v40" int2,
"v41" int2,
"v42" int2,
"v43" int2,
"v44" int2,
"v45" int2,
"v46" int2,
"v47" int2,
"v52" int2,
"v53" int2,
"v54" int2,
"v55" int2,
"v56" int2,
"v57" int2,
"v58" int2,
"v59" int2,
"v60" int2,
"v61" int2,
"v62" int2,
"v62f" int2,
"v63" int2,
"v63f" int2,
"v64" int2,
"v64f" int2,
"v65" int4,
"v65f" int2,
"v66" int4,
"v66f" int2,
"v67" int4,
"v67f" int2,
"v68" int4,
"v68f" int2,
"v69" int4,
"v69f" int2,
"v70" int4,
"v70f" int2,
"v71" int4,
"v71f" int2,
"v72" int4,
"v72f" int2,
"v73" int4,
"v73f" int2,
"v74" int4,
"v74f" int2,
"v75" int4,
"v75f" int2,
"v76" int4,
"v76f" int2,
"v77" int4,
"v77f" int2,
"v78" int4,
"v78f" int2,
"v79" int4,
"v79f" int2,
"v80" int4,
"v80f" int2,
"v81" int4,
"v81f" int2,
"v82" int8,
"v82f" int2,
"v83" int2,
"v84" int8,
"v84f" int2,
"v85" int4,
"cp_pct" float8,
"filter__" int2,
"sizecat" float8,
"joyce_st" float8,
"longitude" numeric,
"latitude" numeric,
"georesult" varchar(11),
"georstclp" varchar(2),
"fid_1" numeric,
"correction" varchar(50),
"long_adj" numeric,
"lati_adj" numeric);
SELECT
AddGeometryColumn('data_staging','shp_121336819166268_20867','the_geom','4326','POINT',2);
INSERT INTO "data_staging"."shp_121336819166268_20867"
("flag","v1","v2","v3","v4","v5","v6","v6a","v7","v7a","v8","v9","v10","v11","v12","v13","v17","v18","address","city","state","zipcode","zip_p4","v25","zip_p4_a","v33","v34","v38","v39","v40","v41","v42","v43","v44","v45","v46","v47","v52","v53","v54","v55","v56","v57","v58","v59","v60","v61","v62","v62f","v63","v63f","v64","v64f","v65","v65f","v66","v66f","v67","v67f","v68","v68f","v69","v69f","v70","v70f","v71","v71f","v72","v72f","v73","v73f","v74","v74f","v75","v75f","v76","v76f","v77","v77f","v78","v78f","v79","v79f","v80","v80f","v81","v81f","v82","v82f","v83","v84","v84f","v85","cp_pct","filter__","sizecat","joyce_st","longitude","latitude","georesult","georstclp","fid_1","correction","long_adj","lati_adj",the_geom)
VALUES ('0','IL','2','024','001','02','60100','1420240010260100','ALBION
POLICE
DEPT','88','ALBION','17047',NULL,'1933','EDWARDS','03',NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,'9999','0','0','0','0','0','0','0','0','0','0','0','0','0','0','0','0','0','0','0','0','0','3','0','3','0','3','999999','9','3','0','0','0','0','0','0','0','0','0','0','0','3','0','0','0','2','3','0','3','0','3','88888','8','88888','8','88888','8','88888','8','88888','8','1.00e+010','9','0','1.00e+010','9','0','0.00','1','1.00','0.00','-88.056282000','38.377029000','N','N','1856.000000',NULL,'-8.80562820000e+001','38.377029000','SRID=4326;01010000009F56D11F9A0356C061527C7C42304340');
--
Anders Hustvedt
http://mapbuzz.com
More information about the postgis-users
mailing list