[postgis-users] shp2pgsql: Handling numeric field containing scientific notation

hustvedt hustvedt at gmail.com
Mon Jun 16 23:04:41 PDT 2008


Hello,

There is what appears to be a slight issue with the .dbf part of a 
shapefile generated by ArcEditor 9.0 or 9.1 that has a value in an 
integer field of 10 billion or more if that shapefile is then used with 
shp2pgsql. Attempting to load the resulting sql file results in a 
casting error. One possible solution is to cast the field to numeric in 
the sql output, before pgsql attempts to store it in a int8 field.

How common is this issue?



If a shapefile has an integer column with a value of 10 billion, some 
versions of ERSI ArcEditor will store that into the .dbf as "1.00e+010". 
As far as I can tell, that is an invalid string to store in a dBase III 
file, since the only characters allowed in a Numeric field are 
{+-.0123456789}.

shp2pgsql doesn't do any coversions on what is stored in the .dbf file, 
so that gets put directly into the .sql file that is generated. It does 
set the types of the fields in the sql file, of course, so that affects 
the castings that pgsql is going to do. "1.00e+010" is a valid numeric 
type, and that can be casted to an int8, with any necessary rounding 
done. It is not a valid int8 value, even if it will fit into an int8 
field. shp2pgsql deduces correctly that an int8 will be needed to store 
the resulting number, but I think that is only based on the field length 
from the header of the .dbf file.


One solution is that for any field that has a type of an integer, to add 
an explicit cast to numeric in the sql file, then psql will be able to 
handle this case. This is a simple solution, since it is just adding a 
"::numeric" after any integer field value. Patching shp2pgsql and adding 
a 'printf("::numeric);' is the easiest way to do that I can see.


Here is what the current code will tell psql to do, and the resulting error:
# select '1.00e+010'::integer;
ERROR:  invalid input syntax for integer: "1.00e+010"



However, if that value is cast to ::numeric first, then it will work as 
an int8:
# select '1.00e+010'::numeric;
   numeric  
-------------
 10000000000
(1 row)

# select '1.00e+010'::numeric::int8;
    int8    
-------------
 10000000000
(1 row)


The command line we use is "shp2pgsql -g the_geom -W UTF8 -s 4326 -c -N 
skip SAPEFILE TABLE > SQL"
Running this sql code that shp2pgsql created generates the error, 
'PGError: ERROR:  invalid input syntax for integer: "1.00e+010" ':


CREATE TABLE "data_staging"."shp_121336819166268_20867" (gid serial 
PRIMARY KEY,
"flag" int2,
"v1" varchar(2),
"v2" varchar(1),
"v3" varchar(3),
"v4" varchar(3),
"v5" varchar(2),
"v6" varchar(5),
"v6a" varchar(16),  
"v7" varchar(50),
"v7a" int4,
"v8" varchar(45),
"v9" varchar(5),
"v10" varchar(6),
"v11" int8,
"v12" varchar(32),
"v13" varchar(2),
"v17" varchar(50),
"v18" varchar(50),
"address" varchar(45),
"city" varchar(32),
"state" varchar(2),
"zipcode" varchar(5),
"zip_p4" varchar(4),
"v25" varchar(9),
"zip_p4_a" varchar(10),
"v33" varchar(10),
"v34" int2,
"v38" int2,
"v39" int2,
"v40" int2,
"v41" int2,
"v42" int2,
"v43" int2,
"v44" int2,
"v45" int2,
"v46" int2,
"v47" int2,
"v52" int2,
"v53" int2,
"v54" int2,
"v55" int2,
"v56" int2,
"v57" int2,
"v58" int2,
"v59" int2,
"v60" int2,
"v61" int2,
"v62" int2,
"v62f" int2,
"v63" int2,
"v63f" int2,
"v64" int2,
"v64f" int2,
"v65" int4,
"v65f" int2,
"v66" int4,
"v66f" int2,
"v67" int4,
"v67f" int2,
"v68" int4,
"v68f" int2,
"v69" int4,
"v69f" int2,
"v70" int4,
"v70f" int2,
"v71" int4,
"v71f" int2,
"v72" int4,
"v72f" int2,
"v73" int4,
"v73f" int2,
"v74" int4,
"v74f" int2,
"v75" int4,
"v75f" int2,
"v76" int4,
"v76f" int2,
"v77" int4,
"v77f" int2,
"v78" int4,
"v78f" int2,
"v79" int4,
"v79f" int2,
"v80" int4,
"v80f" int2,
"v81" int4,
"v81f" int2,
"v82" int8,
"v82f" int2,
"v83" int2,
"v84" int8,
"v84f" int2,
"v85" int4,
"cp_pct" float8,
"filter__" int2,
"sizecat" float8,
"joyce_st" float8,
"longitude" numeric,
"latitude" numeric,
"georesult" varchar(11),
"georstclp" varchar(2),
"fid_1" numeric,
"correction" varchar(50),
"long_adj" numeric,
"lati_adj" numeric);
SELECT 
AddGeometryColumn('data_staging','shp_121336819166268_20867','the_geom','4326','POINT',2);

INSERT INTO "data_staging"."shp_121336819166268_20867" 
("flag","v1","v2","v3","v4","v5","v6","v6a","v7","v7a","v8","v9","v10","v11","v12","v13","v17","v18","address","city","state","zipcode","zip_p4","v25","zip_p4_a","v33","v34","v38","v39","v40","v41","v42","v43","v44","v45","v46","v47","v52","v53","v54","v55","v56","v57","v58","v59","v60","v61","v62","v62f","v63","v63f","v64","v64f","v65","v65f","v66","v66f","v67","v67f","v68","v68f","v69","v69f","v70","v70f","v71","v71f","v72","v72f","v73","v73f","v74","v74f","v75","v75f","v76","v76f","v77","v77f","v78","v78f","v79","v79f","v80","v80f","v81","v81f","v82","v82f","v83","v84","v84f","v85","cp_pct","filter__","sizecat","joyce_st","longitude","latitude","georesult","georstclp","fid_1","correction","long_adj","lati_adj",the_geom) 

VALUES ('0','IL','2','024','001','02','60100','1420240010260100','ALBION 
POLICE 
DEPT','88','ALBION','17047',NULL,'1933','EDWARDS','03',NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,'9999','0','0','0','0','0','0','0','0','0','0','0','0','0','0','0','0','0','0','0','0','0','3','0','3','0','3','999999','9','3','0','0','0','0','0','0','0','0','0','0','0','3','0','0','0','2','3','0','3','0','3','88888','8','88888','8','88888','8','88888','8','88888','8','1.00e+010','9','0','1.00e+010','9','0','0.00','1','1.00','0.00','-88.056282000','38.377029000','N','N','1856.000000',NULL,'-8.80562820000e+001','38.377029000','SRID=4326;01010000009F56D11F9A0356C061527C7C42304340');

--
Anders Hustvedt
http://mapbuzz.com



More information about the postgis-users mailing list