[postgis-users] shp2pgsql -I option

Paul Ramsey pramsey at refractions.net
Wed Apr 5 22:44:49 PDT 2006


Brent,

No, checking the actual output of the command, it creates the spatial  
index before loading the data... however, I am also using the -D  
flag, so the whole batch is committed at once, which hopefully means  
the index is batch created too, so no harm done...

shp2pgsql -D -I -i dra_tsa_clip.shp dra_tsa_clip | head -n 100 | less

BEGIN;
CREATE TABLE "dra_tsa_clip" (gid serial PRIMARY KEY,
"conf_id" int4,
"source" varchar(4),
"srcstate" varchar(19),
"resstate" varchar(10),
"road_id" int4,
"rd_seg_id" int4,
"region_id" int4,
"grid_code" int4);
SELECT AddGeometryColumn 
('','dra_tsa_clip','the_geom','-1','MULTILINESTRING',2);
CREATE INDEX "dra_tsa_clip_the_geom_gist" ON "dra_tsa_clip" using  
gist ("the_geom" gist_geome
try_ops);
COPY  
"dra_tsa_clip" ("conf_id","source","srcstate","resstate","road_id","rd_s 
eg_id","region_i
d","grid_code",the_geom) FROM stdin;

Unfortunately, when run without the -D flag, it still does the same  
thing... this could result in some quite slow data loads, for people  
who think that they are actually saving themselves time by using the - 
I flag :)

shp2pgsql -I -i dra_tsa_clip.shp dra_tsa_clip | head -n 100 | less

BEGIN;
CREATE TABLE "dra_tsa_clip" (gid serial PRIMARY KEY,
"conf_id" int4,
"source" varchar(4),
"srcstate" varchar(19),
"resstate" varchar(10),
"road_id" int4,
"rd_seg_id" int4,
"region_id" int4,
"grid_code" int4);
SELECT AddGeometryColumn 
('','dra_tsa_clip','the_geom','-1','MULTILINESTRING',2);
CREATE INDEX "dra_tsa_clip_the_geom_gist" ON "dra_tsa_clip" using  
gist ("the_geom" gist_geome
try_ops);


On 5-Apr-06, at 10:33 PM, Brent Wood wrote:

>
>
> --- Paul Ramsey <pramsey at refractions.net> wrote:
>
>> I just ran using the -I option for the first time, and to my
>> surprise, the utility seemed to create the index *before* loading the
>> data...
>
> Hi Paul,
>
> Thanks for the pointer.
>
> It only creates the primary key, which, being a unique index,  
> should be in
> place before any data is loaded, should it not?
>
> I can see why this is not really necessary if you can otherwise  
> guarantee the
> uniqueness, but I think it is good practice to ensure any indices  
> created to
> ensure data integrity (primary & foreign keys) are in place before  
> any data is
> inserted.
>
> Indices which are only created for performance reasons may well be  
> generated
> after the data is loaded, but I'm less comfortable about keys.
>
> Which means I (for one) am not unhappy about the perfromance hit  
> from creating
> key indices before inserting data :-)
>
> Cheers,
>
>   Brent
>
>>
>> Bluejay:~/Desktop pramsey$ shp2pgsql -D -i -I dra_tsa_clip.shp
>> dra_tsa_clip | psql
>> Shapefile type: Arc
>> Postgis type: MULTILINESTRING[2]
>> BEGIN
>> NOTICE:  CREATE TABLE will create implicit sequence
>> "dra_tsa_clip_gid_seq" for serial column "dra_tsa_clip.gid"
>> NOTICE:  CREATE TABLE / PRIMARY KEY will create implicit index
>> "dra_tsa_clip_pkey" for table "dra_tsa_clip"
>> CREATE TABLE
>>                           addgeometrycolumn
>> --------------------------------------------------------------------
>> public.dra_tsa_clip.the_geom SRID:-1 TYPE:MULTILINESTRING DIMS:2
>> (1 row)
>>
>> <... no pause here ...>
>> CREATE INDEX
>> <... long pause here while data loads ...>
>> COMMIT
>>
>> _______________________________________________
>> postgis-users mailing list
>> postgis-users at postgis.refractions.net
>> http://postgis.refractions.net/mailman/listinfo/postgis-users
>>
>
> _______________________________________________
> postgis-users mailing list
> postgis-users at postgis.refractions.net
> http://postgis.refractions.net/mailman/listinfo/postgis-users




More information about the postgis-users mailing list