[postgis-users] shp2pgsql -I option

Gregory S. Williamson gsw at globexplorer.com
Thu Apr 6 01:09:41 PDT 2006


FWIW, I would strongly endorse an option to create the spatial index last, if at all (sometimes more than one shape file is getting loaded, would be my reasoning to the "not at all" case). Not that it is overwhelming to edit the SQL file and move the index and constraint creation.

My initial timings seem to show that it is a lot faster to load data, create the unique constraint and the geometry index than it is to load data with both indices existant. I can try to get some numbers if anyone cares enough.

I can see an argument for creating the unique key constraint first, although I generally defer it (partly because the data being loaded is usually under my control and more sanitized).

Greg Williamson
DBA
GlobeXplorer LLC


-----Original Message-----
From:	postgis-users-bounces at postgis.refractions.net on behalf of Paul Ramsey
Sent:	Wed 4/5/2006 10:44 PM
To:	PostGIS Users Discussion
Cc:	
Subject:	Re: [postgis-users] shp2pgsql -I option

Brent,

No, checking the actual output of the command, it creates the spatial  
index before loading the data... however, I am also using the -D  
flag, so the whole batch is committed at once, which hopefully means  
the index is batch created too, so no harm done...

shp2pgsql -D -I -i dra_tsa_clip.shp dra_tsa_clip | head -n 100 | less

BEGIN;
CREATE TABLE "dra_tsa_clip" (gid serial PRIMARY KEY,
"conf_id" int4,
"source" varchar(4),
"srcstate" varchar(19),
"resstate" varchar(10),
"road_id" int4,
"rd_seg_id" int4,
"region_id" int4,
"grid_code" int4);
SELECT AddGeometryColumn 
('','dra_tsa_clip','the_geom','-1','MULTILINESTRING',2);
CREATE INDEX "dra_tsa_clip_the_geom_gist" ON "dra_tsa_clip" using  
gist ("the_geom" gist_geome
try_ops);
COPY  
"dra_tsa_clip" ("conf_id","source","srcstate","resstate","road_id","rd_s 
eg_id","region_i
d","grid_code",the_geom) FROM stdin;

Unfortunately, when run without the -D flag, it still does the same  
thing... this could result in some quite slow data loads, for people  
who think that they are actually saving themselves time by using the - 
I flag :)

shp2pgsql -I -i dra_tsa_clip.shp dra_tsa_clip | head -n 100 | less

BEGIN;
CREATE TABLE "dra_tsa_clip" (gid serial PRIMARY KEY,
"conf_id" int4,
"source" varchar(4),
"srcstate" varchar(19),
"resstate" varchar(10),
"road_id" int4,
"rd_seg_id" int4,
"region_id" int4,
"grid_code" int4);
SELECT AddGeometryColumn 
('','dra_tsa_clip','the_geom','-1','MULTILINESTRING',2);
CREATE INDEX "dra_tsa_clip_the_geom_gist" ON "dra_tsa_clip" using  
gist ("the_geom" gist_geome
try_ops);


On 5-Apr-06, at 10:33 PM, Brent Wood wrote:

>
>
> --- Paul Ramsey <pramsey at refractions.net> wrote:
>
>> I just ran using the -I option for the first time, and to my
>> surprise, the utility seemed to create the index *before* loading the
>> data...
>
> Hi Paul,
>
> Thanks for the pointer.
>
> It only creates the primary key, which, being a unique index,  
> should be in
> place before any data is loaded, should it not?
>
> I can see why this is not really necessary if you can otherwise  
> guarantee the
> uniqueness, but I think it is good practice to ensure any indices  
> created to
> ensure data integrity (primary & foreign keys) are in place before  
> any data is
> inserted.
>
> Indices which are only created for performance reasons may well be  
> generated
> after the data is loaded, but I'm less comfortable about keys.
>
> Which means I (for one) am not unhappy about the perfromance hit  
> from creating
> key indices before inserting data :-)
>
> Cheers,
>
>   Brent
>
>>
>> Bluejay:~/Desktop pramsey$ shp2pgsql -D -i -I dra_tsa_clip.shp
>> dra_tsa_clip | psql
>> Shapefile type: Arc
>> Postgis type: MULTILINESTRING[2]
>> BEGIN
>> NOTICE:  CREATE TABLE will create implicit sequence
>> "dra_tsa_clip_gid_seq" for serial column "dra_tsa_clip.gid"
>> NOTICE:  CREATE TABLE / PRIMARY KEY will create implicit index
>> "dra_tsa_clip_pkey" for table "dra_tsa_clip"
>> CREATE TABLE
>>                           addgeometrycolumn
>> --------------------------------------------------------------------
>> public.dra_tsa_clip.the_geom SRID:-1 TYPE:MULTILINESTRING DIMS:2
>> (1 row)
>>
>> <... no pause here ...>
>> CREATE INDEX
>> <... long pause here while data loads ...>
>> COMMIT
>>
>> _______________________________________________
>> postgis-users mailing list
>> postgis-users at postgis.refractions.net
>> http://postgis.refractions.net/mailman/listinfo/postgis-users
>>
>
> _______________________________________________
> postgis-users mailing list
> postgis-users at postgis.refractions.net
> http://postgis.refractions.net/mailman/listinfo/postgis-users

_______________________________________________
postgis-users mailing list
postgis-users at postgis.refractions.net
http://postgis.refractions.net/mailman/listinfo/postgis-users

!DSPAM:4434ab09223151422049324!







More information about the postgis-users mailing list