[postgis-devel] Loader cleanup

christian.graefe at web.de christian.graefe at web.de
Wed Apr 6 03:57:14 PDT 2005


Hi strk

Thanks for your valuable work. It helps so much!

My idea, or rather feature request: would it be possible
to add a parameter, maybe "-i", which automatically
builds a GiST index on the "the_geom" column?
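
To make the request concrete, here is a minimal sketch of the statement
such a switch could emit after the data has been loaded. Everything in it
is an assumption for illustration: the helper name format_gist_index_sql,
the index naming scheme and the buffer-based interface are hypothetical
and not part of shp2pgsql. The flag letter is a placeholder too, since
the getopt string in the quoted patch already lists 'i'. Depending on the
PostGIS version, an explicit operator class may also be needed after the
column name.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper (not part of shp2pgsql): formats the CREATE INDEX
 * statement a loader could print when an index switch is given.
 *
 * Returns -1 if the table or column name is empty, otherwise the
 * snprintf() result, i.e. the length the statement needs. A value
 * >= buflen means the output was truncated.
 *
 * Identifiers are quoted the same way as in the loader's existing
 * printf() calls; embedded double quotes are NOT escaped here.
 */
int format_gist_index_sql(char *buf, size_t buflen,
                          const char *schema, const char *table,
                          const char *geom_col)
{
	assert(buf != NULL && table != NULL && geom_col != NULL);

	if (strlen(table) == 0 || strlen(geom_col) == 0)
		return -1;

	if (schema != NULL)
		return snprintf(buf, buflen,
			"CREATE INDEX \"%s_%s_gist\" ON \"%s\".\"%s\" USING gist (\"%s\");\n",
			table, geom_col, schema, table, geom_col);

	return snprintf(buf, buflen,
		"CREATE INDEX \"%s_%s_gist\" ON \"%s\" USING gist (\"%s\");\n",
		table, geom_col, table, geom_col);
}
```

Called with a NULL schema, table "roads" and column "the_geom", this
produces: CREATE INDEX "roads_the_geom_gist" ON "roads" USING gist ("the_geom");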

Best regards
Christian


PostGIS Development Discussion <postgis-devel at postgis.refractions.net> wrote on 06.04.05 12:44:28:
> 
> I've finished the cleanup step.
> I didn't change anything in the algorithm, just:
> 	- made main() much more concise
> 	- fixed a bug in -w handling (for hwgeom)
> 
> Now I'd go on with:
> 
> 	- GID omission.
> 
> 	  The current loader explicitly sets the gid when NOT
> 	  in append mode. I'd make it *always* omit the gid, as
> 	  append mode does, and rely on the 'serial' type. Do
> 	  you see any problem with that?
> 
> 	- TRANSACTION
> 
> 	  The current loader splits the insert lines into multiple
> 	  transaction blocks. It has been noted that this
> 	  should speed things up, but I'm afraid it could
> 	  add half-shapes (skipping the bogus transaction
> 	  block and continuing with the rest). Would it be
> 	  ok for you to wrap all of the SQL in a single
> 	  transaction, relying on dump mode (-D) for speed?
> 
> 	- NULL shapes
> 
> 	  The current loader checks the shapefile and exits with
> 	  an error if all shapes are null (or contain no
> 	  vertices). This requires two shapefile scans
> 	  (check scan, load scan). Since the load scan
> 	  copes cleanly with null shapes, what about dropping
> 	  the first scan, allowing an all-nulls shapefile
> 	  to produce an all-nulls geometry features table?
> 	  (attributes would still be loaded!)
> 
> 	- PREPARE mode
> 
> 	  I think we all know about this ;)
> 
> 	- FIELD TYPE handling
> 
> 	  This is Marks' thread. Didn't check it out yet.
> 
> CVS contains work done so far.
> 
> --strk;
> 
> On Wed, Apr 06, 2005 at 12:25:56PM +0200, strk at refractions.net wrote:
> > Markus, I'm doing the big cleanup myself.
> > Please stop producing patches or I'll feel too guilty :P
> > I'll put the -p switch in.
> > 
> > --strk;
> > 
> > On Tue, Apr 05, 2005 at 10:44:22PM +0200, Markus Schaber wrote:
> > > Hi, @all,
> > > 
> > > > Here's another version of the patch. Apart from the documentation
> > > > updates, it tries to be as minimally invasive as possible. I did not
> > > > even re-indent the if()-encapsulated main loop.
> > > 
> > > Markus
> > 
> > > Index: doc/postgis.xml
> > > ===================================================================
> > > RCS file: /home/cvs/postgis/postgis/doc/postgis.xml,v
> > > retrieving revision 1.135
> > > diff -u -r1.135 postgis.xml
> > > --- doc/postgis.xml	5 Apr 2005 08:00:07 -0000	1.135
> > > +++ doc/postgis.xml	5 Apr 2005 20:39:38 -0000
> > > @@ -1283,13 +1283,23 @@
> > >            </varlistentry>
> > >  
> > >            <varlistentry>
> > > +            <term>-p</term>
> > > +
> > > +            <listitem>
> > > +              <para>Only produces the table creation SQL code, without adding 
> > > +              any actual data. This can be used if you need to completely
> > > +              separate the table creation and data loading steps.</para>
> > > +            </listitem>
> > > +          </varlistentry>
> > > +
> > > +          <varlistentry>
> > >              <term>-D</term>
> > >  
> > >              <listitem>
> > > -              <para>Creates a new table and populates it from the Shape file.
> > > -              This uses the PostgreSQL "dump" format for the output data and
> > > -              is much faster to load than the default "insert" SQL format. Use
> > > -              this for very large data sets.</para>
> > > +              <para>Use the PostgreSQL "dump" format for the output data. This 
> > > +              can be combined with -a, -c and -d. It is much faster to load
> > > +              than the default "insert" SQL format. Use this for very large data
> > > +              sets.</para>
> > >              </listitem>
> > >            </varlistentry>
> > >  
> > > @@ -1334,6 +1344,8 @@
> > >  
> > >          </variablelist>
> > >  
> > > +        <para>Note that -a, -c, -d and -p are mutually exclusive.</para>
> > > +
> > >          <para>An example session using the loader to create an input file and
> > >          uploading it might look like this:</para>
> > >  
> > > Index: doc/man/shp2pgsql.1
> > > ===================================================================
> > > RCS file: /home/cvs/postgis/postgis/doc/man/shp2pgsql.1,v
> > > retrieving revision 1.3
> > > diff -u -r1.3 shp2pgsql.1
> > > --- doc/man/shp2pgsql.1	5 Apr 2005 13:43:50 -0000	1.3
> > > +++ doc/man/shp2pgsql.1	5 Apr 2005 20:39:38 -0000
> > > @@ -26,8 +26,14 @@
> > >  Creates a new table and populates it from the Shape file. This is the default mode.
> > >  
> > >  .TP 
> > > +\fB\-p\fR
> > > +Only produces the table creation SQL code, without adding any actual data. This can
> > > +be used if you need to completely separate the table creation and data loading steps.
> > > +
> > > +.TP 
> > >  \fB\-D\fR
> > > -Use the PostgreSQL "dump" format for the output data. This can be combined with -d, -a and -c and is much faster to load than the default "insert" SQL format. Use this for very large data sets.
> > > +Use the PostgreSQL "dump" format for the output data. This can be combined with -a, -c and -d.
> > > +It is much faster to load than the default "insert" SQL format. Use this for very large data sets.
> > >  
> > >  .TP 
> > >  \fB\-s\fR <\fISRID\fR>
> > > @@ -47,6 +53,9 @@
> > >  Note that this will introduce coordinate drifts and will drop
> > >  M values from shapefiles.
> > >  
> > > +.LP
> > > +Note that -a, -c, -d and -p are mutually exclusive.
> > > +
> > >  .SH "EXAMPLES"
> > >  .LP 
> > >  An example session using the loader to create an input file and uploading it might look like this:
> > > Index: loader/README.shp2pgsql
> > > ===================================================================
> > > RCS file: /home/cvs/postgis/postgis/loader/README.shp2pgsql,v
> > > retrieving revision 1.3
> > > diff -u -r1.3 README.shp2pgsql
> > > --- loader/README.shp2pgsql	4 May 2002 22:44:04 -0000	1.3
> > > +++ loader/README.shp2pgsql	5 Apr 2005 20:39:38 -0000
> > > @@ -38,19 +38,23 @@
> > >  
> > >  The options are as follows:
> > >  
> > > -(-a || -c || -d) these options are mutually exclusive.
> > > +(-a || -c || -d || -p) these options are mutually exclusive.
> > >  
> > >    -a    Append mode. Do not delete the target table or try to create
> > >          a new table, simple insert the data into the existing table.
> > >          A table will have to exist for this to work, it is usually
> > > -        used after a create mode as been run once.(mutually exclusive
> > > -	with -c and -d)
> > > +        used after a create mode as been run once or after -p. (mutually
> > > +        exclusive with -c, -d and -p)
> > >    -c    Create mode. This is the default mode is no other is specified.
> > >  	Create a new table and upload the data into that table.
> > > -	(mutually exclusive with -a and -d)
> > > +	(mutually exclusive with -a, -d and -p)
> > >    -d    Delete mode. Delete the database table named <tablename>, then
> > >  	create a new one with that name before uploading the data into
> > > -	the new empty database table.(mutually exclusive with -a and -c)
> > > +	the new empty database table. (mutually exclusive with -a, -c 
> > > +        and -p)
> > > +  -p    Prepare mode. Read the table schema from the shape file and 
> > > +        create the new table, but do not insert any data. (mutually
> > > +        exclusive with -a, -c and -d)
> > >  
> > >    -D Dump. When inserting the data into the table use 'dump' format.
> > >  	Dump format is used by PostgreSQL for large data dumps and 
> > > Index: loader/shp2pgsql.c
> > > ===================================================================
> > > RCS file: /home/cvs/postgis/postgis/loader/shp2pgsql.c,v
> > > retrieving revision 1.84
> > > diff -u -r1.84 shp2pgsql.c
> > > --- loader/shp2pgsql.c	4 Apr 2005 20:51:26 -0000	1.84
> > > +++ loader/shp2pgsql.c	5 Apr 2005 20:39:40 -0000
> > > @@ -532,9 +532,9 @@
> > >  	printf("BEGIN;\n");
> > >  
> > >  	//if opt is 'a' do nothing, go straight to making inserts
> > > -	if(opt == 'c' || opt == 'd') create_table();
> > > +	if(opt == 'c' || opt == 'd' || opt == 'p') create_table();
> > >  
> > > -	if (dump_format){
> > > +	if (dump_format && opt != 'p'){
> > >  		if ( schema )
> > >  		{
> > >  			printf("COPY \"%s\".\"%s\" %s FROM stdin;\n",
> > > @@ -555,8 +555,9 @@
> > >   *   MAIN SHAPE OBJECTS SCAN
> > >   * 
> > >   **************************************************************/
> > > -	for (j=0;j<num_entities; j++)
> > > -	{
> > > +	if (opt != 'p') { /*only if we do not have prepare mode*/
> > > +	    for (j=0;j<num_entities; j++)
> > > +	    {
> > >  		//wrap a transaction block around each 250 inserts...
> > >  		if ( ! dump_format )
> > >  		{
> > > @@ -646,12 +647,12 @@
> > >  		
> > >  		SHPDestroyObject(obj);	
> > >  
> > > -	} // END of MAIN SHAPE OBJECT LOOP
> > > +	    } // END of MAIN SHAPE OBJECT LOOP
> > >  
> > >  
> > > -	if ((dump_format) ) {
> > > +	    if ((dump_format) ) {
> > >  		printf("\\.\n");
> > > -
> > > +	    }
> > >  	} 
> > >  
> > >  	free(col_names);
> > > @@ -660,7 +661,7 @@
> > >  		if ( schema )
> > >  		{
> > >  			printf("\nALTER TABLE ONLY \"%s\".\"%s\" ADD CONSTRAINT \"%s_pkey\" PRIMARY KEY (gid);\n",schema,table,table);
> > > -			if(j > 1)
> > > +			if(j > 1 && opt != 'p')
> > >  			{
> > >  				printf("SELECT setval ('\"%s\".\"%s_gid_seq\"', %i, true);\n", schema, table, j-1);
> > >  			}
> > > @@ -668,7 +669,7 @@
> > >  		else
> > >  		{
> > >  			printf("\nALTER TABLE ONLY \"%s\" ADD CONSTRAINT \"%s_pkey\" PRIMARY KEY (gid);\n",table,table);
> > > -			if(j > 1){
> > > +			if(j > 1 && opt != 'p'){
> > >  				printf("SELECT setval ('\"%s_gid_seq\"', %i, true);\n", table, j-1);
> > >  			}
> > >  		}
> > > @@ -783,13 +784,14 @@
> > >  	fprintf(stderr, "OPTIONS:\n");
> > >  	fprintf(stderr, "  -s <srid>  Set the SRID field. If not specified it defaults to -1.\n");
> > >  	fprintf(stderr, "\n");
> > > -	fprintf(stderr, "  (-d|a|c) These are mutually exclusive options:\n");
> > > +	fprintf(stderr, "  (-d|a|c|p) These are mutually exclusive options:\n");
> > >  	fprintf(stderr, "      -d  Drops the table , then recreates it and populates\n");
> > >  	fprintf(stderr, "          it with current shape file data.\n");
> > >  	fprintf(stderr, "      -a  Appends shape file into current table, must be\n");
> > >  	fprintf(stderr, "          exactly the same table schema.\n");
> > >  	fprintf(stderr, "      -c  Creates a new table and populates it, this is the\n");
> > >  	fprintf(stderr, "          default if you do not specify any options.\n");
> > > +	fprintf(stderr, "      -p  Prepare mode, only creates the table\n");
> > >  	fprintf(stderr, "\n");
> > >  	fprintf(stderr, "  -g <geometry_column> Specify the name of the geometry column\n");
> > >  	fprintf(stderr, "     (mostly useful in append mode).\n");
> > > @@ -1372,7 +1374,7 @@
> > >  	int curindex=0;
> > >  	char  *ptr;
> > >  
> > > -	while ((c = getopt(ARGC, ARGV, "kcdaDs:g:iW:w")) != EOF){
> > > +	while ((c = getopt(ARGC, ARGV, "kcdapDs:g:iW:w")) != EOF){
> > >                 switch (c) {
> > >                 case 'c':
> > >                      if (opt == ' ')
> > > @@ -1392,6 +1394,12 @@
> > >                      else
> > >                           return 0;
> > >                      break;
> > > +	       case 'p':
> > > +                    if (opt == ' ')
> > > +                         opt ='p';
> > > +                    else
> > > +                         return 0;
> > > +                    break;
> > >  	       case 'D':
> > >  		    dump_format =1;
> > >                      break;
> > 
> > > _______________________________________________
> > > postgis-devel mailing list
> > > postgis-devel at postgis.refractions.net
> > > http://postgis.refractions.net/mailman/listinfo/postgis-devel
> > 
