[postgis-devel] Loader cleanup
christian.graefe at web.de
christian.graefe at web.de
Wed Apr 6 03:57:14 PDT 2005
Hi strk
Thanks for your valueable work. It helps so much!
My idea, better say feature request is: It is theoretical
possible to add a parameter, may be "-i", which
automatically builds an GIST index on the "the_geom"
column?
Best regards
Christian
PostGIS Development Discussion <postgis-devel at postgis.refractions.net> schrieb am 06.04.05 12:44:28:
>
> I've finished the cleanup step.
> I didn't change anything in the algorithm, just:
> - made main() much more concise
> - fixed a bug in -w handling (for hwgeom)
>
> Now I'd go on with:
>
> - GID omission.
>
> Current loader explicitly set the gid when NOT
> in append mode. I'd make it *always* act like that
> and rely on the 'serial' type. Do you see any
> problem with that ?
>
> - TRANSACTION
>
> Current loader breaks insert lines in multiple
> transaction blocks. It has been noted that this
> should speed things up, but I'm afraid it would
> add half-shapes (skipping the bogus transaction
> block and continue with the rest). Would it be
> ok for you wrapping the whole code in a single
> transaction relying on dump mode (-D) for speed
> reasons ?
>
> - NULL shapes
>
> Current loader checks shapefile and exits with
> an error if all shapes are null (or contain no
> vertexes). This requires two shapefile scans
> (check scan, load scan). Since the load scan
> copes cleanly with null shapes what about dropping
> the first scan allowing for a all-nulls shapefile
> to produce an all-nulls geometry features table ?
> (attributes would still be loaded!)
>
> - PREPARE mode
>
> I think we all know about this ;)
>
> - FIELD TYPE handling
>
> This is Marks' thread. Didn't check it out yet.
>
> CVS contains work done so far.
>
> --strk;
>
> On Wed, Apr 06, 2005 at 12:25:56PM +0200, strk at refractions.net wrote:
> > Markus, I'm doing the big cleanup myself.
> > Please stop producing patches or I'll feel too guilty :P
> > I'll put the -p switch in.
> >
> > --strk;
> >
> > On Tue, Apr 05, 2005 at 10:44:22PM +0200, Markus Schaber wrote:
> > > Hi, @all,
> > >
> > > Here's another version of the patch. Apart from the documentation
> > > updates, it tries to be as less invasive as possible. I even did not
> > > re-indent the if()-encapsulated main loop.
> > >
> > > Markus
> >
> > > Index: doc/postgis.xml
> > > ===================================================================
> > > RCS file: /home/cvs/postgis/postgis/doc/postgis.xml,v
> > > retrieving revision 1.135
> > > diff -u -r1.135 postgis.xml
> > > --- doc/postgis.xml 5 Apr 2005 08:00:07 -0000 1.135
> > > +++ doc/postgis.xml 5 Apr 2005 20:39:38 -0000
> > > @@ -1283,13 +1283,23 @@
> > > </varlistentry>
> > >
> > > <varlistentry>
> > > + <term>-p</term>
> > > +
> > > + <listitem>
> > > + <para>Only produces the table creation SQL code, without adding
> > > + any actual data. This can be used if you need to completely
> > > + separate the table creation and data loading steps.</para>
> > > + </listitem>
> > > + </varlistentry>
> > > +
> > > + <varlistentry>
> > > <term>-D</term>
> > >
> > > <listitem>
> > > - <para>Creates a new table and populates it from the Shape file.
> > > - This uses the PostgreSQL "dump" format for the output data and
> > > - is much faster to load than the default "insert" SQL format. Use
> > > - this for very large data sets.</para>
> > > + <para>Use the PostgreSQL "dump" format for the output data. This
> > > + can be combined with -a, -c and -d. It is much faster to load
> > > + than the default "insert" SQL format. Use this for very large data
> > > + sets.</para>
> > > </listitem>
> > > </varlistentry>
> > >
> > > @@ -1334,6 +1344,8 @@
> > >
> > > </variablelist>
> > >
> > > + <para>Note that -a, -c, -d and -p are mutually exclusive.</para>
> > > +
> > > <para>An example session using the loader to create an input file and
> > > uploading it might look like this:</para>
> > >
> > > Index: doc/man/shp2pgsql.1
> > > ===================================================================
> > > RCS file: /home/cvs/postgis/postgis/doc/man/shp2pgsql.1,v
> > > retrieving revision 1.3
> > > diff -u -r1.3 shp2pgsql.1
> > > --- doc/man/shp2pgsql.1 5 Apr 2005 13:43:50 -0000 1.3
> > > +++ doc/man/shp2pgsql.1 5 Apr 2005 20:39:38 -0000
> > > @@ -26,8 +26,14 @@
> > > Creates a new table and populates it from the Shape file. This is the default mode.
> > >
> > > .TP
> > > +\fB\-p\fR
> > > +Only produces the table creation SQL code, without adding any actual data. This can
> > > +be used if you need to completely separate the table creation and data loading steps.
> > > +
> > > +.TP
> > > \fB\-D\fR
> > > -Use the PostgreSQL "dump" format for the output data. This can be combined with -d, -a and -c and is much faster to load than the default "insert" SQL format. Use this for very large data sets.
> > > +Use the PostgreSQL "dump" format for the output data. This can be combined with -a, -c and -d.
> > > +It is much faster to load than the default "insert" SQL format. Use this for very large data sets.
> > >
> > > .TP
> > > \fB\-s\fR <\fISRID\fR>
> > > @@ -47,6 +53,9 @@
> > > Note that this will introduce coordinate drifts and will drop
> > > M values from shapefiles.
> > >
> > > +.LP
> > > +Note that -a, -c, -d and -p are mutually exclusive.
> > > +
> > > .SH "EXAMPLES"
> > > .LP
> > > An example session using the loader to create an input file and uploading it might look like this:
> > > Index: loader/README.shp2pgsql
> > > ===================================================================
> > > RCS file: /home/cvs/postgis/postgis/loader/README.shp2pgsql,v
> > > retrieving revision 1.3
> > > diff -u -r1.3 README.shp2pgsql
> > > --- loader/README.shp2pgsql 4 May 2002 22:44:04 -0000 1.3
> > > +++ loader/README.shp2pgsql 5 Apr 2005 20:39:38 -0000
> > > @@ -38,19 +38,23 @@
> > >
> > > The options are as follows:
> > >
> > > -(-a || -c || -d) these options are mutually exclusive.
> > > +(-a || -c || -d || -p) these options are mutually exclusive.
> > >
> > > -a Append mode. Do not delete the target table or try to create
> > > a new table, simple insert the data into the existing table.
> > > A table will have to exist for this to work, it is usually
> > > - used after a create mode as been run once.(mutually exclusive
> > > - with -c and -d)
> > > + used after a create mode as been run once or after -p. (mutually
> > > + exclusive with -c, -d and -p)
> > > -c Create mode. This is the default mode is no other is specified.
> > > Create a new table and upload the data into that table.
> > > - (mutually exclusive with -a and -d)
> > > + (mutually eclusive with -a, -d and -p)
> > > -d Delete mode. Delete the database table named <tablename>, then
> > > create a new one with that name before uploading the data into
> > > - the new empty database table.(mutually exclusive with -a and -c)
> > > + the new empty database table. (mutually exclusive with -a, -c
> > > + and -p)
> > > + -p Prepare mode. Read the table schema from the shape file and
> > > + create the new table, but do not insert any data. (mutually
> > > + exclusive with -a, -c and -d)
> > >
> > > -D Dump. When inserting the data into the table use 'dump' format.
> > > Dump format is used by PostgreSQL for large data dumps and
> > > Index: loader/shp2pgsql.c
> > > ===================================================================
> > > RCS file: /home/cvs/postgis/postgis/loader/shp2pgsql.c,v
> > > retrieving revision 1.84
> > > diff -u -r1.84 shp2pgsql.c
> > > --- loader/shp2pgsql.c 4 Apr 2005 20:51:26 -0000 1.84
> > > +++ loader/shp2pgsql.c 5 Apr 2005 20:39:40 -0000
> > > @@ -532,9 +532,9 @@
> > > printf("BEGIN;\n");
> > >
> > > //if opt is 'a' do nothing, go straight to making inserts
> > > - if(opt == 'c' || opt == 'd') create_table();
> > > + if(opt == 'c' || opt == 'd' || opt == 'p') create_table();
> > >
> > > - if (dump_format){
> > > + if (dump_format && opt != 'p'){
> > > if ( schema )
> > > {
> > > printf("COPY \"%s\".\"%s\" %s FROM stdin;\n",
> > > @@ -555,8 +555,9 @@
> > > * MAIN SHAPE OBJECTS SCAN
> > > *
> > > **************************************************************/
> > > - for (j=0;j<num_entities; j++)
> > > - {
> > > + if (opt != 'p') { /*only if we do not have prepare mode*/
> > > + for (j=0;j<num_entities; j++)
> > > + {
> > > //wrap a transaction block around each 250 inserts...
> > > if ( ! dump_format )
> > > {
> > > @@ -646,12 +647,12 @@
> > >
> > > SHPDestroyObject(obj);
> > >
> > > - } // END of MAIN SHAPE OBJECT LOOP
> > > + } // END of MAIN SHAPE OBJECT LOOP
> > >
> > >
> > > - if ((dump_format) ) {
> > > + if ((dump_format) ) {
> > > printf("\\.\n");
> > > -
> > > + }
> > > }
> > >
> > > free(col_names);
> > > @@ -660,7 +661,7 @@
> > > if ( schema )
> > > {
> > > printf("\nALTER TABLE ONLY \"%s\".\"%s\" ADD CONSTRAINT \"%s_pkey\" PRIMARY KEY (gid);\n",schema,table,table);
> > > - if(j > 1)
> > > + if(j > 1 && opt != 'p')
> > > {
> > > printf("SELECT setval ('\"%s\".\"%s_gid_seq\"', %i, true);\n", schema, table, j-1);
> > > }
> > > @@ -668,7 +669,7 @@
> > > else
> > > {
> > > printf("\nALTER TABLE ONLY \"%s\" ADD CONSTRAINT \"%s_pkey\" PRIMARY KEY (gid);\n",table,table);
> > > - if(j > 1){
> > > + if(j > 1 && opt != 'p'){
> > > printf("SELECT setval ('\"%s_gid_seq\"', %i, true);\n", table, j-1);
> > > }
> > > }
> > > @@ -783,13 +784,14 @@
> > > fprintf(stderr, "OPTIONS:\n");
> > > fprintf(stderr, " -s <srid> Set the SRID field. If not specified it defaults to -1.\n");
> > > fprintf(stderr, "\n");
> > > - fprintf(stderr, " (-d|a|c) These are mutually exclusive options:\n");
> > > + fprintf(stderr, " (-d|a|c|p) These are mutually exclusive options:\n");
> > > fprintf(stderr, " -d Drops the table , then recreates it and populates\n");
> > > fprintf(stderr, " it with current shape file data.\n");
> > > fprintf(stderr, " -a Appends shape file into current table, must be\n");
> > > fprintf(stderr, " exactly the same table schema.\n");
> > > fprintf(stderr, " -c Creates a new table and populates it, this is the\n");
> > > fprintf(stderr, " default if you do not specify any options.\n");
> > > + fprintf(stderr, " -p Prepare mode, only creates the table\n");
> > > fprintf(stderr, "\n");
> > > fprintf(stderr, " -g <geometry_column> Specify the name of the geometry column\n");
> > > fprintf(stderr, " (mostly useful in append mode).\n");
> > > @@ -1372,7 +1374,7 @@
> > > int curindex=0;
> > > char *ptr;
> > >
> > > - while ((c = getopt(ARGC, ARGV, "kcdaDs:g:iW:w")) != EOF){
> > > + while ((c = getopt(ARGC, ARGV, "kcdapDs:g:iW:w")) != EOF){
> > > switch (c) {
> > > case 'c':
> > > if (opt == ' ')
> > > @@ -1392,6 +1394,12 @@
> > > else
> > > return 0;
> > > break;
> > > + case 'p':
> > > + if (opt == ' ')
> > > + opt ='p';
> > > + else
> > > + return 0;
> > > + break;
> > > case 'D':
> > > dump_format =1;
> > > break;
> >
> > > _______________________________________________
> > > postgis-devel mailing list
> > > postgis-devel at postgis.refractions.net
> > > http://postgis.refractions.net/mailman/listinfo/postgis-devel
> >
> > _______________________________________________
> > postgis-devel mailing list
> > postgis-devel at postgis.refractions.net
> > http://postgis.refractions.net/mailman/listinfo/postgis-devel
> _______________________________________________
> postgis-devel mailing list
> postgis-devel at postgis.refractions.net
> http://postgis.refractions.net/mailman/listinfo/postgis-devel
-------------- next part --------------
A non-text attachment was scrubbed...
Name: smime.p7s
Type: application/x-pkcs7-signature
Size: 1452 bytes
Desc: S/MIME Cryptographic Signature
URL: <http://lists.osgeo.org/pipermail/postgis-devel/attachments/20050406/ee9d2c4a/attachment.bin>
More information about the postgis-devel
mailing list