[postgis-devel] [PostGIS] #110: shp2pgsql new option to not batch commit

PostGIS trac at osgeo.org
Wed Mar 16 10:20:58 PDT 2011


#110: shp2pgsql new option to not batch commit
--------------------------+-------------------------------------------------
  Reporter:  robe         |       Owner:  mcayland      
      Type:  enhancement  |      Status:  closed        
  Priority:  low          |   Milestone:  PostGIS Future
 Component:  postgis      |     Version:  trunk         
Resolution:  fixed        |    Keywords:                
--------------------------+-------------------------------------------------
Changes (by jadams):

  * status:  new => closed
  * resolution:  new => fixed


New description:

 '''What steps will reproduce the problem?'''
 1. If some geometries fail on insert, such as polygons without closed
 rings, the failure kills the current batch and gives a more or less
 meaningless message.
 2. This has happened a lot to me and to many of the users we train.  Much
 of the time the offending data rightfully should not be added at all.
 3. Right now, to work around this, I generate a .sql file and use sed to
 remove the BEGIN/COMMIT statements, which means I can't use the normal |
 to load directly.

 I think the simplest option is a flag that lets users omit the
 BEGIN/COMMIT statements so that every statement runs in its own
 transaction.  That way a whole batch of good records isn't lost because
 of one bad apple.

 I know we have talked about other options, such as allowing these beasts
 into the database or nulling the geometry if invalid, though those
 options are trickier to implement and may not handle all cases.
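
 The sed workaround from step 3 could be sketched as a small stream
 filter. This is only an illustration, not part of shp2pgsql: the function
 name is hypothetical, and it assumes the loader writes its transaction
 control as standalone BEGIN;, COMMIT; or END; lines.

```python
import re

# Matches a line that holds nothing but a transaction-control statement.
# Assumption: these appear as standalone "BEGIN;", "COMMIT;" or "END;" lines.
_TX_LINE = re.compile(r"^\s*(BEGIN|COMMIT|END)\s*;\s*$", re.IGNORECASE)


def strip_transaction_lines(lines):
    """Yield every line except standalone BEGIN/COMMIT/END statements."""
    for line in lines:
        if not _TX_LINE.match(line):
            yield line
```

 Wired up as sys.stdout.writelines(strip_transaction_lines(sys.stdin)),
 such a filter could sit in the pipe between shp2pgsql and psql, so each
 INSERT would be committed on its own. A multi-line string value that
 happens to contain such a line would also be dropped, so this is a rough
 stand-in at best.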

--

Comment:

 At some point (probably the refactoring into -core that happened for 1.5)
 this behavior was changed to always use a single transaction, rather than
 one per 250 records.

 I've added a "-e" command line flag that prevents shp2pgsql from wrapping
 the load in a transaction.  Committed as revision 6909 in trunk.
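
 To illustrate the difference between the two modes, here is a minimal
 sketch that uses SQLite as a stand-in for PostgreSQL. Everything in it is
 an assumption made for illustration: the load() helper, the shapes table,
 and the CHECK constraint that plays the role of a geometry validity
 check. It does not call shp2pgsql or PostGIS.

```python
import sqlite3


def load(rows, single_transaction):
    """Insert rows into a fresh in-memory table; return the values kept."""
    # isolation_level=None selects autocommit mode, so transactions are
    # controlled only by the explicit BEGIN/COMMIT/ROLLBACK statements below.
    conn = sqlite3.connect(":memory:", isolation_level=None)
    # The CHECK constraint is a stand-in for a geometry validity check.
    conn.execute("CREATE TABLE shapes (id INTEGER CHECK (id > 0))")
    if single_transaction:
        conn.execute("BEGIN")
        try:
            for row in rows:
                conn.execute("INSERT INTO shapes VALUES (?)", (row,))
            conn.execute("COMMIT")
        except sqlite3.IntegrityError:
            # Mimic PostgreSQL, where one error aborts the whole transaction.
            conn.execute("ROLLBACK")
    else:
        for row in rows:
            try:
                # Each statement commits on its own in autocommit mode.
                conn.execute("INSERT INTO shapes VALUES (?)", (row,))
            except sqlite3.IntegrityError:
                pass  # only the offending row is skipped
    kept = [r[0] for r in conn.execute("SELECT id FROM shapes ORDER BY id")]
    conn.close()
    return kept
```

 With rows [1, -5, 2], the single-transaction mode keeps nothing, while
 the per-statement mode keeps 1 and 2 and skips only the bad row.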

 The discussion of what to do with invalid geometries has been moved into
 its own ticket:
 http://trac.osgeo.org/postgis/ticket/859

-- 
Ticket URL: <http://trac.osgeo.org/postgis/ticket/110#comment:16>
PostGIS <http://trac.osgeo.org/postgis/>
The PostGIS Trac is used for bug, enhancement & task tracking, a user and developer wiki, and a view into the subversion code repository of PostGIS project.

