[Gdal-dev] Re: Improving postgresql performance by using COPY

Charlie Savage cfis at interserv.com
Fri Oct 14 11:44:44 EDT 2005


Thanks for the quick feedback.

I definitely agree with both Frank and Charlton that INSERT should be 
the default mode since it is more robust.  COPY support is useful in 
cases where you are loading a lot of data - when using INSERT would take 
days to complete.  Thus to use COPY I imagine you would have to 
specifically ask for it by passing a command line parameter via ogr2ogr 
or some API.

Frank - in my case I need the COPY support to work for creating new 
tables and adding data to existing tables.  Thus I need some way of 
hooking into the start of transfer and end.  I could hijack the 
StartTransaction / EndTransaction methods, but I thought that was a bad 
idea because a) its not what they are meant for b) it would not be a 
backwards compatible change since you'd have to pass options to 
StartTransaction / EndTransaction.  That why I came up with 
StartTranslation / EndTranslation.

To minimize the impact in OGR, you could do:

transfer
{
	StartTransaction
	StartTranslation

	<process features>

	EndTranslation
	EndTransaction
}

That's actually what I did in the patch.  But I can't say I love it.

I would be very interested if you have other ideas on how to approach 
this.  I also wonder if this type of issue occurs in other data sources 
(maybe MySql or Oracle as you mentioned?).

Charlie


Charlton Purvis wrote:
> Hi, folks:
> 
>> How important do you think it is to be able to use COPY mode to
>> add features to an existing table, as opposed to just using it to create
> 
> Sorry I'm coming into this late -- I didn't see the original msg. go by
> before I hit del.  Bad me.
> 
> There is no doubt that COPY is lightening fast when compared to INSERT
> statements.  But one disadvantage you have, especially when appending
> incoming values to a table via ogr2ogr, is that if one row of a COPY fails,
> the whole thing fails.  At least w/ INSERT statements, those that make it,
> can, and those that don't simply error out.  And in previous use cases for
> my work, failing was entirely OK (PK constraints), but I wanted the rest of
> the operation to continue like normal.  So I wonder what it would look like
> to an inexperienced user to see that a COPY failed if it were the fault of
> only 1 row -- would they think the problem is in ogr since nothing went
> through?  The DB?  
> 
> And I'm not sure COPYs are very robust across different DB platforms.
> 
> I absolutely vote for INSERTs to be the default.
> 
> Charlton





More information about the Gdal-dev mailing list