[Gdal-dev] Re: Improving postgresql performance by using COPY
Charlie Savage
cfis at interserv.com
Fri Oct 14 11:44:44 EDT 2005
Thanks for the quick feedback.
I definitely agree with both Frank and Charlton that INSERT should be
the default mode since it is more robust. COPY support is useful in
cases where you are loading a lot of data - when using INSERT would take
days to complete. Thus to use COPY I imagine you would have to
specifically ask for it by passing a command line parameter via ogr2ogr
or some API.
Frank - in my case I need the COPY support to work for creating new
tables and adding data to existing tables. Thus I need some way of
hooking into the start of transfer and end. I could hijack the
StartTransaction / EndTransaction methods, but I thought that was a bad
idea because a) its not what they are meant for b) it would not be a
backwards compatible change since you'd have to pass options to
StartTransaction / EndTransaction. That why I came up with
StartTranslation / EndTranslation.
To minimize the impact in OGR, you could do:
transfer
{
StartTransaction
StartTranslation
<process features>
EndTranslation
EndTransaction
}
That's actually what I did in the patch. But I can't say I love it.
I would be very interested if you have other ideas on how to approach
this. I also wonder if this type of issue occurs in other data sources
(maybe MySql or Oracle as you mentioned?).
Charlie
Charlton Purvis wrote:
> Hi, folks:
>
>> How important do you think it is to be able to use COPY mode to
>> add features to an existing table, as opposed to just using it to create
>
> Sorry I'm coming into this late -- I didn't see the original msg. go by
> before I hit del. Bad me.
>
> There is no doubt that COPY is lightening fast when compared to INSERT
> statements. But one disadvantage you have, especially when appending
> incoming values to a table via ogr2ogr, is that if one row of a COPY fails,
> the whole thing fails. At least w/ INSERT statements, those that make it,
> can, and those that don't simply error out. And in previous use cases for
> my work, failing was entirely OK (PK constraints), but I wanted the rest of
> the operation to continue like normal. So I wonder what it would look like
> to an inexperienced user to see that a COPY failed if it were the fault of
> only 1 row -- would they think the problem is in ogr since nothing went
> through? The DB?
>
> And I'm not sure COPYs are very robust across different DB platforms.
>
> I absolutely vote for INSERTs to be the default.
>
> Charlton
More information about the Gdal-dev
mailing list