[gdal-dev] Slow convertion from OSM to PG with -skipfailures

Stephen Woodbridge woodbri at swoodbridge.com
Mon May 27 13:05:24 PDT 2013


On 5/27/2013 3:06 PM, Even Rouault wrote:
> Le lundi 27 mai 2013 20:52:57, Stephen Woodbridge a écrit :
>> Even,
>>
>> Would it make sense to create and load a stored procedure that you use
>> to load the feature with, then you can trap exceptions and ignore them.
>> I would think that this would be much faster. Well the stored procedure
>> is run in a transaction bout you could do something like:
>>
>> begin;
>> select loadfeatureskipfailures(<args>) from features where gid between
>> gid_from and gid_to;
>> commit;
>>
>> And run this in a loop incrementing gid_from and gid_to over your range
>> of features.
>>
>> I would think that something along these lines would be faster than
>> doing every insert in a separate transaction.
>
> I'm afraid I'm not familiar enough with PG stored procedures to really assess
> the feasability of your suggestion. What is the features table in your example
> ? How is it populated ? And I'm a bit skeptical that this could catch and
> recover errors in a better way than doing it from the outside.
>

Hi Even,

My reference to features was a generic reference to the incoming stream 
of feature to be loaded.

In a plpgsql function you can do something like this:

...
BEGIN
   INSERT ....
EXCEPTION
   WHEN SQLSTATE NOT '00000' THEN
     RAISE NOTICE 'Exception % caught, ignoring', SQLSTATE;
     -- NOOP
END;
...

http://www.postgresql.org/docs/9.0/interactive/plpgsql-control-structures.html#PLPGSQL-ERROR-TRAPPING

Anyway, my idea is that the function can recover from the error, so if 
you can feed your data in chucks through the function that can ignore 
errors that this might be faster than

BEGIN;
INSERT ...
COMMIT;

for every feature, but it might not be because a plpgsql function 
effectively runs inside of an implicit transaction. I think it would be 
faster if you could do something like:

BEGIN:
SELECT loadfeature(<args>);
SELECT loadfeature(<args>);
SELECT loadfeature(<args>);
...
SELECT loadfeature(<args>);
SELECT loadfeature(<args>);
SELECT loadfeature(<args>);
COMMIT;

Since none of the loadfeature() calls will fail if you trap the 
exceptions that you are interested in ignoring. This would allow you to 
insert a block of feature in a transaction and ignore any failures.

Without some testing I can't be sure of the performance implication.

-Steve W


More information about the gdal-dev mailing list