[gdal-dev] Slow convertion from OSM to PG with -skipfailures

Rahkonen Jukka jukka.rahkonen at mmmtike.fi
Mon May 27 13:35:45 PDT 2013


Even Rouault wrote:

>
>> Don't you feel that the price is rather high? In this example like 120
>> minutes vs. 2 minutes. Could you imagine a Speedy Sanitizer (TM) option
>> which would use a small super fast in-memory container for collecting
>> something like one thousand valid features before flushing them into the
>> database?

> Actually that would be more complicated than that. You only know if features
> are valid once you have submitted them to the database (otherwise we could
> discard them before). So the idea would rather be to retain in memory the
> features of a transaction. If the transaction succeeds, fine. If it fails, you
> then need to resubmit the features one by one in single feature transactions
> (but if you have statistically at least one failure by multiple feature
> transactions, then you'd better just use single feature transaction). Another
> problem I see is that currently, ogr2ogr doesn't detect failures in COPY mode,
> since the error will be caught by PostgreSQL at COMMIT time, and the error
> propagation isn't done properly currently.
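
The batch-then-fallback idea described above can be sketched as follows. This is only an illustration under stated assumptions, not how ogr2ogr is implemented: it uses Python's built-in sqlite3 as a stand-in for PostgreSQL, and the table "features", its columns, and the function name load_with_fallback are all hypothetical.

```python
import sqlite3

def load_with_fallback(conn, rows, batch_size=1000):
    """Insert rows in multi-row transactions; if a batch fails,
    resubmit its rows one by one in single-row transactions."""
    sql = "INSERT INTO features (id, tags) VALUES (?, ?)"  # hypothetical table
    inserted, skipped = 0, 0
    for start in range(0, len(rows), batch_size):
        batch = rows[start:start + batch_size]  # features retained in memory
        try:
            with conn:  # commits on success, rolls back the whole batch on error
                conn.executemany(sql, batch)
            inserted += len(batch)
        except sqlite3.Error:
            # Validity is only known after submission, so retry row by row.
            for row in batch:
                try:
                    with conn:
                        conn.execute(sql, row)
                    inserted += 1
                except sqlite3.Error:
                    skipped += 1  # the equivalent of -skipfailures
    return inserted, skipped

# Demo: rows 7 and 14 violate the NOT NULL constraint on "tags".
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE features (id INTEGER PRIMARY KEY, tags TEXT NOT NULL)")
rows = [(i, None if i % 7 == 0 else "highway=>residential") for i in range(1, 21)]
print(load_with_fallback(conn, rows, batch_size=5))  # (18, 2)
```

As noted in the quoted text, this only pays off when failures are rare. If nearly every batch contains a bad feature, each batch is submitted twice and plain single-feature transactions are cheaper.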

If they were geometry errors, I guess I would try filtering them out with
-dialect SQLite -sql "select * from layer where IsValid(geometry)=1"

That should work very well with PostGIS because the IsValid function in Spatialite comes from GEOS, the same library PostGIS uses for its validity checks. If the problem comes from attributes that the target datastore does not accept, I fear it will be hard to solve, because I do not believe there can be any general method for validating attributes across all the possible formats.
I came to think of this because some OSM tags were not accepted by PostGIS hstore. That is perhaps a bug in GDAL and could be corrected there. In theory, data errors are best corrected in the source data, but the user may not have software for doing that, or the source format may be hard to handle. I guess that the OSM protobuf format belongs to the latter group.
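
For the geometry case, the complete command line might be assembled along these lines. This is a sketch, not a tested recipe: the file name extract.osm.pbf, the connection string PG:dbname=osm, and the helper build_ogr2ogr_args are placeholders of mine, "multipolygons" is one of the layers the OSM driver exposes, and the SQLite dialect only offers IsValid when GDAL is built with Spatialite support.

```python
import shlex

def build_ogr2ogr_args(src, dst, layer, geom_col="geometry"):
    """Build an ogr2ogr argument list that copies only valid geometries.

    All dataset and layer names passed in are placeholders."""
    sql = f"SELECT * FROM {layer} WHERE IsValid({geom_col}) = 1"
    # ogr2ogr expects the destination datasource before the source.
    return ["ogr2ogr", "-f", "PostgreSQL", dst, src,
            "-dialect", "SQLite", "-sql", sql]

args = build_ogr2ogr_args("extract.osm.pbf", "PG:dbname=osm", "multipolygons")
print(shlex.join(args))  # shell-quoted form; could be run with subprocess.run(args)
```

Because the filter runs before anything reaches the database, the load can keep using multi-feature transactions instead of -skipfailures.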

-Jukka Rahkonen- 


