[QGIS-Developer] Postgis loading using COPY instead of INSERTS?

Sat Jun 10 05:11:21 PDT 2017

On 19 May 2017 at 17:04, Régis Haubourg <regis.haubourg at gmail.com> wrote:
> Forwarding Nyall tips, might be usefull for others
>
>> I tested drag and drop in browser and dbmanager (in 2.18.7 and master) and
>> couldn't find anything about that in the issues or the code base.
>
> Both of these codepaths use QgsVectorLayerExporter, which batches
> calls to QgsVectorDataProvider::addFeatures in groups of 200 features.
> So there's definitely the potential for a good speed bump here by
> moving to the COPY command.
>
>> Did I miss something? Is there someone working in that area?
>
> I don't think so. The tricky bit would be handling the "RETURNING"
> part of the INSERT command, which is used to update the add features
> with their new feature IDs/primary keys. I'm not sure that's possible
> using a COPY command.

I've been thinking here...

I think it may be worthwhile to add a "flags" argument to the vector
data provider addFeature(s) methods (i.e. add it to the base class
methods in QgsFeatureSink). Initially we could have a single available
flag - something like "SkipFidUpdates". If set, providers could then
choose to optimise the way they add features to the backend store
given that the updated feature ids aren't required. We could then
safely use COPY in the postgres provider when this flag is set.

I'd estimate 90+% of the time we are bulk adding features we don't
care about the updated feature ids, and it's only when adding small
amounts of features that this updated id is used. Definitely for
things like QgsVectorLayerExporter it's not required, so we'd get the
COPY speed boost used.

Thoughts?
Nyall