[gdal-dev] ogr2ogr Postgres upload performance

Alexandre Gacon alexandre.gacon at gmail.com
Mon May 23 23:11:18 PDT 2022


Hello,

The PG_USE_COPY parameter doesn't seem to work when appending several files
in the same table (replace the old data or something like that).

Postponing the creation of the spatial index does not seem to improve
things either.

I didn't try increasing the number of rows per transaction but I don't
think that the main problem is there.

I wonder if using partitioned tables won't be a good choice ?

Regards
Alexandre

Le lun. 23 mai 2022 à 21:57, Alexandre Gacon <alexandre.gacon at gmail.com> a
écrit :

> Hi,
>
> Thank you for the input. I will give a try to different combination and
> share my results.
>
> Alexandre
>
> Le lun. 23 mai 2022 à 19:43, Bo Victor Thomsen <
> bo.victor.thomsen at gmail.com> a écrit :
>
>> Hi Alexandre -
>>
>> I made a DOS script ages ago to try the different use cases. It's
>> inserted as text below. It doesn't test every setup combination, only the
>> most pertinent.
>>
>> The base gpkg layer is all the buildings i Denmark, ca. 5.8 million
>> objects.
>>
>> For my layer I had the following results:
>>
>>    1. PG_USE_COPY yes: 182 secs.
>>    2. PG_USE_COPY yes,  -gt 10000000 (5.8 mill. objects in single
>>    transaction) : 181.5 secs.
>>    3. PG_USE_COPY yes,  -gt 10000000, UNLOGGED ON, SPATIAL_INDEX NO :
>>    152 secs.
>>    4. PG_USE_COPY no : 15 min.
>>
>> So the "PG_USE_COPY yes" is crucial and using unlogged tables and no
>> spatial index reduces the time with an extra 30 sec. But you have to build
>> a spatial index afterwards.
>>
>> Using insert commands is an obvious "no-no"
>>
>> As far as I remember, I used the "PG_USE_COPY no" option without any
>> other refinements. But I didn't append data to an existing table. YMMV
>>
>>
>> ===================
>>
>> @echo on
>> set pgclientencoding=UTF-8
>> set cc= ogr2ogr -progress -lco OVERWRITE=YES -lco SCHEMA=fot -a_srs
>> EPSG:25832 -nlt PROMOTE_TO_MULTI -f "PostgreSQL" PG:"host=localhost
>> port=5432 user=xxx password=yyy dbname=zzz" d:/tmp/geodanmark.gpkg bygning
>>
>> echo %time%
>> %cc% --config PG_USE_COPY yes -nln bygning1
>> echo %time%
>> %cc% --config PG_USE_COPY yes -gt 10000000 -nln bygning2
>> echo %time%
>> %cc% --config PG_USE_COPY yes -gt 10000000 -lco UNLOGGED=ON -lco
>> SPATIAL_INDEX=NO -nln bygning3
>> echo %time%
>> %cc% --config PG_USE_COPY no  -nln bygning4
>> echo %time%
>> pause
>>
>> ===================
>>
>>
>>
>>
>> Med venlig hilsen / Kind regards
>>
>> Bo Victor Thomsen
>>
>> Den 23-05-2022 kl. 15:03 skrev Alexandre Gacon:
>>
>> Hello,
>>
>> I am using ogr2ogr to upload data from several geopackages to a postgis
>> database. Some tables contain thousands of rows (buildings for example).
>>
>> The import of the first file is quite fast (tables are created for the
>> first file so PG_USE_COPY is used) but the following file is much slower
>> (using INSERT instead of COPY).
>>
>> How could I data insertion for the other files? Force PG_USE_COPY ?
>> Increase the value of GT ? Postpone spatial index creation ?
>>
>> Should I concatenate all the geopackages first and then insert the data
>> in Postgis?
>>
>> Thank you for your help
>>
>> --
>> Alexandre Gacon
>>
>> _______________________________________________
>> gdal-dev mailing listgdal-dev at lists.osgeo.orghttps://lists.osgeo.org/mailman/listinfo/gdal-dev
>>
>> _______________________________________________
>> gdal-dev mailing list
>> gdal-dev at lists.osgeo.org
>> https://lists.osgeo.org/mailman/listinfo/gdal-dev
>>
>

-- 
Alexandre Gacon
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.osgeo.org/pipermail/gdal-dev/attachments/20220524/f6ad201b/attachment.htm>


More information about the gdal-dev mailing list