[postgis-users] On UpdateGeometrySRID() performance

Martin Davis mtnclimb at gmail.com
Thu Nov 1 10:13:41 PDT 2018


A good entry for the FAQ?

On Thu, Nov 1, 2018 at 10:08 AM James Klassen <klassen.js at gmail.com> wrote:

> When I need to load and transform a large dataset before it lands in its
> final tables/rows, I generally use unlogged tables for the intermediate
> steps.  This saves a lot of disk writing (especially so if running
> PostgreSQL on a storage layer doing its own copy-on-write).  The downside
> is the unlogged tables won't be saved if for whatever reason PostgreSQL
> unexpectedly stops, but that generally doesn't matter in this case because
> you can just restart the load.
>
> On Thu, Nov 1, 2018, 11:54 Paul Ramsey <pramsey at cleverelephant.ca> wrote:
>
>> Also, if you created an index during your import process, that'll also
>> make your update slower.
>> I usually skip bulk updates in favour of writing a fresh table with the
>> stuff I want..
>>
>> CREATE TABLE mynewtable AS SELECT ST_SetSRID(geom, 4326) AS geom, ...
>> FROM myoldtable
>>
>> That generally writes as fast as is possible though still only single
>> threaded.
>>
>> P
>>
>>
>> On Thu, Nov 1, 2018 at 9:51 AM Darafei "Komяpa" Praliaskouski <
>> me at komzpa.net> wrote:
>>
>>> Hi,
>>>
>>> Please feed in your WKT polygons in EWKT format:
>>>
>>> SRID=4326;POLYGON(...)
>>>
>>> Since every update in Postgres is essentially Delete+Insert, time of
>>> rewriting each and every row being equal to initial Insert time is expected
>>> thing.
>>>
>>> You can also update SRID in two columns in one go:
>>>
>>> update tablename set geom1 = ST_SetSRID(geom1, 4326), geom2 =
>>> ST_SetSRID(geom2, 4326);
>>>
>>> Out of curiosity, where did you learn about UpdateGeometrySRID before
>>> learning about ST_SetSRID?
>>>
>>> ср, 31 окт. 2018 г. в 21:57, jerry73204 <jerry73204 at gmail.com>:
>>>
>>>> Hi all,
>>>>
>>>> I'm stuck in the low performance of UpdateGeometrySRID().
>>>>
>>>> I get started with a 50GB polygon dataset in CSV in EPSG:4326
>>>> coordinates. Since I find no way to `\copy` the csv while preserving
>>>> the
>>>> SRID, the data is imported with null SRID and then `SELECT
>>>> UpdateGeometrySRID('table', 'column', 4326)`.
>>>>
>>>> The `UpdateGeometrySRID()` takes as long time as that of `\copy`, which
>>>> turns out to be approx two hours. The dataset has two geometry columns
>>>> and thus I have to take triple time to finish this data.
>>>>
>>>> I profiled the postgresql daemon. The avg disk writing speed is 30MB/s,
>>>> while occasionally peaks to 100MB/s. The SSD, F2FS formatted disk is
>>>> capable of up to 150MB/s. The daemon does not utilize the 4-core
>>>> i5-7600k CPU. It seems to be a single process task with avg CPU load
>>>> 20%, while other workers are idle. I wonder if there's a room for
>>>> improving the performance. Also, I'm looking for if it's possible to
>>>> preserve SRID with `\copy`.
>>>>
>>>> Jerry Lin
>>>>
>>>> _______________________________________________
>>>> postgis-users mailing list
>>>> postgis-users at lists.osgeo.org
>>>> https://lists.osgeo.org/mailman/listinfo/postgis-users
>>>
>>> --
>>> Darafei Praliaskouski
>>> Support me: http://patreon.com/komzpa
>>> _______________________________________________
>>> postgis-users mailing list
>>> postgis-users at lists.osgeo.org
>>> https://lists.osgeo.org/mailman/listinfo/postgis-users
>>
>> _______________________________________________
>> postgis-users mailing list
>> postgis-users at lists.osgeo.org
>> https://lists.osgeo.org/mailman/listinfo/postgis-users
>
> _______________________________________________
> postgis-users mailing list
> postgis-users at lists.osgeo.org
> https://lists.osgeo.org/mailman/listinfo/postgis-users
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.osgeo.org/pipermail/postgis-users/attachments/20181101/b1355093/attachment.html>


More information about the postgis-users mailing list