[postgis-users] On UpdateGeometrySRID() performance
jerry73204 at gmail.com
Thu Nov 1 00:57:41 PDT 2018
I'm stuck in the low performance of UpdateGeometrySRID().
I get started with a 50GB polygon dataset in CSV in EPSG:4326
coordinates. Since I find no way to `\copy` the csv while preserving the
SRID, the data is imported with null SRID and then `SELECT
UpdateGeometrySRID('table', 'column', 4326)`.
The `UpdateGeometrySRID()` takes as long time as that of `\copy`, which
turns out to be approx two hours. The dataset has two geometry columns
and thus I have to take triple time to finish this data.
I profiled the postgresql daemon. The avg disk writing speed is 30MB/s,
while occasionally peaks to 100MB/s. The SSD, F2FS formatted disk is
capable of up to 150MB/s. The daemon does not utilize the 4-core
i5-7600k CPU. It seems to be a single process task with avg CPU load
20%, while other workers are idle. I wonder if there's a room for
improving the performance. Also, I'm looking for if it's possible to
preserve SRID with `\copy`.
More information about the postgis-users