[postgis-users] Forcibly parallelizing very expensive st_unaryunion in sequential scan on 20 row MultiLineString Table

Andrew Joseph ap.joseph at live.com
Fri Nov 27 15:24:08 PST 2020


> Workaround: 'pyodbc' or 'psycopg2'

Hey Marco, I probably should have specified that I'm trying to keep all of my logic in a PL/pgSQL function so that all of the work happens in a single transaction. dblink is basically the same concept as psycopg2: I can open 7 connections to do the work and then wait for them to complete, but then I'd have to commit the source table so that the spawned connections can see it, and I'd also have to make sure those connections get cleaned up.
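For reference, here is a minimal sketch of that connection-spawning pattern (the one being avoided here), using dblink with asynchronous queries. The table name src_lines, the columns id and geom, and the modulo partitioning are illustrative assumptions rather than the actual schema, and the connection string may need host/user/password details depending on the setup:

    CREATE EXTENSION IF NOT EXISTS dblink;

    DO $$
    DECLARE
        i int;
    BEGIN
        -- Spawn 7 connections, each unioning its own slice of the table.
        FOR i IN 0..6 LOOP
            PERFORM dblink_connect('conn' || i, 'dbname=' || current_database());
            PERFORM dblink_send_query('conn' || i,
                format('SELECT ST_UnaryUnion(geom) FROM src_lines WHERE id %% 7 = %s', i));
        END LOOP;

        -- Wait for each connection's result, then clean up.
        FOR i IN 0..6 LOOP
            PERFORM * FROM dblink_get_result('conn' || i) AS t(g geometry);
            PERFORM dblink_disconnect('conn' || i);
        END LOOP;
    END $$;

As noted above, the catch is that the source table has to be committed before those spawned connections can see it.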

Postgres parallelism handles all of that, and I am getting the maximum number of workers. The issue I appear to be having is that Postgres allocates all of the rows to a single worker: because the geometries are TOASTed out of line, every row is well under the block size, and parallel sequential scans hand out work block-wise rather than row-wise, so one block (and therefore one worker) ends up with every row.
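As a hedged sketch, the per-worker distribution can be inspected, and the planner nudged toward more workers, along these lines (src_lines and geom are illustrative names; the settings themselves are standard PostgreSQL GUCs and storage parameters):

    -- Ask for 7 workers on this table and make parallel plans look cheap.
    ALTER TABLE src_lines SET (parallel_workers = 7);
    SET max_parallel_workers_per_gather = 7;
    SET parallel_setup_cost = 0;
    SET parallel_tuple_cost = 0;
    SET min_parallel_table_scan_size = 0;

    -- With VERBOSE, EXPLAIN ANALYZE reports per-worker row counts, which shows
    -- whether a single worker is consuming every block (and therefore every row).
    EXPLAIN (ANALYZE, VERBOSE)
    SELECT ST_UnaryUnion(geom) FROM src_lines;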

So I think what I need to do is figure out how to pad each row to just under the block size, so that each block holds a single row and each worker ends up reading one row per block.

To that end, I tried adding a dummy text column filled with lpad('', 2000, '0') and setting the storage parameter toast_tuple_target = 8160 (the maximum) on the table, assuming that would cancel out the TOASTing of the geometry column and force one row per block, but sadly this did not succeed.
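For concreteness, this is roughly what that attempt looks like in SQL (src_lines is again an illustrative table name). One caveat worth flagging: because the padding string is highly compressible, Postgres may simply compress it inline once TOAST processing is triggered, which works against the goal of fattening the main tuple.

    -- Raise the inline target to its maximum so wide tuples are kept in the heap page.
    ALTER TABLE src_lines SET (toast_tuple_target = 8160);

    -- Add the dummy padding column and fill it.
    ALTER TABLE src_lines ADD COLUMN pad text;
    UPDATE src_lines SET pad = lpad('', 2000, '0');

    -- Rewrite the table so the dead row versions left by the UPDATE are removed
    -- and each page reflects the new, wider tuples.
    VACUUM FULL src_lines;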

It is good to know that the psycopg2-style approach holds up across a large number of cores, though.



