<html xmlns:o="urn:schemas-microsoft-com:office:office" xmlns:w="urn:schemas-microsoft-com:office:word" xmlns:m="http://schemas.microsoft.com/office/2004/12/omml" xmlns="http://www.w3.org/TR/REC-html40">

<head>

<meta http-equiv="Content-Type" content="text/html; charset=Windows-1252">

<meta name="Generator" content="Microsoft Word 15 (filtered medium)">

<style><!--

/* Font Definitions */

@font-face

        {font-family:"Cambria Math";

        panose-1:2 4 5 3 5 4 6 3 2 4;}

@font-face

        {font-family:Calibri;

        panose-1:2 15 5 2 2 2 4 3 2 4;}

/* Style Definitions */

p.MsoNormal, li.MsoNormal, div.MsoNormal

        {margin:0in;

        font-size:11.0pt;

        font-family:"Calibri",sans-serif;}

a:link, span.MsoHyperlink

        {mso-style-priority:99;

        color:blue;

        text-decoration:underline;}

.MsoChpDefault

        {mso-style-type:export-only;

        font-family:"Calibri",sans-serif;}

@page WordSection1

        {size:8.5in 11.0in;

        margin:1.0in 1.0in 1.0in 1.0in;}

div.WordSection1

        {page:WordSection1;}

--></style>

</head>

<body lang="EN-US" link="blue" vlink="#954F72">

<div class="WordSection1">

<p class="MsoNormal">I need to union chunks of MultiLineStrings together efficiently before feeding each chunk to st_polygonize. I know that the st_union aggregate function does not parallelize the actual cascaded union operation, so I collected the MultiLineStrings into a table of GeometryCollections with roughly 20 rows of about 20 MB each.</p>

<p class="MsoNormal"><o:p> </o:p></p>

<p class="MsoNormal">I then set the table for maximum parallelization on sequential scan via</p>

<p class="MsoNormal">ALTER TABLE linework set (parallel_workers = 8);</p>

<p class="MsoNormal"><br>

And I then execute the simplest possible query in order to maximize parallelization:</p>

<p class="MsoNormal">set local effective_cache_size = '30GB';</p>

<p class="MsoNormal">set local maintenance_work_mem = '2GB';</p>

<p class="MsoNormal">set local default_statistics_target = 100;</p>

<p class="MsoNormal">set local random_page_cost = '1.1';</p>

<p class="MsoNormal">set local effective_io_concurrency = 200;</p>

<p class="MsoNormal">set local parallel_tuple_cost=0;</p>

<p class="MsoNormal">set local parallel_setup_cost = 0;</p>

<p class="MsoNormal">set local max_parallel_workers_per_gather = 8;</p>

<p class="MsoNormal">set local max_parallel_workers = 8;</p>

<p class="MsoNormal">set local max_parallel_maintenance_workers = 4;</p>

<p class="MsoNormal">set local min_parallel_table_scan_size = '1kB';</p>

<p class="MsoNormal"><o:p> </o:p></p>

<p class="MsoNormal">SELECT st_unaryunion(geom,.01) from linework;</p>

<p class="MsoNormal"><o:p> </o:p></p>

<p class="MsoNormal">As EXPLAIN ANALYSE shows, I do indeed get the maximum possible parallelization; however, the query still takes exactly as long as it does when executed sequentially:<br>

<br>

</p>
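
<p class="MsoNormal">For reference, a plan like the one shown next can be reproduced with a minimal sketch along these lines. The VERBOSE option is an assumption, inferred from the "Output:" lines in the plan, and the BEGIN/ROLLBACK wrapper is assumed because SET LOCAL only takes effect inside a transaction block. Note that EXPLAIN ANALYZE actually executes the query.</p>

<p class="MsoNormal">BEGIN;</p>

<p class="MsoNormal">-- the set local statements listed above go here, for example:</p>

<p class="MsoNormal">SET LOCAL max_parallel_workers_per_gather = 8;</p>

<p class="MsoNormal">-- VERBOSE (assumed) prints the Output: line and per-worker timings for each node</p>

<p class="MsoNormal">EXPLAIN (ANALYZE, VERBOSE) SELECT st_unaryunion(geom,.01) FROM linework;</p>

<p class="MsoNormal">ROLLBACK;</p>

<p class="MsoNormal"><o:p> </o:p></p>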

<p class="MsoNormal">Gather  (cost=1000.00..5113.59 rows=1270 width=32) (actual time=13618.949..182565.567 rows=19 loops=1)</p>

<p class="MsoNormal">"  Output: (st_unaryunion(geom, '0.01'::double precision))"</p>

<p class="MsoNormal">  Workers Planned: 8</p>

<p class="MsoNormal">  Workers Launched: 7</p>

<p class="MsoNormal">  ->  Parallel Seq Scan on linework  (cost=0.00..3986.59 rows=159 width=32) (actual time=1702.296..22820.584 rows=2 loops=8)</p>

<p class="MsoNormal">"        Output: st_unaryunion(geom, '0.01'::double precision)"</p>

<p class="MsoNormal">        Worker 0:  actual time=0.002..0.002 rows=0 loops=1</p>

<p class="MsoNormal">        Worker 1:  actual time=0.002..0.002 rows=0 loops=1</p>

<p class="MsoNormal">        Worker 2:  actual time=0.001..0.001 rows=0 loops=1</p>

<p class="MsoNormal">        Worker 3:  actual time=0.001..0.002 rows=0 loops=1</p>

<p class="MsoNormal">        Worker 4:  actual time=0.001..0.001 rows=0 loops=1</p>

<p class="MsoNormal">        Worker 5:  actual time=0.002..0.002 rows=0 loops=1</p>

<p class="MsoNormal">        Worker 6:  actual time=0.002..0.002 rows=0 loops=1</p>

<p class="MsoNormal">Planning Time: 0.072 ms</p>

<p class="MsoNormal">Execution Time: 182565.628 ms<br>

<br>

So it looks like Postgres is scanning the table in parallel but, for some inexplicable reason, running the massively expensive function AFTER the gather, which completely defeats the purpose of a parallel scan! Perhaps it is something related to TOAST tables? Again, this is st_unaryunion, not st_union, so I am perplexed as to why this is failing to work as intended. st_unaryunion is the only function in the entire query and is thus the only thing that can possibly be parallelized, and yet it isn’t, despite 7 workers being spawned.<br>

<br>

I also tried manually splitting the query up via UNION ALL:<br>

<br>

SELECT st_unaryunion(geom,.01) FROM linework WHERE chunk_id = 1<br>

UNION ALL<br>

SELECT st_unaryunion(geom,.01) FROM linework WHERE chunk_id = 2</p>

<p class="MsoNormal">UNION ALL<br>

SELECT st_unaryunion(geom,.01) FROM linework WHERE chunk_id = 3</p>

<p class="MsoNormal">UNION ALL<br>

SELECT st_unaryunion(geom,.01) FROM linework WHERE chunk_id = 4<br>

<br>

which also somehow manages to execute sequentially.<br>

<br>

How can I force the query planner to do the sane thing and distribute the st_unaryunion call for each chunk evenly across CPU cores? This seems like it should be possible for such a simple query on a small number of rows. I know I could try using smaller chunks, but that would create other performance problems not related to this query.<br>

<br>

POSTGIS="3.1.0alpha3 b2221ee48" [EXTENSION] PGSQL="130" GEOS="3.9.0dev-CAPI-1.14.0" PROJ="7.2.0" LIBXML="2.9.4" LIBJSON="0.12.1" LIBPROTOBUF="1.3.1" WAGYU="0.5.0 (Internal)" TOPOLOGY</p>


</div>

</body>

</html>