[postgis-devel] ST_Union Parallel Experiment

Regina Obe lr at pcorp.us
Tue Mar 13 08:14:50 PDT 2018


 

> I think the benchmark you do here does not cover a common case of big table grouped by some attributive column. If the dataset is reasonably clustered, and
> number of threads is smaller than number of groups, one can expect a Parallel Seq Scan to bring all the rows for one group most of the time, so that Cascaded
> Union is performed in parallel worker and then main worker is just passing the result upwards. Costs adjustments can be tricky for that though.

 

I may have misunderstood how this works, but in the case you describe I thought the ST_Union would happen after the data is partitioned across the worker nodes. The union step itself wouldn't be parallelized, but it would occur in each worker, so it would run in parallel for each set of groups.

Since ST_Union is marked as parallel safe, wouldn't it already be taking advantage of this?

 

To be honest, I've never tested this out, but that was the main impetus for marking ST_Union parallel safe: to allow the ST_Union to still happen in a worker node.
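
To make the scenario concrete, here is a toy Python model of what I have in mind. It is only a sketch under loud assumptions: plain sets stand in for geometries, set union stands in for ST_Union, and the names partial_union, combine, table and chunks are made up for illustration. It says nothing about which plan PostgreSQL actually picks, which is the part I have not tested.

```python
from collections import defaultdict


def partial_union(rows):
    """Worker step: union the stand-in 'geometries' per group within one chunk."""
    partial = defaultdict(set)
    for group, cells in rows:
        partial[group] |= cells
    return dict(partial)


def combine(partials):
    """Leader step: merge the per-worker partial results group by group."""
    final = defaultdict(set)
    for partial in partials:
        for group, cells in partial.items():
            final[group] |= cells
    return dict(final)


# Clustered table (assumption): rows of the same group sit next to each other.
table = [
    ("a", {1, 2}), ("a", {2, 3}),
    ("b", {10}), ("b", {11, 12}),
    ("c", {20}), ("c", {20, 21}),
]

# Hypothetical split of the scan into contiguous chunks, one per worker.
chunks = [table[0:3], table[3:6]]
partials = [partial_union(chunk) for chunk in chunks]
result = combine(partials)

# Groups seen by more than one worker are the only ones where the leader
# has to do a real union instead of just passing a result upward.
split_groups = sorted(g for g in result if sum(g in p for p in partials) > 1)
```

In this toy split only group "b" straddles two workers, so the leader passes "a" and "c" through untouched. With well-clustered data and fewer workers than groups, that would be the usual case.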

 

Thanks,

Regina

 

 
