[postgis-users] How does PostGIS / PostgreSQL distribute parallel work to cores for GIS type data?
Paul Ramsey
pramsey at cleverelephant.ca
Mon Mar 2 09:08:43 PST 2020
> On Mar 1, 2020, at 3:36 AM, Marco Boeringa <marco at boeringa.demon.nl> wrote:
>
> Although it is hard to give figures here, because I do not have a fully equivalent non-multi threaded processing flow, I do see significant benefits from distributing records based on vertex complexity.
Yes and no. The executor does not say “I have N records and C cores, so every core gets N/C records”.
It says “I still have records, here Core 1, have 10K”. “I still have records, here Core 2, have 10K”, and so on. The chunks are generally smaller than N/C, so the net effect over a large table is that all the cores stay busy most of the time.
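To make the difference concrete, here is a toy simulation, not PostgreSQL's actual executor code. It compares a fixed N/C split against handing the next chunk to whichever core is free first. The function names, the per-record costs, and the chunk size of 100 are all made up for illustration; the only assumption is that the expensive records (say, the ones with many vertices) are clustered at the end of the table.

```python
import heapq

def static_split(costs, cores):
    """Each core gets one contiguous slice of about N/C records.
    Returns the finish time of the slowest core."""
    n = len(costs)
    size = -(-n // cores)  # ceiling division
    return max(sum(costs[i:i + size]) for i in range(0, n, size))

def chunked_handout(costs, cores, chunk):
    """Whichever core becomes free first takes the next chunk.
    Returns the time at which the last core finishes."""
    free_at = [0] * cores  # min-heap of the times each core becomes free
    heapq.heapify(free_at)
    for i in range(0, len(costs), chunk):
        t = heapq.heappop(free_at)
        heapq.heappush(free_at, t + sum(costs[i:i + chunk]))
    return max(free_at)

# Hypothetical workload: 10,000 cheap records, then 2,000 records
# that each cost 20x as much to process.
costs = [1] * 10000 + [20] * 2000

print(static_split(costs, 4))          # slowest core with a fixed N/C split
print(chunked_handout(costs, 4, 100))  # slowest core with small chunks
```

With these made-up numbers the fixed split leaves one core holding almost all of the expensive records, while the chunked handout finishes at the ideal total/C, without anyone ever looking at vertex counts.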
In theory, carefully optimizing by handing out records based on vertices is a thing; in practice, it’s not a big deal.
One nuance I’m not 100% sure of is whether the master hands out records to workers in batches of num_records or batches of num_pages. If the latter, then the scheme would be very much like you propose anyway, since large records would take up more space on a page, and data volume would determine distribution, not record volume.
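A rough sketch of why page-based batches would even out data volume, under loud assumptions: every record is stored inline on its page (out-of-line TOAST storage is ignored), page headers and per-row overhead are ignored, and the record sizes and batch sizes are invented. The helper names are hypothetical too; none of this is PostgreSQL code.

```python
PAGE_BYTES = 8192  # PostgreSQL's default block size

def pack_into_pages(sizes, page_bytes=PAGE_BYTES):
    """Greedily fill pages in table order.
    Returns a list of pages, each a list of record sizes in bytes."""
    pages, current, used = [], [], 0
    for s in sizes:
        if current and used + s > page_bytes:
            pages.append(current)
            current, used = [], 0
        current.append(s)
        used += s
    if current:
        pages.append(current)
    return pages

def batch_bytes_by_records(sizes, num_records):
    """Bytes of data in each batch when batches hold a fixed record count."""
    return [sum(sizes[i:i + num_records])
            for i in range(0, len(sizes), num_records)]

def batch_bytes_by_pages(sizes, num_pages):
    """Bytes of data in each batch when batches hold a fixed page count."""
    pages = pack_into_pages(sizes)
    return [sum(sum(p) for p in pages[i:i + num_pages])
            for i in range(0, len(pages), num_pages)]

# Hypothetical table: 8,000 small records of 100 bytes each,
# followed by 400 large records of 2,000 bytes each.
sizes = [100] * 8000 + [2000] * 400

by_records = batch_bytes_by_records(sizes, 400)
by_pages = batch_bytes_by_pages(sizes, 10)

print(max(by_records) / min(by_records))  # spread with record-count batches
print(max(by_pages) / min(by_pages))      # spread with page-count batches
```

In this toy table the record-count batches differ in data volume by a factor of 20, while the page-count batches stay within a few percent of each other, which is the effect described above.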
ATB,
P