[GRASS-dev] multiprocessing problem

Anna Petrášová kratochanna at gmail.com
Fri Apr 8 07:33:43 PDT 2022


Hi Luca,

I would say the biggest problem is memory: I tried to run it and it
consumes far too much. Maybe you could process the differences for
each pixel (add them to a running sum) as they are computed, rather than
collecting them all and summing at the end. Apart from that, you can speed
it up significantly even on one core by using numpy in a better way:

vals = np.array([np.abs(y - array.flat) for y in array.flat])
...
out = np.sum(vals) / number2
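
A minimal sketch of the "sum as you go" idea, assuming the goal is the sum of absolute differences over all ordered pixel pairs. The names `pairwise_abs_diff_sum` and `chunk_size` are made up for illustration and are not from the addon; the final division by `number2` from the original script is left out because its definition is not shown here.

```python
import numpy as np


def pairwise_abs_diff_sum(array, chunk_size=1024):
    """Sum |a_i - a_j| over all ordered pixel pairs, one chunk at a time."""
    flat = np.asarray(array, dtype=np.float64).ravel()
    total = 0.0
    for start in range(0, flat.size, chunk_size):
        chunk = flat[start:start + chunk_size]
        # Broadcasting builds a (len(chunk), n) block, never the full n x n matrix.
        total += np.abs(chunk[:, None] - flat[None, :]).sum()
    return total


demo = np.array([[1.0, 2.0], [4.0, 8.0]])
total = pairwise_abs_diff_sum(demo, chunk_size=3)
```

Peak memory is then roughly `chunk_size * n` values instead of `n * n`, so `chunk_size` can be tuned to the available RAM.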


On Fri, Apr 8, 2022 at 5:17 AM Stefan Blumentrath <
Stefan.Blumentrath at nina.no> wrote:

> Ciao Luca,
>
> Yes, you could also consider looping over e.g. rows (maybe in combination
> with "np.apply_along_axis") so you could put results easier back together
> to a map if needed at a later stage.
>
> In addition, since you use multiprocessing.Manager, you may try to use
> multiprocessing.Array:
> https://docs.python.org/3/library/multiprocessing.html#multiprocessing.Array
>
> E.g. here:
>
> https://github.com/lucadelu/grass-addons/blob/5ca56bdb8b3394ebeed23aa5b3240bf6690e51bf/src/raster/r.raoq.area/r.raoq.area.py#L81
>
> According to the post here:
> https://medium.com/analytics-vidhya/using-numpy-efficiently-between-processes-1bee17dcb01
> multiprocessing.Array is needed to put the numpy array into shared memory
> and avoid pickling.
>
> I have not tried or investigated myself, but maybe worth a try...
>
> Cheers
> Stefan
>
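
A small, untested-in-the-addon sketch of what the `multiprocessing.Array` suggestion could look like: the raster values are copied once into a shared buffer and numpy works on a view of that buffer. The helper name `to_shared` is hypothetical, and float64 data is assumed.

```python
import multiprocessing as mp

import numpy as np


def to_shared(array):
    """Copy a numpy array into a multiprocessing.Array; return (buffer, view)."""
    source = np.ascontiguousarray(array, dtype=np.float64)
    # "d" is the typecode for C double; lock=False returns the raw ctypes array.
    shared = mp.Array("d", source.size, lock=False)
    # The view reuses the shared buffer, so no second copy of the data is kept.
    view = np.frombuffer(shared, dtype=np.float64).reshape(source.shape)
    view[:] = source
    return shared, view


shared, view = to_shared(np.arange(6.0).reshape(2, 3))
view[0, 0] = 42.0  # a write through the numpy view lands in the shared buffer
```

With the "fork" start method, workers started afterwards inherit `shared`; with "spawn" it would have to be handed to each worker explicitly, for example through the `initializer` and `initargs` arguments of `multiprocessing.Pool`.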
> -----Original Message-----
> From: grass-dev <grass-dev-bounces at lists.osgeo.org> On Behalf Of Luca
> Delucchi
> Sent: Friday 8 April 2022 10:46
> To: Moritz Lennert <mlennert at club.worldonline.be>
> Cc: GRASS-dev <grass-dev at lists.osgeo.org>
> Subject: Re: [GRASS-dev] multiprocessing problem
>
> On Fri, 8 Apr 2022 at 09:14, Moritz Lennert <mlennert at club.worldonline.be>
> wrote:
> >
> > Hi Luca,
> >
>
> Hi Moritz,
>
> > Just two brainstorming ideas:
> >
> > - From a rapid glance at the code it seems to me that you create a
> > separate worker for each row in the raster. Correct? AFAIR, spawning
> > workers does create quite a bit of overhead. Depending on the row to
> > column ratio of your raster, maybe you would be better off sending
> > larger chunks to workers?
> >
>
> right now I am creating a worker for each pixel to be checked against all
> the other pixels. Yes, it could be an idea to send larger chunks; I could
> split the array vertically according to the number of processors.
>
> > - Depending on the number of parallel jobs, disk access can quickly
> > become the bottleneck on non parallelized file systems. So it would be
> > interesting to see if using fewer processes might actually speed up
> > things. Then it is a question of finding the equilibrium.
> >
>
> ok, this makes sense
> thanks for your support
>
> > Moritz
> >
>
> --
> ciao
> Luca
>
>
> http://www.lucadelu.org/
> _______________________________________________
> grass-dev mailing list
> grass-dev at lists.osgeo.org
> https://lists.osgeo.org/mailman/listinfo/grass-dev
>