<div dir="ltr">Hi Luca,<div><br></div><div>I would say the biggest problem is the memory, I tried to run it and it consumes way too much memory. Maybe you could process the differences from each pixel (compute the sum) as they are computed, not collect it and do it in the end. Otherwise you can significantly speed it up simply with one core by using numpy in a better way:</div><div><br></div><div>vals = np.array([np.abs(y - array.flat) for y in array.flat])<br></div><div>...</div><div>out = np.sum(vals) / number2<br></div><div><br></div></div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Fri, Apr 8, 2022 at 5:17 AM Stefan Blumentrath <<a href="mailto:Stefan.Blumentrath@nina.no">Stefan.Blumentrath@nina.no</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">Ciao Luca,<br>
<br>
Yes, you could also consider looping over e.g. rows (maybe in combination with "np.apply_along_axis"), so that you could more easily put the results back together into a map if needed at a later stage.<br>
<br>
In addition, since you currently use multiprocessing.Manager, you could try multiprocessing.Array instead: <a href="https://docs.python.org/3/library/multiprocessing.html#multiprocessing.Array" rel="noreferrer" target="_blank">https://docs.python.org/3/library/multiprocessing.html#multiprocessing.Array</a><br>
<br>
E.g. here:<br>
<a href="https://github.com/lucadelu/grass-addons/blob/5ca56bdb8b3394ebeed23aa5b3240bf6690e51bf/src/raster/r.raoq.area/r.raoq.area.py#L81" rel="noreferrer" target="_blank">https://github.com/lucadelu/grass-addons/blob/5ca56bdb8b3394ebeed23aa5b3240bf6690e51bf/src/raster/r.raoq.area/r.raoq.area.py#L81</a><br>
<br>
According to the post here: <a href="https://medium.com/analytics-vidhya/using-numpy-efficiently-between-processes-1bee17dcb01" rel="noreferrer" target="_blank">https://medium.com/analytics-vidhya/using-numpy-efficiently-between-processes-1bee17dcb01</a><br>
multiprocessing.Array is needed to put the numpy array into shared memory and avoid pickling.<br>
<br>
I have not tried or investigated this myself, but it may be worth a try...<br>
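Roughly, the pattern could look something like this (a minimal, untested sketch; the dtype, raster size and the per-row worker function are placeholders, not the actual r.raoq.area code):<br>
<br>
import ctypes<br>
import multiprocessing as mp<br>
import numpy as np<br>
<br>
def init_worker(shared, shape):<br>
    # rebuild a numpy view on the shared buffer inside each worker,<br>
    # so the raster data itself is never pickled<br>
    global shared_np<br>
    shared_np = np.frombuffer(shared, dtype=np.float64).reshape(shape)<br>
<br>
def row_sum(row):<br>
    # placeholder work: one row of pixels against all pixels<br>
    return float(np.abs(shared_np[row][:, None] - shared_np.ravel()[None, :]).sum())<br>
<br>
if __name__ == "__main__":<br>
    data = np.random.random((100, 100))  # placeholder raster<br>
    shared = mp.Array(ctypes.c_double, data.size, lock=False)<br>
    np.frombuffer(shared, dtype=np.float64)[:] = data.ravel()<br>
    with mp.Pool(4, initializer=init_worker, initargs=(shared, data.shape)) as pool:<br>
        total = sum(pool.map(row_sum, range(data.shape[0])))<br>
<br>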
<br>
Cheers<br>
Stefan<br>
<br>
-----Original Message-----<br>
From: grass-dev <<a href="mailto:grass-dev-bounces@lists.osgeo.org" target="_blank">grass-dev-bounces@lists.osgeo.org</a>> On Behalf Of Luca Delucchi<br>
Sent: fredag 8. april 2022 10:46<br>
To: Moritz Lennert <<a href="mailto:mlennert@club.worldonline.be" target="_blank">mlennert@club.worldonline.be</a>><br>
Cc: GRASS-dev <<a href="mailto:grass-dev@lists.osgeo.org" target="_blank">grass-dev@lists.osgeo.org</a>><br>
Subject: Re: [GRASS-dev] multiprocessing problem<br>
<br>
On Fri, 8 Apr 2022 at 09:14, Moritz Lennert <<a href="mailto:mlennert@club.worldonline.be" target="_blank">mlennert@club.worldonline.be</a>> wrote:<br>
><br>
> Hi Luca,<br>
><br>
<br>
Hi Moritz,<br>
<br>
> Just two brainstorming ideas:<br>
><br>
> - From a rapid glance at the code, it seems to me that you create a separate worker for each row in the raster. Correct? AFAIR, spawning workers creates quite a bit of overhead. Depending on the row-to-column ratio of your raster, maybe you would be better off sending larger chunks to the workers?<br>
><br>
<br>
right now I am creating a worker for each pixel, which is then checked against all the other pixels. Yes, it could be an idea to send larger chunks; I could split the array vertically according to the number of processors.<br>
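Something like this, maybe (only an untested sketch to illustrate the splitting; the per-chunk function is a placeholder, not the current r.raoq.area code):<br>
<br>
import numpy as np<br>
from multiprocessing import Pool<br>
<br>
def chunk_sum(args):<br>
    chunk, flat = args<br>
    # one larger block of pixels compared against all pixels<br>
    return float(np.abs(chunk[:, None] - flat[None, :]).sum())<br>
<br>
def rao_sum_parallel(array, nproc):<br>
    flat = array.ravel()<br>
    chunks = np.array_split(flat, nproc)  # one chunk per processor<br>
    with Pool(nproc) as pool:<br>
        return sum(pool.map(chunk_sum, [(c, flat) for c in chunks]))<br>
<br>
# total = rao_sum_parallel(array, nproc=4)<br>
<br>
(this still pickles the whole flattened array once per chunk, so combining it with a shared array would save that copy as well)<br>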
<br>
> - Depending on the number of parallel jobs, disk access can quickly become the bottleneck on non-parallelized file systems, so it would be interesting to see whether using fewer processes actually speeds things up. Then it is a question of finding the right balance.<br>
><br>
<br>
OK, this makes sense.<br>
Thanks for your support.<br>
<br>
> Moritz<br>
><br>
<br>
--<br>
ciao<br>
Luca<br>
<br>
<a href="https://eur01.safelinks.protection.outlook.com/?url=http%3A%2F%2Fwww.lucadelu.org%2F&data=04%7C01%7CStefan.Blumentrath%40nina.no%7C8821ac9b35674720f9b908da193c3cab%7C6cef373021314901831055b3abf02c73%7C0%7C0%7C637850043869911903%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000&sdata=xt8x5QeXm3h1eJIYq9aRbBMHAWXaaYAI9yYNqKMj3mg%3D&reserved=0" rel="noreferrer" target="_blank">https://eur01.safelinks.protection.outlook.com/?url=http%3A%2F%2Fwww.lucadelu.org%2F&data=04%7C01%7CStefan.Blumentrath%40nina.no%7C8821ac9b35674720f9b908da193c3cab%7C6cef373021314901831055b3abf02c73%7C0%7C0%7C637850043869911903%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000&sdata=xt8x5QeXm3h1eJIYq9aRbBMHAWXaaYAI9yYNqKMj3mg%3D&reserved=0</a><br>
_______________________________________________<br>
grass-dev mailing list<br>
<a href="mailto:grass-dev@lists.osgeo.org" target="_blank">grass-dev@lists.osgeo.org</a><br>
<a href="https://lists.osgeo.org/mailman/listinfo/grass-dev" rel="noreferrer" target="_blank">https://lists.osgeo.org/mailman/listinfo/grass-dev</a><br>
</blockquote></div>
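<div><br></div><div>P.S. Here is the rough single-core sketch mentioned above: it accumulates the pairwise absolute differences block by block instead of building the full matrix in memory (it assumes the same "array" and "number2" as in the snippet at the top; the block size is an arbitrary choice and the code is untested):</div><div><br></div>
<div>import numpy as np<br></div>
<div><br></div>
<div>def rao_sum(array, block=1024):<br></div>
<div>    # sum of |x_i - x_j| over all pixel pairs, accumulated block by block to keep memory bounded<br></div>
<div>    flat = array.ravel()<br></div>
<div>    total = 0.0<br></div>
<div>    for start in range(0, flat.size, block):<br></div>
<div>        chunk = flat[start:start + block]<br></div>
<div>        total += np.abs(chunk[:, None] - flat[None, :]).sum()<br></div>
<div>    return total<br></div>
<div><br></div>
<div># out = rao_sum(array) / number2<br></div>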