[GRASS-dev] multiprocessing problem

Anna Petrášová kratochanna at gmail.com
Mon Apr 11 09:02:18 PDT 2022


As I said, you can sum the values for each pixel as you go, so you never
store all the differences. That gets rid of the memory problem, but of
course it will still be slow if it's not parallelized:

vals = np.array([np.sum(np.abs(y - array.flat)) for y in array.flat])
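To make the memory difference concrete, here is a small sketch on a toy raster (the 2x2 `array` is made up for illustration, it is not your data). Storing every difference needs n * n values at once; assuming float64, 80000 cells would be roughly 80000 * 80000 * 8 bytes, about 51 GB, which would explain the killed process on a 16GB machine. Summing per pixel only ever holds one row of n differences, and the total should come out the same:

```python
import numpy as np

# Hypothetical tiny raster standing in for the real map.
array = np.array([[1.0, 2.0], [4.0, 8.0]])
flat = array.ravel()

# Storing everything: an n x n matrix of absolute differences.
full = np.abs(flat[:, None] - flat[None, :])

# Summing per pixel: only one row of n differences exists at a time.
vals = np.array([np.sum(np.abs(y - flat)) for y in flat])

# Both routes give the same total over all ordered pixel pairs.
total_full = full.sum()
total_vals = vals.sum()
```

This only checks that the two ways of summing agree, not that the sum is the right quantity for the index.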

Note that I didn't check thoroughly whether the computation itself is
correct, i.e. whether you get the correct value in terms of the index
definition. One other idea is to avoid some of the computations, since you
are in fact computing each distance twice (from pixel 1 to pixel 2 and vice
versa). Also, do you actually need to compute this for the entire raster?
Shouldn't this be more of a moving window approach, where you restrict the
distance computation to a window around each pixel?
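A rough sketch of both ideas, under some assumptions: the function names `pairwise_sum_half` and `window_sums` and the `size` parameter are made up for illustration, the window is simply clipped at the raster edges, and whether a windowed sum matches the index definition is still an open question.

```python
import numpy as np

def pairwise_sum_half(flat):
    """Sum |a_i - a_j| over all ordered pairs, visiting each unordered pair once."""
    total = 0.0
    for i in range(len(flat) - 1):
        # Compare pixel i only with the pixels after it, then count both directions.
        total += 2.0 * np.sum(np.abs(flat[i + 1:] - flat[i]))
    return total

def window_sums(array, size=3):
    """Per pixel, sum |center - neighbor| over a size x size window (clipped at edges)."""
    half = size // 2
    rows, cols = array.shape
    out = np.zeros(array.shape, dtype=float)
    for r in range(rows):
        for c in range(cols):
            # Slice the neighborhood around (r, c), clipping at the raster border.
            win = array[max(r - half, 0):r + half + 1,
                        max(c - half, 0):c + half + 1]
            out[r, c] = np.sum(np.abs(win - array[r, c]))
    return out

# Hypothetical toy raster, only for illustration.
array = np.array([[1.0, 2.0], [4.0, 8.0]])
half_total = pairwise_sum_half(array.ravel())
local = window_sums(array, size=3)
```

The first function halves the number of differences but is still quadratic in the number of cells; the window version scales with the number of cells times the window area, so it is the one that would actually make large rasters feasible.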

Anna

On Mon, Apr 11, 2022 at 2:09 AM Luca Delucchi <lucadeluge at gmail.com> wrote:

> On Fri, 8 Apr 2022 at 16:33, Anna Petrášová <kratochanna at gmail.com> wrote:
> >
> > Hi Luca,
> >
>
> Hi Anna,
>
> > I would say the biggest problem is the memory. I tried to run it and it
> > consumes way too much memory. Maybe you could process the differences
> > from each pixel (compute the sum) as they are computed, rather than
> > collecting them and summing at the end. Otherwise you can significantly
> > speed it up even on one core by using numpy in a better way:
> >
> > vals = np.array([np.abs(y - array.flat) for y in array.flat])
> > ...
> > out = np.sum(vals) / number2
> >
>
> yes, this works better than my solution, but when I increase the number
> of pixels the process gets killed. I have 16GB RAM and I was not able to
> process 80000 cells....
> I tried a few different solutions but the result is always the same.
>
> --
> ciao
> Luca
>
> www.lucadelu.org
>