[GRASS-dev] [SoC] GSoC 2021 - Final Report [Parallelization of raster modules for GRASS GIS]

Veronica Andreo veroandreo at gmail.com
Tue Aug 24 05:48:19 PDT 2021


Dear Aaron,

Thanks for your report and all your work to make GRASS raster modules run
faster, we like that :) Huge thanks as well to Vaclav, Huidae and Maris for
their commitment to the project and mentoring! You all did great
teamwork!

One question: are these changes planned to be merged before the creation of
the GRASS 8 branch or will they remain for 8+? Maybe add a milestone to
clarify?
One minor observation: all links currently point to the same PR.

We hope to keep seeing you around, Aaron!

All the best,
Vero

On Tue, 24 Aug 2021 at 3:18, Aaron Saw Min Sern (<aaronsms at u.nus.edu>)
wrote:

> Hello everyone,
>
> Here is the final report for my GSoC 2021 project, *Parallelization of
> Raster Modules for GRASS GIS*.
>
> *Abstract*
> The goal of this project is to introduce parallelization to existing
> raster modules in GRASS GIS using OpenMP. This will allow users to take
> advantage of more cores in their hardware to reduce computation time,
> especially for large raster files with a high computational cost. The key
> challenge of this project is to separate the parallelizable components from
> the sequential part of the modules without introducing too much overhead in
> terms of memory, disk or computation resources.
>
> *Milestones*
>
> In total, I have introduced OpenMP support to 8 raster modules in GRASS
> GIS. The pull requests for each module are as follows:
>
>    - r.univar - https://github.com/OSGeo/grass/pull/1634
>    - r.neighbors - https://github.com/OSGeo/grass/pull/1724
>    - r.mfilter - https://github.com/OSGeo/grass/pull/1708
>    - r.resamp.filter - https://github.com/OSGeo/grass/pull/1759
>    - r.resamp.interp - https://github.com/OSGeo/grass/pull/1771
>    - r.slope.aspect - https://github.com/OSGeo/grass/pull/1767
>    - r.series - https://github.com/OSGeo/grass/pull/1776
>    - r.patch - https://github.com/OSGeo/grass/pull/1782
>
> Firstly, I greatly underestimated the complexity of the work. Up to 20
> modules were initially proposed, but after the second week it became clear
> that we had to cut down on the number of target modules and focus more on
> designing the algorithms. The modules we targeted behave differently from
> modules that had received OpenMP support in the past, such as *r.sun*. In
> particular, the modules need to keep their low memory footprint even after
> the parallelization, unlike *r.sun*, which loads the entire raster map
> into memory.
>
>
> During the first half of GSoC, through discussions with the mentors, we
> came up with three different approaches to introducing parallel support to
> *r.neighbors*. After benchmarking their performance and taking into account
> their memory/disk usage, we decided to settle on the last approach, which
> requires us to add an extra parameter *memory* to allow users to adjust
> their memory consumption. With this approach, the modules have to process
> the raster map in chunks. Once we had settled on the design, we started
> applying the same approach to other similar modules with low memory
> footprints.
>
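The chunked design described above can be illustrated with a minimal sketch. It is not the code from the pull requests: the function names (rows_per_chunk, process_in_chunks), the 8-byte cell size, and the callback-based row I/O are all assumptions made for illustration, and a Python thread pool stands in for the OpenMP threads. Only the general idea comes from the report: a *memory* budget bounds how many rows are buffered, and the raster is processed one chunk at a time.

```python
from concurrent.futures import ThreadPoolExecutor


def rows_per_chunk(memory_mb, ncols, bytes_per_cell=8):
    """Return how many raster rows fit in the memory budget (at least 1).

    Hypothetical helper: a real module would also budget for output buffers.
    """
    budget_bytes = int(memory_mb * 1024 * 1024)
    return max(1, budget_bytes // (ncols * bytes_per_cell))


def process_in_chunks(read_row, write_row, nrows, ncols, memory_mb,
                      nthreads, row_func):
    """Apply row_func to every row while buffering only one chunk at a time."""
    chunk = rows_per_chunk(memory_mb, ncols)
    with ThreadPoolExecutor(max_workers=nthreads) as pool:
        for start in range(0, nrows, chunk):
            end = min(start + chunk, nrows)
            # Read one chunk of rows into memory.
            rows = [read_row(r) for r in range(start, end)]
            # Compute the rows of this chunk in parallel; map() keeps row order.
            results = list(pool.map(row_func, rows))
            # Write the finished chunk sequentially, in row order.
            for r, out in zip(range(start, end), results):
                write_row(r, out)
```

A smaller memory value gives smaller chunks and therefore more sequential read/write phases, which is the trade-off the *memory* parameter exposes to the user. The sketch uses a per-row function for simplicity; a neighborhood operation would additionally need the rows bordering each chunk.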
> For more information regarding the implementation, see
> https://grasswiki.osgeo.org/wiki/Raster_Parallelization_with_OpenMP.
>
> Furthermore, test scripts were included in the modules to ensure the
> consistency of the results. Benchmark scripts were added to allow users to
> easily benchmark the parallelization and monitor the speedup on their own
> local machine. User documentation was also updated to include sections
> detailing how to make use of the newly added features.
>
> *Future Work*
>
> In the future, more raster modules can be parallelized using a similar
> approach. Then, we can consider tackling more complex modules such as
> *r.watershed* and *r.mapcalc*. We could also consider exploring 3D
> raster modules.
>
>
> Furthermore, when we implemented parallelization for *r.univar*, we noticed
> that modules that produce statistics involving arithmetic can often have
> floating-point discrepancies when dealing with large summations. Because of
> this, computation using a different number of threads can now produce
> different results, since the arithmetic is performed in a different order.
> One idea would be to introduce the *Kahan summation algorithm* to reduce the
> floating-point discrepancies. However, this still would not guarantee the
> consistency of results.
>
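The order-dependence noted above can be reproduced in a few lines. The sketch below is only an illustration, not code from *r.univar*: naive_sum, kahan_sum and chunked_sum are hypothetical names, and chunked_sum merely imitates a parallel reduction by summing contiguous slices (one per assumed "thread") and then combining the partial sums. An explicit loop is used for the naive sum because the built-in sum() in recent Python versions already applies compensated summation to floats.

```python
def naive_sum(values):
    """Plain left-to-right floating-point summation."""
    total = 0.0
    for v in values:
        total += v
    return total


def kahan_sum(values):
    """Kahan (compensated) summation: carry the rounding error forward."""
    total = 0.0
    comp = 0.0  # running compensation for lost low-order bits
    for v in values:
        y = v - comp
        t = total + y
        comp = (t - total) - y  # what was lost when adding y to total
        total = t
    return total


def chunked_sum(values, nthreads, summer):
    """Imitate a parallel reduction: per-slice partial sums, then combine."""
    size = -(-len(values) // nthreads)  # ceiling division
    partials = [summer(values[i:i + size])
                for i in range(0, len(values), size)]
    return summer(partials)
```

With values = [1.0] + [1e-16] * 10, chunked_sum(values, 1, naive_sum) and chunked_sum(values, 2, naive_sum) differ, because each tiny term is rounded away when added to 1.0 on its own but survives when the tiny terms are first summed together. Passing kahan_sum as the summer brings the results much closer to the exact sum, but the partial sums are still rounded before being combined, so results from different thread counts may still differ in the last bits.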
> *Permanent Links*
>
> For the project overview, please visit
> https://summerofcode.withgoogle.com/dashboard/project/6280792767987712/overview/.
> For the project timeline and logs, please visit
> https://trac.osgeo.org/grass/wiki/GSoC/2021/RasterParallelization.
>
> I would like to give huge thanks to Huidae Cho, Vaclav Petras and Māris
> Nartišs for their valuable guidance. I would also like to thank the GRASS
> community for the valuable feedback and support. Lastly, I would like to
> thank the GSoC team for this opportunity.
>
> Thanks all!
>
> Warmest regards,
> Aaron Saw Min Sern
>
>
>
> _______________________________________________
> grass-dev mailing list
> grass-dev at lists.osgeo.org
> https://lists.osgeo.org/mailman/listinfo/grass-dev
>