[GRASS-dev] [SoC] GSoC 2021 - Final Report [Parallelization of raster modules for GRASS GIS]
Moritz Lennert
mlennert at club.worldonline.be
Mon Sep 13 08:32:59 PDT 2021
Hi Aaron,
I have a use case for which I would like to try the parallelized
r.neighbors. Do I just have to apply PR 1724 to current main to be able
to do this, or do I have to do something else?
Moritz
On 25.08.2021 15:59, Aaron Saw Min Sern wrote:
> Hi all,
>
> Thanks Vero for spotting the mistakes in the links. The formatting of
> the links must have gone wrong, but here are the links to the
> respective PRs.
>
> * r.univar - 1634 [1]
> * r.neighbors - 1724 [2]
> * r.mfilter - 1708 [3]
> * r.resamp.filter - 1759 [4]
> * r.resamp.interp - 1771 [5]
> * r.slope.aspect - 1767 [6]
> * r.series - 1776 [7]
> * r.patch - 1782 [8]
>
> I will still be working on getting the checklists completed in the
> next few weeks.
>
> Warmest regards,
> Aaron
>
> -------------------------
>
> FROM: Veronica Andreo <veroandreo at gmail.com>
> SENT: Tuesday, August 24, 2021 8:48 PM
> TO: Aaron Saw Min Sern <aaronsms at u.nus.edu>
> CC: soc at lists.osgeo.org <soc at lists.osgeo.org>;
> grass-dev at lists.osgeo.org <grass-dev at lists.osgeo.org>
> SUBJECT: Re: [GRASS-dev] [SoC] GSoC 2021 - Final Report
> [Parallelization of raster modules for GRASS GIS]
>
> Dear Aaron,
>
> Thanks for your report and all your work to make GRASS raster modules
> run faster, we like that :) Huge thanks as well to Vaclav, Huidae and
> Maris for their commitment to the project and mentoring! You all did
> great teamwork!
>
> One question: are these changes planned to be merged before the
> creation of the GRASS 8 branch or will they remain for 8+? Maybe add a
> milestone to clarify?
>
> One minor observation: all links currently point to the same PR
>
> We hope to keep seeing you around, Aaron!
>
> All the best,
> Vero
>
> On Tue, 24 Aug 2021 at 3:18, Aaron Saw Min Sern
> (<aaronsms at u.nus.edu>) wrote:
>
>> Hello everyone,
>>
>> Here is the final report for my GSoC 2021 project, _Parallelization
>> of Raster Modules for GRASS GIS_.
>>
>> ABSTRACT
>> The goal of this project is to introduce parallelization to
>> existing raster modules in GRASS GIS using OpenMP. This will allow
>> users to take advantage of more cores in their hardware to reduce
>> computation time, especially for large raster files with a high
>> computation cost. The key challenge of this project is to separate
>> the parallelizable components from the sequential part of the
>> modules without introducing too much overhead in terms of memory,
>> disk or computation resources.
>>
>> MILESTONES
>>
>> In total, I have introduced OpenMP support to 8 raster modules in
>> GRASS GIS. The pull requests to each module are as follows:
>>
>> * r.univar - https://github.com/OSGeo/grass/pull/1634 [1]
>> * r.neighbors - https://github.com/OSGeo/grass/pull/1724 [2]
>> * r.mfilter - https://github.com/OSGeo/grass/pull/1708 [3]
>> * r.resamp.filter - https://github.com/OSGeo/grass/pull/1759 [4]
>> * r.resamp.interp - https://github.com/OSGeo/grass/pull/1771 [5]
>> * r.slope.aspect - https://github.com/OSGeo/grass/pull/1767 [6]
>> * r.series - https://github.com/OSGeo/grass/pull/1776 [7]
>> * r.patch - https://github.com/OSGeo/grass/pull/1782 [8]
>>
>> Firstly, I greatly underestimated the complexity of the work. Up
>> to 20 modules were initially proposed, but after the second week it
>> became clear that we had to cut down on the number of target
>> modules and focus more on designing the algorithms. The modules we
>> targeted behave differently from modules that had received OpenMP
>> support in the past, such as _r.sun_. In particular, they need to
>> keep their low memory footprint even after parallelization, unlike
>> _r.sun_, which loads the entire raster map into memory.
>>
>> During the first half of GSoC, through discussion with the
>> mentors, we came up with three different approaches to introducing
>> parallel support to _r.neighbors_. After benchmarking their
>> performance and taking their memory/disk usage into account, we
>> settled on the last approach, which adds an extra parameter,
>> _memory_, that lets users adjust memory consumption. With this
>> approach, the modules process the raster map in chunks. Once we
>> had settled on the design, we applied the same approach to other
>> similar modules with low memory footprints.
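
The chunked design described above can be illustrated with a minimal
sketch. This is not the code from the pull requests (those are C with
OpenMP); it is a Python illustration of the idea only. The function
names, the 8-byte cell size and the moving-window mean are assumptions
made for the example, and the toy raster is held fully in memory here,
whereas a real module would read only the rows of the current chunk.
Python threads are used just to show the structure and are not expected
to give a real speedup.

```python
from concurrent.futures import ThreadPoolExecutor


def rows_per_chunk(memory_mb, ncols, bytes_per_cell=8, halo=0):
    """Rows that fit in the memory budget, leaving room for halo rows.

    Hypothetical helper: bytes_per_cell=8 assumes double-precision cells.
    Always returns at least 1 so that processing can make progress.
    """
    budget = int(memory_mb * 1024 * 1024)
    row_bytes = ncols * bytes_per_cell
    return max(1, budget // row_bytes - 2 * halo)


def mean_filter_row(raster, r, radius):
    """Moving-window mean for output row r; the window is clipped at edges."""
    nrows, ncols = len(raster), len(raster[0])
    out = []
    for c in range(ncols):
        total, count = 0.0, 0
        for rr in range(max(0, r - radius), min(nrows, r + radius + 1)):
            for cc in range(max(0, c - radius), min(ncols, c + radius + 1)):
                total += raster[rr][cc]
                count += 1
        out.append(total / count)
    return out


def filter_in_chunks(raster, radius, memory_mb, nprocs):
    """Process the raster chunk by chunk; rows inside a chunk run in parallel."""
    nrows, ncols = len(raster), len(raster[0])
    step = rows_per_chunk(memory_mb, ncols, halo=radius)
    result = []
    with ThreadPoolExecutor(max_workers=nprocs) as pool:
        for start in range(0, nrows, step):
            rows = range(start, min(start + step, nrows))
            # Each output row depends only on the input, so rows are
            # independent; pool.map returns them in row order.
            result.extend(
                pool.map(lambda r: mean_filter_row(raster, r, radius), rows)
            )
    return result
```

In this sketch the memory budget only changes how many rows are handled
per chunk, and the number of workers only changes how rows are shared
out, so neither should change the output values.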
>>
>> For more information regarding the implementation, see
>> https://grasswiki.osgeo.org/wiki/Raster_Parallelization_with_OpenMP
>> [9].
>>
>> Furthermore, test scripts were included in the modules to ensure the
>> consistency of the results. Benchmark scripts were added so that
>> users can easily measure the speedup from parallelization on their
>> own local machine. The user documentation was also updated with
>> sections detailing how to make use of the newly added features.
>>
>> FUTURE WORK
>>
>> In the future, more raster modules can be parallelized using a
>> similar approach. Then, we can consider tackling more complex
>> modules such as _r.watershed_ and _r.mapcalc_. We could also
>> consider exploring 3D raster modules.
>>
>> Furthermore, when we implemented parallelization for _r.univar_, we
>> noticed that modules producing statistics that involve arithmetic
>> can show floating-point discrepancies when summing many values.
>> Because floating-point addition is not associative, runs with
>> different numbers of threads add the values in a different order
>> and can therefore produce slightly different results. One idea
>> would be to introduce the _Kahan summation algorithm_ to reduce the
>> floating-point discrepancies. However, this still would not
>> guarantee the consistency of results.
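
To make the summation issue concrete, here is a small sketch in Python
(not taken from r.univar; all function names are made up for the
example). It compares a plain running sum with Kahan compensated
summation, and mimics a threaded reduction by summing hypothetical
per-thread chunks first and then combining the partial sums.

```python
def naive_sum(values):
    """Plain left-to-right floating-point sum."""
    total = 0.0
    for v in values:
        total += v
    return total


def kahan_sum(values):
    """Kahan compensated summation: carries the rounding error forward."""
    total = 0.0
    comp = 0.0  # running compensation for lost low-order bits
    for v in values:
        y = v - comp
        t = total + y
        comp = (t - total) - y  # what was lost when adding y to total
        total = t
    return total


def chunked_sum(values, nchunks, summer):
    """Sum nchunks partial sums (one per hypothetical thread), then combine."""
    size = -(-len(values) // nchunks)  # ceiling division
    partials = [summer(values[i:i + size]) for i in range(0, len(values), size)]
    return summer(partials)


# One huge value followed by small ones: each 1.0 is below the rounding
# granularity of 1e16, so the plain sum drops them one by one.
values = [1e16] + [1.0] * 10
print(naive_sum(values))                  # large value first
print(naive_sum(values[::-1]))            # same numbers, small values first
print(kahan_sum(values))                  # compensated sum
print(chunked_sum(values, 1, naive_sum))  # "1 thread"
print(chunked_sum(values, 2, naive_sum))  # "2 threads"
```

On IEEE 754 double precision the plain sum depends on the order of the
values and on how they are split into chunks. Kahan summation recovers
the small terms within a single pass, but the compensation term is
dropped at the end of each chunk, so chunked results can still differ
from the single-pass result.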
>>
>> PERMANENT LINKS
>>
>> For the project overview, please visit
>> https://summerofcode.withgoogle.com/dashboard/project/6280792767987712/overview/
>> [10].
>> For the project timeline and logs, please visit
>> https://trac.osgeo.org/grass/wiki/GSoC/2021/RasterParallelization
>> [11].
>>
>> I would like to give huge thanks to Huidae Cho, Vaclav Petras and
>> Māris Nartišs for their valuable guidance. I would also like to
>> thank the GRASS community for the valuable feedback and support.
>> Lastly, I would like to thank the GSoC team for this opportunity.
>>
>> Thanks all!
>>
>> Warmest regards,
>> Aaron Saw Min Sern
>>
>> _______________________________________________
>> grass-dev mailing list
>> grass-dev at lists.osgeo.org
>> https://lists.osgeo.org/mailman/listinfo/grass-dev [12]
>
>
> Links:
> ------
> [1] https://github.com/OSGeo/grass/pull/1634
> [2] https://github.com/OSGeo/grass/pull/1724
> [3] https://github.com/OSGeo/grass/pull/1708
> [4] https://github.com/OSGeo/grass/pull/1759
> [5] https://github.com/OSGeo/grass/pull/1771
> [6] https://github.com/OSGeo/grass/pull/1767
> [7] https://github.com/OSGeo/grass/pull/1776
> [8] https://github.com/OSGeo/grass/pull/1782
> [9] https://grasswiki.osgeo.org/wiki/Raster_Parallelization_with_OpenMP
> [10] https://summerofcode.withgoogle.com/dashboard/project/6280792767987712/overview/
> [11] https://trac.osgeo.org/grass/wiki/GSoC/2021/RasterParallelization
> [12] https://lists.osgeo.org/mailman/listinfo/grass-dev