[GRASS-dev] [SoC] GSoC 2021 - Final Report [Parallelization of raster modules for GRASS GIS]

Moritz Lennert mlennert at club.worldonline.be
Mon Sep 13 08:32:59 PDT 2021


Hi Aaron,

I have a use case for which I would like to try the parallelized
r.neighbors. Do I just have to apply PR 1724 to current main to be able
to do this, or do I have to do something else?

Moritz

On 25.08.2021 15:59, Aaron Saw Min Sern wrote:
> Hi all,
> 
>  Thanks Vero for spotting the mistakes in the links. The formatting of
> the links must have gone wrong, but here are the links to the respective
> PRs.
> 
>  * r.univar - 1634 [1]
>  * r.neighbors - 1724 [2]
>  * r.mfilter - 1708 [3]
>  * r.resamp.filter - 1759 [4]
>  * r.resamp.interp - 1771 [5]
>  * r.slope.aspect - 1767 [6]
>  * r.series - 1776 [7]
>  * r.patch - 1782 [8]
> 
>  I will still be working on getting the checklists completed in the
> next few weeks.
> 
>  Warmest regards,
>  Aaron
> 
> -------------------------
> 
> FROM: Veronica Andreo <veroandreo at gmail.com>
>  SENT: Tuesday, August 24, 2021 8:48 PM
>  TO: Aaron Saw Min Sern <aaronsms at u.nus.edu>
>  CC: soc at lists.osgeo.org <soc at lists.osgeo.org>;
> grass-dev at lists.osgeo.org <grass-dev at lists.osgeo.org>
>  SUBJECT: Re: [GRASS-dev] [SoC] GSoC 2021 - Final Report
> [Parallelization of raster modules for GRASS GIS]
> 
> Dear Aaron,
> 
> Thanks for your report and all your work to make GRASS raster modules
> run faster, we like that :) Huge thanks as well to Vaclav, Huidae and
> Maris for their commitment to the project and mentoring! You all did
> great teamwork!
> 
> One question: are these changes planned to be merged before the
> creation of the GRASS 8 branch, or will they remain for 8+? Maybe add a
> milestone to clarify?
> 
> One minor observation: all links currently point to the same PR
> 
> We hope to keep seeing you around, Aaron!
> 
> All the best,
> Vero
> 
> On Tue, 24 Aug 2021 at 3:18, Aaron Saw Min Sern
> (<aaronsms at u.nus.edu>) wrote:
> 
>> Hello everyone,
>> 
>> Here is the final report for my GSoC 2021 project, _Parallelization of
>> Raster Modules for GRASS GIS_.
>> 
>> ABSTRACT
>> The goal of this project is to introduce parallelization to
>> existing raster modules in GRASS GIS using OpenMP. This will allow
>> users to take advantage of more cores in their hardware to reduce
>> computation time, especially for large raster files with a high
>> computation cost. The key challenge of this project is to separate
>> the parallelizable components from the sequential part of the
>> modules without introducing too much overhead in terms of memory,
>> disk or computation resources.
>> 
>> MILESTONES
>> 
>> In total, I have introduced OpenMP support to 8 raster modules in
>> GRASS GIS. The pull requests to each module are as follows:
>> 
>> * r.univar - https://github.com/OSGeo/grass/pull/1634 [1]
>> * r.neighbors - https://github.com/OSGeo/grass/pull/1724 [2]
>> * r.mfilter - https://github.com/OSGeo/grass/pull/1708 [3]
>> * r.resamp.filter - https://github.com/OSGeo/grass/pull/1759 [4]
>> * r.resamp.interp - https://github.com/OSGeo/grass/pull/1771 [5]
>> * r.slope.aspect - https://github.com/OSGeo/grass/pull/1767 [6]
>> * r.series - https://github.com/OSGeo/grass/pull/1776 [7]
>> * r.patch - https://github.com/OSGeo/grass/pull/1782 [8]
>> 
>> Firstly, I greatly underestimated the complexity of the work.
>> Up to 20 modules were initially proposed, but after the second
>> week it became clear that we had to cut down on the number of
>> target modules and focus more on designing the algorithms.
>> The modules we targeted behave differently from some modules
>> that had received OpenMP support in the past, such as _r.sun_.
>> In particular, the modules need to keep their low memory
>> footprint even after the parallelization, unlike _r.sun_, which
>> loads the entire raster map in-memory.
>> 
>> During the first half of the GSoC, through discussion with the
>> mentors, we came up with three different approaches to introducing
>> parallel support to _r.neighbors_. After benchmarking their
>> performance and taking account of their memory/disk usage, we
>> decided to settle on the last approach, which requires us to add an
>> extra parameter _memory_ to allow users to adjust their memory
>> consumption. With this approach, we have to allow the modules to
>> process the raster map in chunks. Once we had settled on the design,
>> we started applying the same approach to other similar modules with
>> low memory footprints.
>> 
>> For more information regarding the implementation, see
>> https://grasswiki.osgeo.org/wiki/Raster_Parallelization_with_OpenMP
>> [9].
>> 
>> Furthermore, test scripts were included in the modules to ensure the
>> consistency of the results. Benchmark scripts were added to allow
>> users to easily benchmark the performance of the parallelization and
>> monitor the speedup on their own local machine. User documentation
>> was also modified to include sections detailing how to make use of
>> the newly added features.
>> 
>> FUTURE WORK
>> 
>> In the future, more raster modules can be parallelized using a similar
>> approach. Then, we can consider tackling more complex modules such
>> as _r.watershed_ and _r.mapcalc_. We could also consider exploring
>> 3D raster modules.
>> 
>> Furthermore, when we implemented parallelization for _r.univar_, we
>> noticed that modules that produce statistics involving arithmetic can
>> often have floating-point discrepancies when dealing with large
>> summations. Because of this, computation using a different number of
>> threads will now produce different results due to a different
>> order of arithmetic. One idea would be to introduce the _Kahan
>> summation algorithm_ to reduce the floating-point discrepancies.
>> However, this still would not guarantee the consistency of results.
>> 
>> PERMANENT LINKS
>> 
>> For the project overview, please visit
>> https://summerofcode.withgoogle.com/dashboard/project/6280792767987712/overview/
>> [10].
>> For the project timeline and logs, please visit
>> https://trac.osgeo.org/grass/wiki/GSoC/2021/RasterParallelization
>> [11].
>> 
>> I would like to give huge thanks to Huidae Cho, Vaclav Petras and
>> Māris Nartišs for their valuable guidance. I would also like to thank
>> the GRASS community for the valuable feedback and support. Lastly, I
>> would like to thank the GSoC team for this opportunity.
>> 
>> Thanks all!
>> 
>> Warmest regards,
>> Aaron Saw Min Sern
>> 
>> _______________________________________________
>> grass-dev mailing list
>> grass-dev at lists.osgeo.org
>> https://lists.osgeo.org/mailman/listinfo/grass-dev [12]
> 
> 
> Links:
> ------
> [1] https://github.com/OSGeo/grass/pull/1634
> [2] https://github.com/OSGeo/grass/pull/1724
> [3] https://github.com/OSGeo/grass/pull/1708
> [4] https://github.com/OSGeo/grass/pull/1759
> [5] https://github.com/OSGeo/grass/pull/1771
> [6] https://github.com/OSGeo/grass/pull/1767
> [7] https://github.com/OSGeo/grass/pull/1776
> [8] https://github.com/OSGeo/grass/pull/1782
> [9] https://grasswiki.osgeo.org/wiki/Raster_Parallelization_with_OpenMP
> [10] https://summerofcode.withgoogle.com/dashboard/project/6280792767987712/overview/
> [11] https://trac.osgeo.org/grass/wiki/GSoC/2021/RasterParallelization
> [12] https://lists.osgeo.org/mailman/listinfo/grass-dev
> 
> _______________________________________________
> grass-dev mailing list
> grass-dev at lists.osgeo.org
> https://lists.osgeo.org/mailman/listinfo/grass-dev
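
A note on the floating-point discrepancy mentioned under FUTURE WORK in
the report quoted above: floating-point addition is not associative, so
partial sums combined in a different order (as happens with a different
number of threads) can round differently. The following is a minimal C
sketch of that effect and of Kahan (compensated) summation. It is not
code from r.univar or from any of the PRs; the function names naive_sum
and kahan_sum are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Plain left-to-right summation: the rounding error depends on the
 * order in which the values are added. */
double naive_sum(const double *x, size_t n)
{
    double sum = 0.0;

    for (size_t i = 0; i < n; i++)
        sum += x[i];
    return sum;
}

/* Kahan (compensated) summation: carries the low-order bits lost in
 * each addition in a separate correction term c. */
double kahan_sum(const double *x, size_t n)
{
    assert(n == 0 || x != NULL);

    double sum = 0.0;
    double c = 0.0; /* running compensation for lost low-order bits */

    for (size_t i = 0; i < n; i++) {
        double y = x[i] - c; /* apply the correction from the last step */
        double t = sum + y;  /* low-order bits of y may be lost here */

        c = (t - sum) - y;   /* recover what was lost (negated) */
        sum = t;
    }
    return sum;
}
```

Assuming IEEE 754 double arithmetic, summing 1.0 followed by ten values
of 1e-16 with naive_sum gives exactly 1.0, while summing the same values
in reverse order gives a result above 1.0. kahan_sum keeps the small
terms in the forward order as well. As the report says, this narrows the
discrepancy but does not guarantee identical results across thread
counts. Compiler options such as -ffast-math may optimize the
compensation away.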

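On the _memory_ parameter and chunk-wise processing described in the
report quoted above: the sketch below shows one way a memory budget
could be turned into a number of rows per chunk, with the rows of each
chunk then processed in parallel. This is an illustration under stated
assumptions, not the implementation from the PRs. The helper names, the
budget being given in MiB, the in-memory arrays standing in for raster
I/O, and the "multiply by 2" operation are all hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: how many raster rows fit in a memory budget
 * (assumed to be given in MiB), clamped to the range [1, nrows]. */
size_t rows_per_chunk(size_t memory_mib, size_t ncols, size_t cell_bytes,
                      size_t nrows)
{
    assert(ncols > 0 && cell_bytes > 0);

    size_t row_bytes = ncols * cell_bytes;
    size_t rows = (memory_mib * 1024 * 1024) / row_bytes;

    if (rows > nrows)
        rows = nrows;
    if (rows < 1)
        rows = 1; /* always make progress, even with a tiny budget */
    return rows;
}

/* Hypothetical helper: number of chunks needed to cover all rows. */
size_t chunk_count(size_t nrows, size_t chunk_rows)
{
    assert(chunk_rows > 0);
    return (nrows + chunk_rows - 1) / chunk_rows;
}

/* Process a row-major raster chunk by chunk. Chunks are handled one
 * after another; rows inside a chunk are independent and are shared
 * among threads. Without OpenMP enabled the pragma is ignored and the
 * loop runs sequentially. */
void process_in_chunks(const double *in, double *out, size_t nrows,
                       size_t ncols, size_t chunk_rows)
{
    assert(chunk_rows > 0);

    for (size_t start = 0; start < nrows; start += chunk_rows) {
        size_t end = start + chunk_rows;

        if (end > nrows)
            end = nrows;

#pragma omp parallel for
        for (long row = (long)start; row < (long)end; row++) {
            for (size_t col = 0; col < ncols; col++) {
                size_t i = (size_t)row * ncols + col;

                out[i] = 2.0 * in[i]; /* placeholder per-cell operation */
            }
        }
    }
}
```

For example, under these assumptions a 1 MiB budget with 1024 columns of
8-byte cells gives 128 rows per chunk, so a 10000-row map is covered in
79 chunks. A larger budget means fewer, larger chunks at the cost of
more memory.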

