[GRASS-dev] [SoC] GSoC 2021 - Final Report [Parallelization of raster modules for GRASS GIS]

Moritz Lennert mlennert at club.worldonline.be
Tue Sep 14 11:52:46 PDT 2021


Hi Aaron,

Works wonderfully, thanks! I've had some issues when using a mask, and
errors related to ZSTD compression, but I have no idea whether they are
linked to your code, and no time today to follow up.

Moritz

On 13.09.2021 17:43, Aaron Saw Min Sern wrote:
> Hi Moritz,
> 
> Yes, you can just apply the PR, either by checking out the branch or,
> on your local machine, by rebasing the branch on top of the current
> main branch (I don't think there will be any conflicts). Let me know if
> there are any issues :)
> 
> Warm regards,
> Aaron
> 
> -------------------------
> 
> FROM: Moritz Lennert <mlennert at club.worldonline.be>
>  SENT: Monday, 13 September 2021, 23:33
>  TO: Aaron Saw Min Sern
>  CC: grass-dev at lists.osgeo.org
>  SUBJECT: Re: [GRASS-dev] [SoC] GSoC 2021 - Final Report
> [Parallelization of raster modules for GRASS GIS]
> 
>  Hi Aaron,
> 
>  I have a use case for which I would like to try the parallelized
>  r.neighbors. Do I just have to apply PR 1724 to current main to be
>  able to do this, or do I have to do something else?
> 
>  Moritz
> 
>  On 25.08.2021 15:59, Aaron Saw Min Sern wrote:
>  > Hi all,
>  >
>  > Thanks Vero for spotting the mistakes in the links. The formatting of
>  > the links must have gone wrong, but here are the links to the
>  > respective PRs.
>  >
>  > * r.univar - 1634 [1]
>  > * r.neighbors - 1724 [2]
>  > * r.mfilter - 1708 [3]
>  > * r.resamp.filter - 1759 [4]
>  > * r.resamp.interp - 1771 [5]
>  > * r.slope.aspect - 1767 [6]
>  > * r.series - 1776 [7]
>  > * r.patch - 1782 [8]
>  >
>  > I will still be working on getting the checklists completed in the
>  > next few weeks.
>  >
>  > Warmest regards,
>  > Aaron
>  >
>  > -------------------------
>  >
>  > FROM: Veronica Andreo <veroandreo at gmail.com>
>  > SENT: Tuesday, August 24, 2021 8:48 PM
>  > TO: Aaron Saw Min Sern <aaronsms at u.nus.edu>
>  > CC: soc at lists.osgeo.org <soc at lists.osgeo.org>;
>  > grass-dev at lists.osgeo.org <grass-dev at lists.osgeo.org>
>  > SUBJECT: Re: [GRASS-dev] [SoC] GSoC 2021 - Final Report
>  > [Parallelization of raster modules for GRASS GIS]
>  >
>  > Dear Aaron,
>  >
>  > Thanks for your report and all your work to make GRASS raster modules
>  > run faster, we like that :) Huge thanks as well to Vaclav, Huidae and
>  > Maris for their commitment to the project and mentoring! You all did
>  > great teamwork!
>  >
>  > One question: are these changes planned to be merged before the
>  > creation of grass 8 branch or will they remain for 8+? Maybe add a
>  > milestone to clarify?
>  >
>  > One minor observation: all links currently point to the same PR
>  >
>  > We hope to keep seeing you around, Aaron!
>  >
>  > All the best,
>  > Vero
>  >
>  > On Tue, 24 Aug 2021 at 3:18, Aaron Saw Min Sern
>  > (<aaronsms at u.nus.edu>) wrote:
>  >
>  >> Hello everyone,
>  >>
>  >> Here is my final report for my GSoC 2021 project, _Parallelization of
>  >> Raster Modules for GRASS GIS_.
>  >>
>  >> ABSTRACT
>  >> The goal of this project is to introduce parallelization to
>  >> existing raster modules in GRASS GIS using OpenMP. This will allow
>  >> users to take advantage of more cores in their hardware to reduce
>  >> computation time, especially for large raster files with high
>  >> computation cost. The key challenge of this project is to separate
>  >> the parallelizable components from the sequential part of the
>  >> modules without introducing too much overhead in terms of memory,
>  >> disk or computation resources.
>  >>
>  >> MILESTONES
>  >>
>  >> In total, I have introduced OpenMP support to 8 raster modules in
>  >> GRASS GIS. The pull requests to each module are as follows:
>  >>
>  >> * r.univar - https://github.com/OSGeo/grass/pull/1634
>  >> * r.neighbors - https://github.com/OSGeo/grass/pull/1724
>  >> * r.mfilter - https://github.com/OSGeo/grass/pull/1708
>  >> * r.resamp.filter - https://github.com/OSGeo/grass/pull/1759
>  >> * r.resamp.interp - https://github.com/OSGeo/grass/pull/1771
>  >> * r.slope.aspect - https://github.com/OSGeo/grass/pull/1767
>  >> * r.series - https://github.com/OSGeo/grass/pull/1776
>  >> * r.patch - https://github.com/OSGeo/grass/pull/1782
>  >>
>  >> Firstly, I greatly underestimated the complexity of the work.
>  >> Up to 20 modules were initially proposed, but after the second
>  >> week it became clear that we had to cut down on the number of
>  >> target modules and focus more on designing the algorithms.
>  >> The modules we targeted behave differently from some modules
>  >> that had received OpenMP support in the past, such as _r.sun_.
>  >> In particular, the modules need to keep the same behavior of
>  >> having a low memory footprint even after the parallelization,
>  >> unlike _r.sun_, which loads the entire raster map in memory.
>  >>
>  >> During the first half of GSoC, through discussion with the mentors,
>  >> we came up with three different approaches to introducing
>  >> parallel support to _r.neighbors_. After benchmarking their
>  >> performance and taking into account their memory/disk usage, we
>  >> decided to settle on the last approach, which requires us to add an
>  >> extra parameter _memory_ to allow users to adjust their memory
>  >> consumption. With this approach, we have to allow the modules to
>  >> process the raster map in chunks. Once we had settled on the design,
>  >> we started applying the same approach to other similar modules with
>  >> low memory footprints.
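
The chunked design described above can be illustrated with a minimal Python sketch. This is not the code from the pull requests (the modules are written in C with OpenMP); the function names, the `memory_mb` and `nprocs` arguments, the 8 bytes per cell, and the per-row sum standing in for the real computation are all assumptions made for illustration. It only shows the general idea: a memory budget fixes how many rows are held at once, and the rows of each chunk are then split among workers.

```python
from concurrent.futures import ThreadPoolExecutor


def plan_chunks(nrows, ncols, memory_mb, bytes_per_cell=8):
    """Split nrows into (start, end) row ranges that each fit the budget.

    memory_mb is an integer number of megabytes (hypothetical parameter).
    """
    budget_bytes = memory_mb * 1024 * 1024
    rows_per_chunk = max(1, budget_bytes // (ncols * bytes_per_cell))
    return [(start, min(start + rows_per_chunk, nrows))
            for start in range(0, nrows, rows_per_chunk)]


def process_rows(raster, start, end):
    """Stand-in computation: the sum of each row in [start, end)."""
    return [sum(row) for row in raster[start:end]]


def process_raster(raster, memory_mb, nprocs=4):
    """Process the raster chunk by chunk, sharing each chunk among workers."""
    nrows, ncols = len(raster), len(raster[0])
    out = []
    with ThreadPoolExecutor(max_workers=nprocs) as pool:
        # Chunks are handled one after another, so only one chunk's worth
        # of rows is being worked on at any time.
        for start, end in plan_chunks(nrows, ncols, memory_mb):
            step = max(1, (end - start + nprocs - 1) // nprocs)
            parts = [(s, min(s + step, end)) for s in range(start, end, step)]
            # pool.map returns results in input order, so the output rows
            # stay in raster order regardless of which worker finishes first.
            for rows in pool.map(lambda p: process_rows(raster, *p), parts):
                out.extend(rows)
    return out
```

A larger budget means fewer, bigger chunks and less bookkeeping, at the cost of more memory; a smaller budget keeps the footprint low, which is the trade-off the _memory_ parameter exposes.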
>  >>
>  >> For more information regarding the implementation, see
>  >> https://grasswiki.osgeo.org/wiki/Raster_Parallelization_with_OpenMP.
>  >>
>  >> Furthermore, test scripts were included in the modules to ensure the
>  >> consistency of the results. Benchmark scripts were added to allow
>  >> users to easily benchmark the performance of the parallelization and
>  >> monitor the speedup on their own local machine. User documentation
>  >> was also modified to include sections detailing how to make use of
>  >> the newly added features.
>  >>
>  >> FUTURE WORK
>  >>
>  >> In the future, more raster modules can be parallelized using a similar
>  >> approach. Then, we can consider tackling more complex modules such
>  >> as _r.watershed_ and _r.mapcalc_. We could also consider exploring
>  >> 3D raster modules.
>  >>
>  >> Furthermore, when we implemented parallelization for _r.univar_, we
>  >> noticed that modules that produce statistics involving arithmetic can
>  >> often have floating-point discrepancies when dealing with large
>  >> summations. Because of this, computation using different numbers of
>  >> threads will now produce different results, since the order of the
>  >> arithmetic differs. One idea would be to introduce the _Kahan summation
>  >> algorithm_ to reduce the floating-point discrepancies. However, this
>  >> still would not guarantee the consistency of results.
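
The effect described above can be reproduced with a small Python sketch. It is an illustration only, not code from _r.univar_: `chunked_sum` is a hypothetical helper that mimics a parallel reduction by summing slices separately and then combining the partial sums, and the input of repeated 0.1 values is an arbitrary choice. `math.fsum` from the standard library serves as the correctly rounded reference.

```python
import math


def naive_sum(values):
    """Plain left-to-right floating-point accumulation."""
    total = 0.0
    for v in values:
        total += v
    return total


def kahan_sum(values):
    """Kahan (compensated) summation."""
    total = 0.0
    comp = 0.0  # running estimate of the low-order bits lost so far
    for v in values:
        y = v - comp            # apply the correction to the next term
        t = total + y           # low-order bits of y may be lost here
        comp = (t - total) - y  # recover what was lost
        total = t
    return total


def chunked_sum(values, nchunks, summer):
    """Mimic a parallel reduction: sum nchunks slices, then combine them."""
    size = (len(values) + nchunks - 1) // nchunks
    partials = [summer(values[i:i + size])
                for i in range(0, len(values), size)]
    return summer(partials)


values = [0.1] * 100_000
exact = math.fsum(values)  # correctly rounded reference sum

# Print the error of each method for several simulated "thread counts".
for nchunks in (1, 2, 4, 8):
    naive_err = chunked_sum(values, nchunks, naive_sum) - exact
    kahan_err = chunked_sum(values, nchunks, kahan_sum) - exact
    print(nchunks, naive_err, kahan_err)
```

Running this shows how the error of each method changes with the number of slices. Compensated summation shrinks the error, but the result can still depend on how the data is partitioned, which matches the caveat above that it would not guarantee consistent results.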
>  >>
>  >> PERMANENT LINKS
>  >>
>  >> For the project overview, please visit
>  >> https://summerofcode.withgoogle.com/dashboard/project/6280792767987712/overview/.
>  >> For the project timeline and logs, please visit
>  >> https://trac.osgeo.org/grass/wiki/GSoC/2021/RasterParallelization.
>  >>
>  >> I would like to give huge thanks to Huidae Cho, Vaclav Petras and
>  >> Māris Nartišs for their valuable guidance. I would also like to thank
>  >> the GRASS community for the valuable feedback and support. Lastly, I
>  >> would like to thank the GSoC team for this opportunity.
>  >>
>  >> Thanks all!
>  >>
>  >> Warmest regards,
>  >> Aaron Saw Min Sern
>  >>
>  >> _______________________________________________
>  >> grass-dev mailing list
>  >> grass-dev at lists.osgeo.org
>  >> https://lists.osgeo.org/mailman/listinfo/grass-dev
>  >
>  >
>  > Links:
>  > ------
>  > [1] https://github.com/OSGeo/grass/pull/1634
>  > [2] https://github.com/OSGeo/grass/pull/1724
>  > [3] https://github.com/OSGeo/grass/pull/1708
>  > [4] https://github.com/OSGeo/grass/pull/1759
>  > [5] https://github.com/OSGeo/grass/pull/1771
>  > [6] https://github.com/OSGeo/grass/pull/1767
>  > [7] https://github.com/OSGeo/grass/pull/1776
>  > [8] https://github.com/OSGeo/grass/pull/1782
>  > [9] https://grasswiki.osgeo.org/wiki/Raster_Parallelization_with_OpenMP
>  > [10] https://summerofcode.withgoogle.com/dashboard/project/6280792767987712/overview/
>  > [11] https://trac.osgeo.org/grass/wiki/GSoC/2021/RasterParallelization
>  > [12] https://lists.osgeo.org/mailman/listinfo/grass-dev
> 
> 
> 
> Links:
> ------
> [1] https://github.com/OSGeo/grass/pull/1634
> [2] https://github.com/OSGeo/grass/pull/
> [3] https://grasswiki.osgeo.org/wiki/Raster_Parallelization_with_OpenMP
> [4] https://summerofcode.withgoogle.com/dashboard/project/6280792767987712/overview
> [5] https://trac.osgeo.org/grass/wiki/GSoC/2021/RasterParallelization
> [6] https://lists.osgeo.org/mailman/listinfo/grass-dev
> [7] https://github.com/OSGeo/grass/pull/1724
> [8] https://github.com/OSGeo/grass/pull/1708
> [9] https://github.com/OSGeo/grass/pull/1759
> [10] https://github.com/OSGeo/grass/pull/1771
> [11] https://github.com/OSGeo/grass/pull/1767
> [12] https://github.com/OSGeo/grass/pull/1776
> [13] https://github.com/OSGeo/grass/pull/1782
> [14] https://summerofcode.withgoogle.com/dashboard/project/6280792767987712/overview/


