[GRASS-dev] [SoC] GSoC 2021 - Final Report [Parallelization of raster modules for GRASS GIS]
Moritz Lennert
mlennert at club.worldonline.be
Tue Sep 14 11:52:46 PDT 2021
Hi Aaron,
Works wonderfully, thanks! I've had some issues when using a mask, and
errors related to ZSTD compression, but I have no idea whether they are
linked to your code, and no time today to follow up.
Moritz
On 13.09.2021 17:43, Aaron Saw Min Sern wrote:
> Hi Moritz,
>
> Yes, you can just apply the PR, either by checking out the branch or,
> on your local machine, by rebasing the branch on top of the current
> main branch (I don't think there will be any conflicts). Let me know if
> there are any issues :)
>
> Warm regards,
> Aaron
>
> -------------------------
>
> FROM: Moritz Lennert <mlennert at club.worldonline.be>
> SENT: Monday, 13 September 2021, 23:33
> TO: Aaron Saw Min Sern
> CC: grass-dev at lists.osgeo.org
> SUBJECT: Re: [GRASS-dev] [SoC] GSoC 2021 - Final Report
> [Parallelization of raster modules for GRASS GIS]
>
> Hi Aaron,
>
> I have a use case for which I would like to try the parallelized
> r.neighbors. Do I just have to apply PR 1724 to current main to be
> able to do this, or do I have to do something else?
>
> Moritz
>
> On 25.08.2021 15:59, Aaron Saw Min Sern wrote:
> > Hi all,
> >
> > Thanks Vero for spotting the mistakes in the links. The formatting of
> > the links must have gone wrong, but here are the links to the
> > respective PRs.
> >
> > * r.univar - 1634 [1]
> > * r.neighbors - 1724 [2]
> > * r.mfilter - 1708 [3]
> > * r.resamp.filter - 1759 [4]
> > * r.resamp.interp - 1771 [5]
> > * r.slope.aspect - 1767 [6]
> > * r.series - 1776 [7]
> > * r.patch - 1782 [8]
> >
> > I will still be working on getting the checklists completed in the
> > next few weeks.
> >
> > Warmest regards,
> > Aaron
> >
> > -------------------------
> >
> > FROM: Veronica Andreo <veroandreo at gmail.com>
> > SENT: Tuesday, August 24, 2021 8:48 PM
> > TO: Aaron Saw Min Sern <aaronsms at u.nus.edu>
> > CC: soc at lists.osgeo.org <soc at lists.osgeo.org>;
> > grass-dev at lists.osgeo.org <grass-dev at lists.osgeo.org>
> > SUBJECT: Re: [GRASS-dev] [SoC] GSoC 2021 - Final Report
> > [Parallelization of raster modules for GRASS GIS]
> >
> > Dear Aaron,
> >
> > Thanks for your report and all your work to make GRASS raster modules
> > run faster, we like that :) Huge thanks as well to Vaclav, Huidae and
> > Maris for their commitment to the project and mentoring! You all
> > did great teamwork!
> >
> > One question: are these changes planned to be merged before the
> > creation of grass 8 branch or will they remain for 8+? Maybe add a
> > milestone to clarify?
> >
> > One minor observation: all links currently point to the same PR
> >
> > We hope to keep seeing you around, Aaron!
> >
> > All the best,
> > Vero
> >
> > On Tue, 24 Aug 2021 at 3:18, Aaron Saw Min Sern
> > (<aaronsms at u.nus.edu>) wrote:
> >
> >> Hello everyone,
> >>
> >> Here is my final report for the GSoC 2021 project, _Parallelization
> >> of Raster Modules for GRASS GIS_.
> >>
> >> ABSTRACT
> >> The goal of this project is to introduce parallelization to
> >> existing raster modules in GRASS GIS using OpenMP. This will allow
> >> users to take advantage of more cores in their hardware to reduce
> >> computation time, especially for large raster files with a high
> >> computation cost. The key challenge of this project is to separate
> >> the parallelizable components from the sequential part of the
> >> modules without introducing too much overhead in terms of memory,
> >> disk or computation resources.
> >>
> >> MILESTONES
> >>
> >> In total, I have introduced OpenMP support to 8 raster modules in
> >> GRASS GIS. The pull requests to each module are as follows:
> >>
> >> * r.univar - https://github.com/OSGeo/grass/pull/1634 [1]
> >> * r.neighbors - https://github.com/OSGeo/grass/pull/1724 [2]
> >> * r.mfilter - https://github.com/OSGeo/grass/pull/1708 [3]
> >> * r.resamp.filter - https://github.com/OSGeo/grass/pull/1759 [4]
> >> * r.resamp.interp - https://github.com/OSGeo/grass/pull/1771 [5]
> >> * r.slope.aspect - https://github.com/OSGeo/grass/pull/1767 [6]
> >> * r.series - https://github.com/OSGeo/grass/pull/1776 [7]
> >> * r.patch - https://github.com/OSGeo/grass/pull/1782 [8]
> >>
> >> Firstly, I greatly underestimated the complexity of the work.
> >> Up to 20 modules were initially proposed, but after the second
> >> week it became clear that we had to cut down on the number of
> >> target modules and focus more on designing the algorithms.
> >> The modules we targeted behave differently from some modules
> >> that had received OpenMP support in the past, such as _r.sun_.
> >> In particular, the modules need to keep the same low memory
> >> footprint even after the parallelization, unlike _r.sun_, which
> >> loads the entire raster map into memory.
> >>
> >> During the first half of GSoC, through discussion with the mentors,
> >> we came up with three different approaches to introducing
> >> parallel support to _r.neighbors_. After benchmarking their
> >> performance and taking account of their memory/disk usage, we
> >> decided to settle on the last approach, which requires us to add an
> >> extra parameter, _memory_, to allow users to adjust their memory
> >> consumption. With this approach, the modules have to process the
> >> raster map in chunks. Once we had settled on the design, we started
> >> applying the same approach to other similar modules with low
> >> memory footprints.
> >>
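The chunked approach described above might look roughly like the following sketch. It is not the code from the pull requests: the names rows_per_chunk and mean3x3_chunked are hypothetical, the raster is a plain in-memory array standing in for a map on disk, and the memory budget is given in bytes rather than through a module parameter. The idea it illustrates is that only one chunk of rows (plus one halo row on each side) is held in the working buffers, reading and writing stay sequential, and OpenMP is applied only to the per-row computation inside a chunk.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: number of rows that fit in the memory budget,
 * clamped to the range 1..nrows. */
static int rows_per_chunk(size_t budget_bytes, int nrows, int ncols)
{
    size_t rows = budget_bytes / ((size_t)ncols * sizeof(double));
    if (rows < 1)
        rows = 1;
    if (rows > (size_t)nrows)
        rows = (size_t)nrows;
    return (int)rows;
}

/* 3x3 mean with edges clamped, computed chunk by chunk.
 * src stands in for the input map, dst for the output map. */
static void mean3x3_chunked(const double *src, double *dst,
                            int nrows, int ncols, size_t budget_bytes)
{
    assert(nrows > 0 && ncols > 0);
    int chunk = rows_per_chunk(budget_bytes, nrows, ncols);
    /* chunk rows plus one halo row above and one below */
    double *in = malloc((size_t)(chunk + 2) * ncols * sizeof(double));
    double *out = malloc((size_t)chunk * ncols * sizeof(double));
    if (!in || !out) {
        fprintf(stderr, "out of memory\n");
        exit(EXIT_FAILURE);
    }

    for (int start = 0; start < nrows; start += chunk) {
        int n = (start + chunk <= nrows) ? chunk : nrows - start;

        /* Sequential "read": rows start-1 .. start+n, clamped at the edges. */
        for (int i = 0; i < n + 2; i++) {
            int r = start - 1 + i;
            if (r < 0)
                r = 0;
            if (r >= nrows)
                r = nrows - 1;
            memcpy(in + (size_t)i * ncols, src + (size_t)r * ncols,
                   (size_t)ncols * sizeof(double));
        }

        /* Parallel compute: rows within the chunk are independent. */
#pragma omp parallel for schedule(static)
        for (int i = 0; i < n; i++) {
            for (int c = 0; c < ncols; c++) {
                double sum = 0.0;
                for (int dr = 0; dr <= 2; dr++) {
                    for (int dc = -1; dc <= 1; dc++) {
                        int cc = c + dc;
                        if (cc < 0)
                            cc = 0;
                        if (cc >= ncols)
                            cc = ncols - 1;
                        sum += in[(size_t)(i + dr) * ncols + cc];
                    }
                }
                out[(size_t)i * ncols + c] = sum / 9.0;
            }
        }

        /* Sequential "write" of the finished chunk. */
        memcpy(dst + (size_t)start * ncols, out,
               (size_t)n * ncols * sizeof(double));
    }
    free(in);
    free(out);
}
```

Because each output cell is summed in a fixed order, the result in this sketch does not depend on the chunk size or on the number of threads; a smaller budget only means more, smaller chunks. Without OpenMP enabled at compile time the pragma is ignored and the loop simply runs serially.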
> >> For more information regarding the implementation, see
> >> https://grasswiki.osgeo.org/wiki/Raster_Parallelization_with_OpenMP [9].
> >>
> >> Furthermore, test scripts were included in the modules to ensure the
> >> consistency of the results. Benchmark scripts were added to allow
> >> users to easily benchmark the parallelization and monitor the
> >> speedup on their own local machine. The user documentation was
> >> also modified to include sections detailing how to make use of
> >> the newly added features.
> >>
> >> FUTURE WORK
> >>
> >> In the future, more raster modules can be parallelized using a
> >> similar approach. Then, we can consider tackling more complex
> >> modules such as _r.watershed_ and _r.mapcalc_. We could also
> >> consider exploring 3D raster modules.
> >>
> >> Furthermore, when we implemented parallelization for _r.univar_, we
> >> noticed that modules that produce statistics involving arithmetic
> >> can often show floating-point discrepancies when dealing with large
> >> summations. Because of this, computation using a different number of
> >> threads can now produce different results, since the order of the
> >> arithmetic differs. One idea would be to introduce the _Kahan
> >> summation algorithm_ to reduce the floating-point discrepancies.
> >> However, this still would not guarantee the consistency of results.
> >>
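To make the summation issue above concrete, here is a small sketch, not taken from r.univar. The function names are hypothetical, and the "threads" are modelled as contiguous slices whose partial sums are combined in slice order, so the outcome is reproducible. It assumes IEEE 754 doubles with round-to-nearest-even and a compiler that does not reorder floating-point arithmetic (for example, no -ffast-math, which can optimise the compensation away).

```c
#include <assert.h>
#include <stddef.h>

/* Plain left-to-right sum. */
static double naive_sum(const double *x, size_t n)
{
    double sum = 0.0;
    for (size_t i = 0; i < n; i++)
        sum += x[i];
    return sum;
}

/* Kahan (compensated) sum: comp carries the rounding error of the
 * previous addition and feeds it back into the next one. */
static double kahan_sum(const double *x, size_t n)
{
    double sum = 0.0, comp = 0.0;
    for (size_t i = 0; i < n; i++) {
        double y = x[i] - comp;
        double t = sum + y;
        comp = (t - sum) - y;
        sum = t;
    }
    return sum;
}

/* Sum x in nparts contiguous slices (one per simulated thread) and
 * combine the partial sums in slice order. */
static double chunked_sum(const double *x, size_t n, int nparts,
                          int use_kahan)
{
    assert(nparts > 0 && nparts <= 64);
    double partial[64];

#pragma omp parallel for
    for (int p = 0; p < nparts; p++) {
        size_t lo = n * (size_t)p / (size_t)nparts;
        size_t hi = n * (size_t)(p + 1) / (size_t)nparts;
        partial[p] = use_kahan ? kahan_sum(x + lo, hi - lo)
                               : naive_sum(x + lo, hi - lo);
    }
    return use_kahan ? kahan_sum(partial, (size_t)nparts)
                     : naive_sum(partial, (size_t)nparts);
}
```

For the input {1e16, 1, 1, 1, 1}, the exact sum is 1e16 + 4. A single naive pass returns 1e16, because each added 1 is rounded away, while splitting the same input into two slices returns 1e16 + 4. The Kahan variant returns 1e16 + 4 in both cases here, but the compensation term of each slice is discarded when the partial sums are combined, so agreement across slice counts is not guaranteed in general.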
> >> PERMANENT LINKS
> >>
> >> For the project overview, please visit
> >> https://summerofcode.withgoogle.com/dashboard/project/6280792767987712/overview/ [10].
> >> For the project timeline and logs, please visit
> >> https://trac.osgeo.org/grass/wiki/GSoC/2021/RasterParallelization [11].
> >>
> >> I would like to say a huge thanks to Huidae Cho, Vaclav Petras and
> >> Māris Nartišs for their valuable guidance. I would also like to
> >> thank the GRASS community for the valuable feedback and support.
> >> Lastly, I would like to thank the GSoC team for this opportunity.
> >>
> >> Thanks all!
> >>
> >> Warmest regards,
> >> Aaron Saw Min Sern
> >>
> >> _______________________________________________
> >> grass-dev mailing list
> >> grass-dev at lists.osgeo.org
> >> https://lists.osgeo.org/mailman/listinfo/grass-dev [12]
> >
> >
> > Links:
> > ------
> > [1] https://github.com/OSGeo/grass/pull/1634
> > [2] https://github.com/OSGeo/grass/pull/1724
> > [3] https://github.com/OSGeo/grass/pull/1708
> > [4] https://github.com/OSGeo/grass/pull/1759
> > [5] https://github.com/OSGeo/grass/pull/1771
> > [6] https://github.com/OSGeo/grass/pull/1767
> > [7] https://github.com/OSGeo/grass/pull/1776
> > [8] https://github.com/OSGeo/grass/pull/1782
> > [9] https://grasswiki.osgeo.org/wiki/Raster_Parallelization_with_OpenMP
> > [10] https://summerofcode.withgoogle.com/dashboard/project/6280792767987712/overview/
> > [11] https://trac.osgeo.org/grass/wiki/GSoC/2021/RasterParallelization
> > [12] https://lists.osgeo.org/mailman/listinfo/grass-dev
> >
>