[GRASS-user] Global overview of GRASS HPC resources/installations ?

Massi Alvioli nocharge at gmail.com
Thu May 24 04:12:54 PDT 2018


Even if you can tile up your problem - which probably covers 95% of the
parallelization one can do in GRASS - you still have to instantiate the
cloud machines, copy the data to the multiple instances, gather back the
results and patch them into your final result (the order of the last two
steps is your choice). I expect all of these operations to be much slower
going through the cloud than on any other architecture. The overall
processing time is what matters, from importing the initial data to having
the final result available. Of course, if the cloud is the only viable way
of getting multiple cores, there is no way out. It is also true that
everybody owns a couple of desktop machines with a few tens of computing
cores overall ...
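
To make the bookkeeping concrete, here is a minimal sketch in plain Python
(no GRASS calls) of the two things discussed above: splitting a raster into
tiles, and adding up the end-to-end time including startup, copying and
patching. The names make_tiles and wall_time, the cost model (tiles run in
waves, transfers are serialized) and all the numbers are hypothetical
placeholders of mine, not measurements from any real system.

```python
from dataclasses import dataclass
from math import ceil


@dataclass(frozen=True)
class Tile:
    """Half-open row/column bounds of one tile: [row0, row1) x [col0, col1)."""
    row0: int
    row1: int
    col0: int
    col1: int


def make_tiles(rows, cols, tile_rows, tile_cols):
    """Split a rows x cols raster into rectangular tiles; edge tiles may be smaller."""
    tiles = []
    for r in range(0, rows, tile_rows):
        for c in range(0, cols, tile_cols):
            tiles.append(Tile(r, min(r + tile_rows, rows),
                              c, min(c + tile_cols, cols)))
    return tiles


def wall_time(n_tiles, workers, compute_per_tile,
              startup=0.0, transfer_per_tile=0.0, patch=0.0):
    """Hypothetical end-to-end time in seconds.

    Assumes tiles are processed in waves of `workers`, and that each tile is
    transferred twice (copy out, gather back) one after the other.
    """
    waves = ceil(n_tiles / workers)
    return (startup
            + 2 * n_tiles * transfer_per_tile
            + waves * compute_per_tile
            + patch)


# Illustrative comparison with made-up numbers (seconds).
tiles = make_tiles(1000, 1000, 250, 250)
local = wall_time(len(tiles), workers=8, compute_per_tile=60.0, patch=10.0)
cloud = wall_time(len(tiles), workers=16, compute_per_tile=60.0,
                  startup=120.0, transfer_per_tile=15.0, patch=10.0)
print(f"{len(tiles)} tiles: local {local:.0f} s, cloud {cloud:.0f} s")
```

With these placeholder values the cloud run has more workers but still
loses on total time, because startup and transfers dominate. With cheaper
transfers or much heavier per-tile computation the balance can go the
other way, so the only honest answer is to time the whole chain.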


M


2018-05-24 9:11 GMT+02:00 Laura Poggio <laura.poggio at gmail.com>:
> Hi Massi,
> using multiple single instances of GRASS had advantages (in our workflow)
> when tiling: each tile was sent in its own mapset to a different instance
> for processing.
> I am aware that this can be done locally on HPC. However, doing this in the
> cloud had the advantage (for us) of letting us use many more instances than
> the cores available locally.
>
> I think you are right that I/O operations and concurrent database
> operations will probably be slower, but our workflow focuses mainly on
> raster operations and integrated GRASS / R models. If these operations can
> be tiled, then there are advantages in doing so on different instances
> when one does not have access to enough local cores.
>
> I am trying to tidy up the workflow we used so that I can share it, and I
> am looking forward to seeing other workflows.
>
> Thanks
>
> Laura
>
> On 23 May 2018 at 21:08, Massi Alvioli <nocharge at gmail.com> wrote:
>>
>> Hi Laura,
>>
>> well, not actually - it does not answer my question. I mean, I am
>> pretty sure one can have GRASS up and running on some cloud instance,
>> but the point is: when it comes to performance, is it worth it? I
>> mean multi-process performance, of course. There is not much point in
>> running single GRASS instances, except for very peculiar applications,
>> right? I bet it is not worth it, on any level, whether we look at
>> I/O operations or mapcalc operations, not to mention concurrent
>> database operations ... I might be wrong, of course. But my experience
>> with cloud environments and parallel processing was rather
>> disappointing. On an unrelated problem (I mean, not GRASS-related),
>> I tried something here https://doi.org/10.30437/ogrs2016_paper_08,
>> with little success. I can't imagine a reason why it should be
>> different using GRASS modules, while I found undoubtedly good
>> performance on HPC machines.
>>
>> M
>>
>> 2018-05-23 16:35 GMT+02:00 Laura Poggio <laura.poggio at gmail.com>:
>> > Hi Massi,
>> > we managed to run GRASS on different single-core instances on a cloud
>> > provider. It was a bit tricky (initially) to set up the NFS mount
>> > points. I am still exploring the different types of storage possible
>> > and what would be cheaper and more efficient.
>> >
>> > I hope this answers your question.
>> >
>> > Once the workflow is more stable I hope I will be able to share it more
>> > widely.
>> >
>> > Thanks
>> >
>> > Laura
>> >
>> > On 23 May 2018 at 14:37, Massi Alvioli <nocharge at gmail.com> wrote:
>> >>
>> >> Hi Laura,
>> >>
>> >> the effort on cloud providers is probably useless. Was it different in
>> >> your case?
>> >>
>> >>
>> >> M
>> >>
>> >> 2018-05-22 10:12 GMT+02:00 Laura Poggio <laura.poggio at gmail.com>:
>> >> > I am really interested in this. I am experimenting with different
>> >> > settings to use GRASS on HPC, more specifically on multi-core local
>> >> > machines and on single-core multiple instances on a cloud provider.
>> >> > It would be great to share experiences with other people fighting
>> >> > the same problems.
>> >> >
>> >> > Thanks
>> >> >
>> >> > Laura
>> >> >
>> >> > On 20 May 2018 at 12:32, Moritz Lennert
>> >> > <mlennert at club.worldonline.be>
>> >> > wrote:
>> >> >>
>> >> >> On Sun, 20 May 2018 09:30:53 +0200,
>> >> >> Nikos Alexandris <nik at nikosalexandris.net> wrote:
>> >> >>
>> >> >> > * Massi Alvioli <nocharge at gmail.com> [2018-05-17 15:01:39 +0200]:
>> >> >> >
>> >> >> > >2018-05-17 10:09 GMT+02:00 Moritz Lennert
>> >> >> > ><mlennert at club.worldonline.be>:
>> >> >> > >
>> >> >> > >Hi,
>> >> >> > >
>> >> >> > >> [I imagine your mail was supposed to go onto the mailing list
>> >> >> > >> and not just to me...]
>> >> >> > >
>> >> >> > >sure my answer was for everyone to read, I believe I tried to
>> >> >> > >send it again afterwards.. something must have gone wrong.
>> >> >> > >
>> >> >> > >> I just presented GRASS and a short overview of GRASS on HPC
>> >> >> > >> yesterday at the FOSS4G-FR and there was a lot of interest in
>> >> >> > >> this. Several people asked me about specific documentation on
>> >> >> > >> the subject.
>> >> >> > >
>> >> >> > >What we did with GRASS + HPC was for specific production
>> >> >> > >purposes, and no documentation whatsoever was created,
>> >> >> > >basically due to lack of time, so I find it hard to say
>> >> >> > >whether this is going to change in the near future :).
>> >> >> > >Surely the topic is of wide interest and worth discussing
>> >> >> > >in several contexts.
>> >> >> > >
>> >> >> > >> Currently, I'm aware of the following wiki pages which each
>> >> >> > >> potentially touches on some aspects of HPC:
>> >> >> > >
>> >> >> > >I must admit that existing documentation/papers did not help
>> >> >> > >much. Well, did not help at all, actually.
>> >> >> > >One major problem in my opinion/experience is that
>> >> >> > >multi-core/multi-node machines can be really different from
>> >> >> > >each other, and parallelization strategies very
>> >> >> > >purpose-specific, so that creating general-purpose
>> >> >> > >documents/papers, or even software, *may* be a hopeless
>> >> >> > >effort. Smart ideas are most welcome, of course :)
>> >> >> >
>> >> >> > Dear Massimo and all,
>> >> >> >
>> >> >> > Being a beginner in massively processing Landsat 8 images using
>> >> >> > JRC's JEODPP system (which is designed for High-Throughput,
>> >> >> > https://doi.org/10.1016/j.future.2017.11.007), I found useful
>> >> >> > notes in the Wiki (notably Veronica's excellent tutorials) and
>> >> >> > elsewhere, got specific answers through the mailing lists and
>> >> >> > learned a lot in on-site discussions during the last OSGeo
>> >> >> > sprint, for example.
>> >> >> >
>> >> >> > Nonetheless, I think I have learned quite a few things the hard
>> >> >> > way. In this regard, answers to even "nonsense" questions are
>> >> >> > worth documenting.
>> >> >> >
>> >> >> > My aim is to pass on notes of practical value. Having HPC- and
>> >> >> > HTC-related notes in a wiki will help people get started,
>> >> >> > promote best practices, help them learn from common mistakes and
>> >> >> > give an overview of the points Peter put in this thread's first
>> >> >> > message.
>> >> >>
>> >> >> +1
>> >> >>
>> >> >> >
>> >> >> > I hope it's fine to name the page "High Performance Computing".
>> >> >> > Please advise or create a page with another name if you think
>> >> >> > otherwise.
>> >> >>
>> >> >>
>> >> >> +1
>> >> >>
>> >> >> Moritz
>> >> >> _______________________________________________
>> >> >> grass-user mailing list
>> >> >> grass-user at lists.osgeo.org
>> >> >> https://lists.osgeo.org/mailman/listinfo/grass-user
>> >> >
>> >> >
>> >> >
>> >
>> >
>
>

