[GRASS-user] Global overview of GRASS HPC resources/installations ?

Massi Alvioli nocharge at gmail.com
Thu May 24 11:07:16 PDT 2018


in any case, if you have examples in which the cloud is convenient,
there is a case worth writing about in a paper, so we should converge on
this one ;)


M

2018-05-24 17:40 GMT+02:00 Laura Poggio <laura.poggio at gmail.com>:
> Hi Massi,
> I can see we live in a quite different "computational" world :-)
> I will try to further answer your questions below. I agree completely
> that if you have access to a good HPC then cloud providers are probably
> not so needed, provided the infrastructure fits (and even goes beyond)
> your needs and is well maintained. I also think that from the point of
> view of setting up GRASS the two approaches are not so different, and it
> will be interesting to see the development and comparison of the
> different set-ups.
>
> Laura
>
> On 24 May 2018 at 12:12, Massi Alvioli <nocharge at gmail.com> wrote:
>>
>> even if you can tile up your problem - which probably covers 95% of
>> the parallelization one can do in GRASS - you still have the problem
>> of instantiating the cloud machines,
>
>
> Scriptable: once the instance template is ready it takes a few seconds
> to launch the machines (even hundreds of them).
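>
> A minimal sketch of what that launch step might look like, assuming the
> Google Cloud CLI and a pre-built instance template; the template name,
> zone and worker count below are placeholders, not a real setup:
>
>     # launch_workers.py - start N workers from a prepared instance template
>     import subprocess
>
>     N_WORKERS = 20
>     names = [f"grass-worker-{i:03d}" for i in range(N_WORKERS)]
>
>     # gcloud accepts several instance names in a single call
>     subprocess.run(
>         ["gcloud", "compute", "instances", "create", *names,
>          "--source-instance-template", "grass-worker-template",
>          "--zone", "europe-west1-b"],
>         check=True)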
>
>> copying the data to the multiple instances, gathering back the results
>> and patching them into your final result - the order of the last two
>> steps is your choice - and
>
>
> This is where I am still exploring the most efficient solution. There are
> storage options to avoid or reduce the need to copy the data to/from the
> instances.
>
>>
>> I expect all of these operations to be much slower going through the
>> cloud than on any other architecture.
>>
>> The overall processing time is what matters, from importing initial
>> data to having the final result available.
>
>
> You are right. However, I think it depends on whether the larger data are
> on your own premises or online (e.g. remote sensing images). For example,
> for me it is much faster to download an image on an online instance,
> especially when the files are already stored on the provider's servers.
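>
> As a rough illustration of that step (the bucket name and scene path
> below are placeholders, not real objects), pulling a scene that is
> already hosted by the provider straight onto the instance can be a
> single call:
>
>     # fetch_scene.py - copy a provider-hosted scene onto this instance
>     import os
>     import subprocess
>
>     os.makedirs("/data/scenes", exist_ok=True)
>     scene = "gs://provider-public-landsat/path/to/LC08_scene.tar.gz"
>     subprocess.run(["gsutil", "cp", scene, "/data/scenes/"], check=True)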
>
>>
>> Of course, if the cloud is the only viable way of getting multiple
>> cores, there is no way out. It is also true that everybody owns a
>> couple of desktop machines with a few tens of computing cores
>> overall ...
>>
>
> To set up a cluster with spare desktops you need to follow IT policies,
> and sometimes they are not so easy to adapt. In my opinion, it is also
> not so easy to set up a proper cluster, with shared storage, backups,
> fast network connections, etc.
>
>
>>
>> M
>>
>>
>> 2018-05-24 9:11 GMT+02:00 Laura Poggio <laura.poggio at gmail.com>:
>> > Hi Massi,
>> > using multiple single GRASS instances had advantages (in our workflow)
>> > when tiling: each tile was sent in its own mapset to a different
>> > instance for processing.
>> > I am aware that this can be done on HPC locally. However, doing this
>> > on the cloud had the advantage (for us) of being able to use many more
>> > instances than the cores available locally.
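>> >
>> > A stripped-down sketch of that per-tile pattern, as each instance might
>> > run it, assuming a GRASS 7 installation with the --exec interface; the
>> > location path, tile name, saved region and the r.mapcalc expression are
>> > made up for illustration:
>> >
>> >     # process_tile.py - run one tile in its own mapset via `grass --exec`
>> >     import subprocess
>> >
>> >     location = "/data/grassdata/utm32n"
>> >     tile = "tile_017"               # e.g. passed to this instance
>> >     mapset = f"{location}/{tile}"
>> >
>> >     # -c creates the mapset on first use; this tile writes only there
>> >     subprocess.run(
>> >         ["grass", "-c", mapset, "--exec",
>> >          "g.region", f"region={tile}@PERMANENT"],
>> >         check=True)
>> >     subprocess.run(
>> >         ["grass", mapset, "--exec", "r.mapcalc",
>> >          f"expression=out_{tile} = elevation@PERMANENT * 2"],
>> >         check=True)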
>> >
>> > I think you are right and I/O operations and concurrent database
>> > operations will probably be slower, but our workflow focuses mainly on
>> > raster operations and integrated GRASS / R models. If these operations
>> > can be tiled, then there are advantages in doing so on different
>> > instances, when one does not have access to enough local cores.
>> >
>> > I am trying to tidy up the workflow we used so that I can share it.
>> > And I am looking forward to seeing other workflows.
>> >
>> > Thanks
>> >
>> > Laura
>> >
>> > On 23 May 2018 at 21:08, Massi Alvioli <nocharge at gmail.com> wrote:
>> >>
>> >> Hi Laura,
>> >>
>> >> well, not actually - it does not answer my question. I mean, I am
>> >> pretty sure one can have GRASS up and running on some cloud instance,
>> >> but the point is: when it comes to performance, is that convenient? I
>> >> mean multi-process performance, of course. There is not much point in
>> >> running single GRASS instances, if not for very peculiar applications,
>> >> right? I bet it is not convenient on any level, whether we look at
>> >> I/O operations or mapcalc operations, not to mention concurrent
>> >> database operations ... I might be wrong, of course. But my experience
>> >> with cloud environments and parallel processing was rather
>> >> disappointing. On an unrelated problem (I mean, not GRASS-related),
>> >> I tried something here https://doi.org/10.30437/ogrs2016_paper_08,
>> >> with little success. I can't imagine a reason why it should be
>> >> different using GRASS modules, while I found undoubtedly good
>> >> performance on HPC machines.
>> >>
>> >> M
>> >>
>> >> 2018-05-23 16:35 GMT+02:00 Laura Poggio <laura.poggio at gmail.com>:
>> >> > Hi Massi,
>> >> > we managed to run GRASS on different single-core instances on a
>> >> > cloud provider. It was a bit tricky (initially) to set up the NFS
>> >> > mount points. I am still exploring the different types of storage
>> >> > possible and what would be cheaper and more efficient.
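>> >> >
>> >> > For illustration only, the mount step each worker would need is
>> >> > roughly the following; the NFS server address and export path are
>> >> > placeholders:
>> >> >
>> >> >     # mount_grassdata.py - mount the shared GRASS database over NFS
>> >> >     import os
>> >> >     import subprocess
>> >> >
>> >> >     os.makedirs("/data/grassdata", exist_ok=True)
>> >> >     # needs root privileges on the worker
>> >> >     subprocess.run(["mount", "-t", "nfs",
>> >> >                     "10.0.0.2:/export/grassdata",
>> >> >                     "/data/grassdata"],
>> >> >                    check=True)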
>> >> >
>> >> > I hope this answers your question.
>> >> >
>> >> > Once the workflow is more stable I hope I will be able to share it
>> >> > more
>> >> > widely.
>> >> >
>> >> > Thanks
>> >> >
>> >> > Laura
>> >> >
>> >> > On 23 May 2018 at 14:37, Massi Alvioli <nocharge at gmail.com> wrote:
>> >> >>
>> >> >> Hi Laura,
>> >> >>
>> >> >> the effort on cloud providers is probably useless. Was it different
>> >> >> in
>> >> >> your case?
>> >> >>
>> >> >>
>> >> >> M
>> >> >>
>> >> >> 2018-05-22 10:12 GMT+02:00 Laura Poggio <laura.poggio at gmail.com>:
>> >> >> > I am really interested in this. I am experimenting with different
>> >> >> > settings to use GRASS on HPC, more specifically on multi-core
>> >> >> > local machines and on multiple single-core instances on a cloud
>> >> >> > provider. It would be great to share experiences with other
>> >> >> > people fighting the same problems.
>> >> >> >
>> >> >> > Thanks
>> >> >> >
>> >> >> > Laura
>> >> >> >
>> >> >> > On 20 May 2018 at 12:32, Moritz Lennert
>> >> >> > <mlennert at club.worldonline.be>
>> >> >> > wrote:
>> >> >> >>
>> >> >> >> On Sun, 20 May 2018 09:30:53 +0200,
>> >> >> >> Nikos Alexandris <nik at nikosalexandris.net> wrote:
>> >> >> >>
>> >> >> >> > * Massi Alvioli <nocharge at gmail.com> [2018-05-17 15:01:39
>> >> >> >> > +0200]:
>> >> >> >> >
>> >> >> >> > >2018-05-17 10:09 GMT+02:00 Moritz Lennert
>> >> >> >> > ><mlennert at club.worldonline.be>:
>> >> >> >> > >
>> >> >> >> > >Hi,
>> >> >> >> > >
>> >> >> >> > >> [I imagine your mail was supposed to go onto the mailing
>> >> >> >> > >> list
>> >> >> >> > >> and
>> >> >> >> > >> not just to me...]
>> >> >> >> > >
>> >> >> >> > >sure my answer was for everyone to read, I believe I tried to
>> >> >> >> > >send it again afterwards... something must have gone wrong.
>> >> >> >> > >
>> >> >> >> > >> I just presented GRASS and a short overview of GRASS on HPC
>> >> >> >> > >> yesterday at FOSS4G-FR and there was a lot of interest in
>> >> >> >> > >> this. Several people asked me about specific documentation
>> >> >> >> > >> on the subject.
>> >> >> >> > >
>> >> >> >> > >What we did about GRASS + HPC was for specific production
>> >> >> >> > >purposes and no documentation whatsoever was created, basically
>> >> >> >> > >due to lack of time... so I find it hard to say whether this is
>> >> >> >> > >going to change in the near future :). Surely the topic is of
>> >> >> >> > >wide interest and worth being discussed in several contexts.
>> >> >> >> > >
>> >> >> >> > >> Currently, I'm aware of the following wiki pages, each of
>> >> >> >> > >> which potentially touches on some aspects of HPC:
>> >> >> >> > >
>> >> >> >> > >I must admit that existing documentation/papers did not help
>> >> >> >> > >much. Well, did not help at all, actually. One major problem in
>> >> >> >> > >my opinion/experience is that multi-core/multi-node machines
>> >> >> >> > >can be really different from each other, and parallelization
>> >> >> >> > >strategies very purpose-specific, so that creating
>> >> >> >> > >general-purpose documents/papers, or even software, *may* be a
>> >> >> >> > >hopeless effort. Smart ideas are most welcome, of course :)
>> >> >> >> >
>> >> >> >> > Dear Massimo and all,
>> >> >> >> >
>> >> >> >> > Being a beginner in massively processing Landsat 8 images using
>> >> >> >> > JRC's JEODPP system (which is designed for High-Throughput
>> >> >> >> > computing, https://doi.org/10.1016/j.future.2017.11.007), I
>> >> >> >> > found useful notes in the Wiki (notably Veronica's excellent
>> >> >> >> > tutorials) and elsewhere, got specific answers through the
>> >> >> >> > mailing lists and learned a lot in on-site discussions during
>> >> >> >> > the last OSGeo sprint, for example.
>> >> >> >> >
>> >> >> >> > Nonetheless, I think I have learned quite a few things the hard
>> >> >> >> > way. In this regard, even some answers to "non-sense" questions
>> >> >> >> > are worth documenting.
>> >> >> >> >
>> >> >> >> > My aim is to transfer notes of practical value. Having HPC- and
>> >> >> >> > HTC-related notes in a wiki will help people get started,
>> >> >> >> > promote best practices, learn from common mistakes and give an
>> >> >> >> > overview of the points Peter put in this thread's first message.
>> >> >> >>
>> >> >> >> +1
>> >> >> >>
>> >> >> >> >
>> >> >> >> > I hope it's fine to name the page "High Performance Computing".
>> >> >> >> > Please advise or create a page with another name if you think
>> >> >> >> > otherwise.
>> >> >> >>
>> >> >> >>
>> >> >> >> +1
>> >> >> >>
>> >> >> >> Moritz
>> >> >> >
>> >> >> >
>> >> >> >
>> >> >
>> >> >
>> >
>> >
>
>


More information about the grass-user mailing list