[GRASS-dev] Adding an expert mode to the parser

Sören Gebbert soerengebbert at googlemail.com
Wed Sep 28 15:03:04 PDT 2016


Hi,

[snip]
> >
> > As an example, when aiming at processing all Sentinel-2 tiles
> > globally, we currently speak about 73000 scenes * up to 16 tiles each,
> > along with global data. Analysis on top of other global data is more
> > complex when running each job in its own mapset and reintegrating the
> > results into a single target mapset than it would be if the jobs could
> > be processed in parallel in one mapset by simply specifying the
> > respective region to the command of interest. Yes, different from the
> > current paradigm and not for G7.
>
> from our common experience, I would say that creating separate mapsets
> is a safety feature. If anything goes wrong with that particular
> processing chain, cleaning up is easy, simply delete this particular
> mapset and run the job again, if possible on a different host/node
> (assuming that failed jobs are logged). Anyway, I would be surprised
> if the overhead of opening a separate mapset is measurable when
> processing all Sentinel-2 tiles globally. Reintegration into a single
> target mapset could cause problems with regard to IO saturation, but
> in an HPC environment, temporary data always need to be copied to a
> final target location at some stage. The HPC system you are using now
> is most probably quite different from the one we used previously, so
> this is a lot of guessing, particularly about the storage location of
> temporary data (no matter if it is the same mapset or a separate
> mapset).
>

Imagine you have a tool that is able to distribute the processing of a
large time series of satellite images across a cluster. Each node in the
cluster should process a stack of r.mapcalc, r.series or r.neighbors
commands in a local temporary mapset, which is later merged into a single
target mapset. A single stack of commands may contain hundreds of jobs that
all run in one temporary mapset. In this scenario you need separate region
settings for each command in the stack, because the satellite images have
different spatial extents. The size of the stack depends on the size of the
time series (number of maps) and the number of available nodes.
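
For what it is worth, something close to this is already possible today by
handing each command its own region through the GRASS_REGION environment
variable. Below is a minimal Python sketch (grass.script), assuming a
session is already running in the local temporary mapset; the job list,
map names and extents are invented for illustration:

import os
import grass.script as gs

# Hypothetical job stack: one entry per satellite scene, each with its
# own spatial extent and resolution.
jobs = [
    {"output": "ndvi_001", "n": 227000, "s": 217000,
     "e": 645000, "w": 635000, "res": 10},
    {"output": "ndvi_002", "n": 237000, "s": 227000,
     "e": 655000, "w": 645000, "res": 10},
]

for job in jobs:
    env = os.environ.copy()
    # region_env() returns a region definition that modules read from
    # GRASS_REGION instead of the mapset's WIND file, so concurrent jobs
    # do not interfere with each other's region settings.
    env["GRASS_REGION"] = gs.region_env(n=job["n"], s=job["s"],
                                        e=job["e"], w=job["w"],
                                        res=job["res"])
    gs.run_command("r.mapcalc",
                   expression="%s = (nir - red) / (nir + red)" % job["output"],
                   env=env)

This still requires the calling script to build a separate environment for
every single command, which is exactly the boilerplate a parser-level
region option could remove.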

Having region options in the parser would make the implementation of such
an approach much easier. Commands like t.rast.algebra and t.rast.neighbors
would directly benefit from a parser-level region option, allowing the
parallel processing of satellite image time series on a multi-core machine.
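
To make the intent concrete, a purely hypothetical call with such a parser
option could look like the following from Python; the region parameter
shown here does not exist in GRASS 7 and is only meant to illustrate the
proposal:

gs.run_command("r.neighbors",
               input="scene_001",
               output="scene_001_smooth",
               size=5,
               # hypothetical parser-level option, not in GRASS 7:
               region="n=227000,s=217000,e=645000,w=635000,res=10")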

Best regards
Soeren


> To be continued in a GRASS+HPC thread?
>
> Markus M
>
> >
> > But my original comment was targeted at the increasing number of
> > module parameters and how to handle that (with a new HPC-related
> > idea as an example).
> >
> > I'm fine with archiving this question for now; it will likely come up
> > again in the future.
> >
> > markusN

