[GRASS-dev] Using GRASS in long running and multithreaded applications Was: Re: The tomcat shut down ...

Fri Sep 25 21:11:15 EDT 2009

Soeren Gebbert wrote:

> >> Another approach might be the implementation of reset functions in
> >> every source file which uses static variables. The reset functions
> >> can be called individually for a fine-grained reset, or all together
> >> through a single reset function which collects all of the
> >> distributed reset functions.
> >>
> >> i.e:
> >> New reset functions like:
> >>
> >> G_reset_geodesic_equation in lib/gis/geodesic.c
> >> G_reset_window in lib/gis/get_window.c
> >> G_reset_cell_area_calculations in lib/gis/area.c
> >>
> >> And so on ...
> >
> > That will get around the namespace issue.
> >
> > However, it doesn't really help with multithreading. For that, you
> 
> Indeed.
> 
> > would want to store all of the static data in a library in a state
> > structure, and give each thread a pointer to a separate state
> > structure (e.g. pthread_setspecific).
> 
> IMHO this approach will result in a re-design and re-write of most of
> the GRASS libraries and an update of more than 300 modules ...
> Looks like a 2 1/2 year full-time job for one developer ... any
> sponsors available? :)

It will require refactoring the libraries, but it doesn't require any
interface changes.

I'm not suggesting making all of the functions take a pointer to the
state as a parameter, just making it thread-local.
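
A minimal sketch of what "thread-local" could look like with POSIX
threads, assuming a made-up state structure. The names lib_state,
get_state, lib_set_north and lib_get_north are hypothetical and are not
GRASS functions; the point is only that the public signatures stay the
same while the storage moves behind pthread_getspecific().

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical per-thread library state; not an actual GRASS structure. */
struct lib_state {
    double window_north;    /* stands in for any former static variable */
};

static pthread_key_t state_key;
static pthread_once_t state_key_once = PTHREAD_ONCE_INIT;

static void make_state_key(void)
{
    /* free() is run on a thread's state when that thread exits */
    int rc = pthread_key_create(&state_key, free);

    assert(rc == 0);
}

/* Returns the calling thread's private state, allocating it on first use. */
static struct lib_state *get_state(void)
{
    struct lib_state *st;

    pthread_once(&state_key_once, make_state_key);
    st = pthread_getspecific(state_key);
    if (st == NULL) {
        st = calloc(1, sizeof(*st));
        assert(st != NULL);
        pthread_setspecific(state_key, st);
    }
    return st;
}

/* Public functions keep their old signatures; only the storage changes. */
void lib_set_north(double north)
{
    get_state()->window_north = north;
}

double lib_get_north(void)
{
    return get_state()->window_north;
}

/* Demonstration helper: a second thread writes its own value. */
static void *worker(void *arg)
{
    lib_set_north(*(double *)arg);
    return NULL;
}

/* Returns 1 if the caller's value survives a write made by another thread. */
int demo_isolated(void)
{
    pthread_t tid;
    double other = 99.0;

    lib_set_north(10.0);
    if (pthread_create(&tid, NULL, worker, &other) != 0)
        return 0;
    pthread_join(tid, NULL);
    return lib_get_north() == 10.0;
}
```

Callers of lib_set_north() and lib_get_north() never see the state
pointer, which is why no interface change is needed.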

> I think implementing the reset functions is the first step towards
> using GRASS in long-running applications.
> Within those applications it must be ensured that all the GRASS library
> functions are called from within the same thread, using a
> producer-consumer pattern.

It shouldn't be necessary to be quite that restrictive. GRASS 7.0 already
includes changes which make it practical to use multiple threads. You
can't e.g. read or write a single map from multiple threads, but
reading and writing different maps in different threads should work,
as should functions which only query the state (first-use
initialisations are protected by a mutex, so you don't need to
worry about concurrent read operations both trying to initialise the
state).
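
As an illustration of mutex-protected first-use initialisation, here is
a sketch under assumed names (ensure_initialized, query_value and the
cached value are invented for the example and are not taken from the
7.0 sources). Several threads may query concurrently, but the
initialisation body runs only once.

```c
#include <assert.h>
#include <pthread.h>

static pthread_mutex_t init_lock = PTHREAD_MUTEX_INITIALIZER;
static int initialized;
static int init_count;          /* counts how often the init body ran */
static double cached_value;

/* Performs the one-time setup; safe to call from any number of threads. */
static void ensure_initialized(void)
{
    pthread_mutex_lock(&init_lock);
    if (!initialized) {
        cached_value = 6378137.0;   /* placeholder for an expensive lookup */
        init_count++;
        initialized = 1;
    }
    pthread_mutex_unlock(&init_lock);
}

/* A read-only query: initialises on first use, then just returns the value. */
double query_value(void)
{
    ensure_initialized();
    return cached_value;
}

static void *query_thread(void *arg)
{
    (void)arg;
    (void)query_value();
    return NULL;
}

/* Starts four concurrent queries and returns how many initialisations ran. */
int demo_concurrent_queries(void)
{
    pthread_t tid[4];
    int i, rc;

    for (i = 0; i < 4; i++) {
        rc = pthread_create(&tid[i], NULL, query_thread, NULL);
        assert(rc == 0);
    }
    for (i = 0; i < 4; i++)
        pthread_join(tid[i], NULL);
    return init_count;
}
```

The sketch always takes the lock rather than testing the flag first,
which is slower but avoids the pitfalls of double-checked locking.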

However, the error handling is probably a bigger issue. Pushing error
handling onto the modules isn't an acceptable solution.

Simply allowing the fatal error handler to longjmp() out then resume
using the GRASS libraries would be non-trivial, as you would have to
repair any inconsistencies in the library state.
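
To make the problem concrete, here is a sketch using setjmp()/longjmp()
with a stand-in error handler. None of these names are GRASS API:
fatal_error, library_operation and open_handles are hypothetical. The
jump gets control back to the caller, but whatever the library had
modified before the failure is left as it was.

```c
#include <assert.h>
#include <setjmp.h>
#include <stdio.h>
#include <stdlib.h>

static jmp_buf recover_point;
static int recover_armed;
static int open_handles;        /* stand-in for internal library state */

/* Stand-in for a fatal error handler: jumps out if a guard is active. */
static void fatal_error(const char *msg)
{
    fprintf(stderr, "ERROR: %s\n", msg);
    if (recover_armed)
        longjmp(recover_point, 1);
    exit(EXIT_FAILURE);
}

/* Modifies state, then either fails or restores the state. */
static void library_operation(int fail)
{
    open_handles++;
    if (fail)
        fatal_error("simulated failure");
    open_handles--;
}

/* Returns 0 on success, -1 if the operation hit a fatal error. */
int guarded_call(int fail)
{
    assert(!recover_armed);     /* nested guards are not handled here */
    if (setjmp(recover_point) != 0) {
        recover_armed = 0;
        return -1;
    }
    recover_armed = 1;
    library_operation(fail);
    recover_armed = 0;
    return 0;
}

/* Non-zero after a failed call: the cleanup step was skipped by the jump. */
int leaked_handles(void)
{
    return open_handles;
}
```

After guarded_call(1), leaked_handles() reports one handle that nothing
will release. That is the kind of inconsistency which would have to be
repaired before the library could safely be used again.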

Allowing G_fatal_error() to return is enough work that it can probably
be ruled out. Apart from changing every single call (I count 520
references in lib/*), almost every public function would need two
versions: one which returns an error code and one which treats errors
as fatal (i.e. only returns upon success).
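
A sketch of what such a pair might look like for one small function,
assuming invented names (try_parse_rows, parse_rows and fatal_error are
hypothetical, not existing library functions):

```c
#include <assert.h>
#include <limits.h>
#include <stdio.h>
#include <stdlib.h>

/* Stand-in for a fatal error handler which never returns. */
static void fatal_error(const char *msg)
{
    fprintf(stderr, "ERROR: %s\n", msg);
    exit(EXIT_FAILURE);
}

/* Version 1: reports failure to the caller.
 * Returns 0 and stores the result on success, -1 on error. */
int try_parse_rows(const char *text, int *rows)
{
    char *end;
    long value;

    assert(text != NULL && rows != NULL);
    value = strtol(text, &end, 10);
    if (end == text || *end != '\0' || value <= 0 || value > INT_MAX)
        return -1;
    *rows = (int)value;
    return 0;
}

/* Version 2: treats errors as fatal, so it only returns upon success. */
int parse_rows(const char *text)
{
    int rows;

    if (try_parse_rows(text, &rows) != 0)
        fatal_error("invalid row count");
    return rows;
}
```

Multiply that by almost every public function, plus an audit of each
caller to decide which version it needs, and the size of the job
becomes clear.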

> Glynn, if I remember correctly you described a similar approach for a
> multi-threaded raster library some time ago. Are there any plans to
> implement such a design in the GRASS 7 raster lib?
> I have no experience with such an approach, but I would like to
> realize it in a Java application ... someday. :)

Built-in multi-threading is currently "blue sky", i.e. I know what the
main issues are but I don't have any plan to solve them.

The main issue for concurrent reading is that the raster library
caches the current row, so that upscaling doesn't read and decode each
row multiple times. That's problematic if you want multiple threads
reading the same map.
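
A simplified sketch of a single-row cache, assuming a made-up map
structure (struct map, map_get_row and decode_row are hypothetical and
far simpler than the real raster library). When upscaling, consecutive
output rows map to the same source row, so the cache saves repeated
decoding. Because the cache is one mutable slot per open map, two
threads asking for different rows of the same map would overwrite each
other's cached row.

```c
#include <assert.h>
#include <string.h>

#define COLS 4

/* Hypothetical open-map handle with a one-row cache. */
struct map {
    int rows;
    int cached_row;         /* -1 when nothing is cached */
    int cache[COLS];
    int decode_count;       /* how many rows were actually decoded */
};

void map_open(struct map *m, int rows)
{
    m->rows = rows;
    m->cached_row = -1;
    m->decode_count = 0;
    memset(m->cache, 0, sizeof(m->cache));
}

/* Stand-in for reading and decompressing one row from disk. */
static void decode_row(struct map *m, int row, int *buf)
{
    int col;

    for (col = 0; col < COLS; col++)
        buf[col] = row * COLS + col;
    m->decode_count++;
}

/* Copies a row into out, decoding it only if it is not the cached row. */
void map_get_row(struct map *m, int row, int *out)
{
    assert(row >= 0 && row < m->rows);
    if (m->cached_row != row) {
        decode_row(m, row, m->cache);
        m->cached_row = row;
    }
    memcpy(out, m->cache, sizeof(m->cache));
}

/* Reads out_rows output rows from a map with map_rows source rows and
 * returns the number of decode operations that were needed. */
int upscale_decodes(int map_rows, int out_rows)
{
    struct map m;
    int buf[COLS];
    int r;

    map_open(&m, map_rows);
    for (r = 0; r < out_rows; r++)
        map_get_row(&m, r * map_rows / out_rows, buf);
    return m.decode_count;
}
```

Reading 12 output rows from a 3-row map decodes each source row once
instead of four times. Making map_get_row() safe for concurrent readers
would need either a lock around the cache or a cache per thread.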

-- 
Glynn Clements <glynn at gclements.plus.com>