[GRASS-dev] Using GRASS in long running and multithreaded applications Was: Re: The tomcat shut down ...

Soeren Gebbert soerengebbert at googlemail.com
Sat Sep 26 16:34:16 EDT 2009


Hello Glynn,

2009/9/26 Glynn Clements <glynn at gclements.plus.com>:
>
> Soeren Gebbert wrote:
>
>> >> Another approach might be the implementation of reset functions in
>> >> every source file which uses
>> >> static variables. The reset functions can be called individually for a
>> >> fine-grained reset, or all together
>> >> through a single reset function which collects all distributed reset functions.
>> >>
>> >> i.e:
>> >> New reset functions like:
>> >>
>> >> G_reset_geodesic_equation in lib/gis/geodesic.c
>> >> G_reset_window in lib/gis/get_window.c
>> >> G_reset_cell_area_calculations in lib/gis/area.c
>> >>
>> >> And so on ...
>> >
>> > That will get around the namespace issue.
>> >
>> > However, it doesn't really help with multithreading. For that, you
>>
>> Indeed.
>>
>> > would want to store all of the static data in a library in a state
>> > structure, and give each thread a pointer to a separate state
>> > structure (e.g. pthread_setspecific).
>>
>> IMHO this approach will result in a re-design and re-write of most of
>> the grass libraries
>> and an update of more than 300 modules ... .
>> Looks like a 2 1/2 years full time job for one developer ... any
>> sponsors available? :)
>
> It will require refactoring the libraries, but it doesn't require any
> interface changes.
>
> I'm not suggesting making all of the functions take a pointer to the
> state as a parameter, just making it thread-local.

Ok.
To my shame I have to admit that I had never heard of the thread-local
mechanism before.
After a quick look at Wikipedia I understand the principle and it sounds great!
This will speed things up a lot.
I guess we need to use the pthread version of thread-local storage to
support compilers other than gcc, and Windows too?
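
For illustration, here is a minimal sketch of the thread-local idea using the
POSIX pthread key API (pthread_key_create, pthread_getspecific,
pthread_setspecific). The struct and the function names (lib_state,
set_cell_area, get_cell_area, demo_isolation) are hypothetical and not part
of the GRASS API; the point is only that the public functions take no state
parameter, yet each thread sees its own copy of the former static data.

```c
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical per-thread state: stands in for what would otherwise be
 * file-scope static variables in a library source file. */
struct lib_state {
    int initialized;
    double cell_area;
};

static pthread_key_t state_key;
static pthread_once_t state_key_once = PTHREAD_ONCE_INIT;

static void make_state_key(void)
{
    /* free() is the destructor: it runs at thread exit for every
     * thread that stored a non-NULL value under this key. */
    pthread_key_create(&state_key, free);
}

/* Return the calling thread's private state, allocating it on first use.
 * Returns NULL if allocation fails. */
static struct lib_state *get_state(void)
{
    struct lib_state *st;

    pthread_once(&state_key_once, make_state_key);
    st = pthread_getspecific(state_key);
    if (st == NULL) {
        st = calloc(1, sizeof(*st));
        if (st != NULL && pthread_setspecific(state_key, st) != 0) {
            free(st);
            st = NULL;
        }
    }
    return st;
}

/* Public functions keep a state-free signature. */
void set_cell_area(double area)
{
    struct lib_state *st = get_state();

    if (st != NULL) {
        st->cell_area = area;
        st->initialized = 1;
    }
}

double get_cell_area(void)
{
    struct lib_state *st = get_state();

    return (st != NULL && st->initialized) ? st->cell_area : 0.0;
}

static void *worker(void *arg)
{
    (void)arg;
    set_cell_area(2.0);   /* only touches the worker thread's copy */
    return NULL;
}

/* Returns 1 if the caller's value survives a write from another thread. */
int demo_isolation(void)
{
    pthread_t t;

    set_cell_area(1.0);
    if (pthread_create(&t, NULL, worker, NULL) != 0)
        return 0;
    pthread_join(t, NULL);
    return get_cell_area() == 1.0;
}
```

The sketch assumes a platform with POSIX threads; on other platforms an
equivalent thread-specific storage facility would be needed.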

>
>> I think implementing the reset functions is the first step to use
>> grass in long running applications.
>> Within those application it must be assured that all the grass library
>> functions are called from
>> within the same thread, using a producer-consumer pattern.
>
> It shouldn't be necessary to be quite that restrictive. 7.0 already
> includes changes which make it practical to use multiple threads. You
> can't e.g. read or write a single map from multiple threads, but
> reading and writing different maps in different threads should work,
> as should functions which only query the state (first-use
> initialisations are protected by a mutex, so you don't need to
> worry about concurrent read operations both trying to initialise the
> state).

Great news, I will test this with some Java code soon. For this purpose
I have ported the vtkGRASSBridge to grass7.
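
As an illustration of the "first-use initialisation protected by a mutex"
pattern mentioned above, here is a minimal sketch with pthreads. It is not
taken from the GRASS 7 sources; the names (ensure_initialized, query_rows,
demo_concurrent_queries) and the value 100 are made up for the example.

```c
#include <pthread.h>

static pthread_mutex_t init_lock = PTHREAD_MUTEX_INITIALIZER;
static int initialized = 0;
static int init_count = 0;   /* counts how often the setup actually ran */
static int window_rows;      /* hypothetical lazily initialised state */

/* Run the one-time setup exactly once, no matter how many threads
 * arrive here at the same time. */
static void ensure_initialized(void)
{
    pthread_mutex_lock(&init_lock);
    if (!initialized) {
        window_rows = 100;   /* stand-in for an expensive setup step */
        init_count++;
        initialized = 1;
    }
    pthread_mutex_unlock(&init_lock);
}

/* A read-only query: safe to call concurrently from several threads. */
int query_rows(void)
{
    ensure_initialized();
    return window_rows;
}

static void *reader(void *arg)
{
    (void)arg;
    query_rows();
    return NULL;
}

/* Start several readers at once and return how many times the
 * initialisation ran. */
int demo_concurrent_queries(void)
{
    pthread_t threads[8];
    int started = 0;
    int i;

    for (i = 0; i < 8; i++) {
        if (pthread_create(&threads[started], NULL, reader, NULL) == 0)
            started++;
    }
    for (i = 0; i < started; i++)
        pthread_join(threads[i], NULL);
    query_rows();   /* the calling thread queries as well */
    return init_count;
}
```

Taking the lock on every query is the simplest correct form; pthread_once()
would be an alternative for the same job.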

>
> However, the error handling is probably a bigger issue. Pushing error
> handling onto the modules isn't an acceptable solution.

Indeed. This is the next issue I wanted to talk about.

> Simply allowing the fatal error handler to longjmp() out then resume
> using the GRASS libraries would be non-trivial, as you would have to
> repair any inconsistencies in the library state.

Is there an alternative to longjmp() and setjmp()? They seem to be quite
complex, and the man page warns about their usage. I have also never used
them before.
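
For readers who, like me, have not used them: a minimal sketch of how a
fatal error handler could longjmp() back to a recovery point set with
setjmp(). The functions fatal_error, library_call and guarded_call are
hypothetical stand-ins, not GRASS functions. The sketch only shows the
control flow; it does nothing to repair inconsistent library state, which
is exactly the hard part described above, and the single static jmp_buf
makes it unsuitable for multiple threads as written.

```c
#include <setjmp.h>
#include <stdio.h>
#include <stdlib.h>

static jmp_buf recover_point;
static int recover_armed = 0;
static char last_error[256];

/* Hypothetical stand-in for a fatal error routine: jump back to the
 * recovery point if one is armed, otherwise terminate the process. */
static void fatal_error(const char *msg)
{
    snprintf(last_error, sizeof(last_error), "%s", msg);
    if (recover_armed)
        longjmp(recover_point, 1);
    fprintf(stderr, "ERROR: %s\n", msg);
    exit(EXIT_FAILURE);
}

/* Hypothetical library function which fails fatally on request. */
static int library_call(int fail)
{
    if (fail)
        fatal_error("cannot open map");
    return 42;
}

/* Call library_call() under a recovery point.
 * Returns 0 on success (value stored in *result), -1 if the call
 * raised a fatal error. */
int guarded_call(int fail, int *result)
{
    if (setjmp(recover_point) != 0) {
        /* We arrive here via longjmp() from fatal_error(). */
        recover_armed = 0;
        return -1;
    }
    recover_armed = 1;
    *result = library_call(fail);
    recover_armed = 0;
    return 0;
}
```

No local variable is modified between setjmp() and longjmp() here, which
avoids one of the pitfalls the man page warns about.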

> Allowing G_fatal_error() to return is enough work that it can probably
> be ruled out. Apart from changing every single call (I count 520
> references in lib/*), almost every public function would need two
> versions: one which returns an error code and one which treats errors
> as fatal (i.e. only returns upon success).

520 calls are indeed a lot. The raster and gis libraries together
have 70 calls, and
the vector and db libraries have 190 calls.
Glynn, if you can point me to a concrete implementation concept, I
would like to start patching the gis, raster, vector and db libraries
in grass7.
Maybe we can use signals to set an error variable in the resume error function?
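
To make the "two versions of every public function" point concrete, here is
a minimal sketch of such a pair. The names try_parse_rows and parse_rows
are invented for the example and do not exist in GRASS; the upper limit of
1000000 is an arbitrary assumption.

```c
#include <stdio.h>
#include <stdlib.h>

/* Error-returning variant: reports failure to the caller.
 * Returns 0 and stores the value in *rows on success, -1 otherwise. */
int try_parse_rows(const char *text, int *rows)
{
    char *end;
    long value = strtol(text, &end, 10);

    /* Reject empty input, trailing garbage and out-of-range values. */
    if (end == text || *end != '\0' || value <= 0 || value > 1000000)
        return -1;
    *rows = (int)value;
    return 0;
}

/* Fatal variant: only returns upon success, otherwise terminates. */
int parse_rows(const char *text)
{
    int rows;

    if (try_parse_rows(text, &rows) != 0) {
        fprintf(stderr, "ERROR: invalid row count <%s>\n", text);
        exit(EXIT_FAILURE);
    }
    return rows;
}
```

A long-running application would call the try_ variant and handle the
error itself, while modules could keep calling the fatal variant unchanged.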

I am focusing on those libraries because I currently have applications in
mind which implement raster and vector algorithms in highly multithreaded
environments (servlet or EJB containers) and high-performance
visualization applications (VTK/ParaView).
I believe that in the future grass can be the backbone of many WPS servers,
if we can get the core libs thread-safe and ready for long-running processes.

>> Glynn, if i remember correctly you described a similar approach for a
>> multi-threaded raster library
>> some time ago. Are there any plans to implement such a design in the
>> grass7 raster lib?
>> I have no experience with such an approach, but i would like to
>> realize it in a Java application ... someday. :)
>
> Built-in multi-threading is currently "blue sky", i.e. I know what the
> main issues are but I don't have any plan to solve them.
>
> The main issue for concurrent reading is that the raster library
> caches the current row, so that upscaling doesn't read and decode each
> row multiple times. That's problematic if you want multiple threads
> reading the same map.

Reading separate raster maps in different threads is already great.
Everything else is icing on the cake.

Best regards
Soeren

>
> --
> Glynn Clements <glynn at gclements.plus.com>
>

