[GRASS-dev] [GRASS GIS] #2134: Create a general exit-safe interface to C libraries
GRASS GIS
trac at osgeo.org
Wed Nov 20 02:12:37 PST 2013
#2134: Create a general exit-safe interface to C libraries
--------------------------------------------------+-------------------------
Reporter: wenzeslaus | Owner: grass-dev@…
Type: enhancement | Status: new
Priority: normal | Milestone: 7.0.0
Component: Python ctypes | Version: svn-trunk
Keywords: G_fatal_error, exit, multiprocessing | Platform: All
Cpu: Unspecified |
--------------------------------------------------+-------------------------
Comment(by huhabla):
Replying to [comment:9 wenzeslaus]:
> Replying to [comment:6 glynn]:
> > Replying to [comment:4 zarch]:
> >
> > > Or as already suggested by Glynn (#1646) wrap the
G_add_error_handler,
> >
> > Using an error handler allows you to avoid process termination. But
once a fatal error has occurred, you cannot safely call any GRASS
function; doing so may well result in a segfault.
>
> One of the issues which `G_add_error_handler` is trying to solve is to
provide a meaningful error message to the user. For example, failing to open
some temporary file causes `exit` with "`No such file /tmp/kjewbf8d38dj`".
This helps neither the user nor the programmer to understand the error
(when it happened, what the stack trace is, what the consequences are, and
how it could be solved). In other words, the message provided by the
`G_fatal_error` caller is sometimes too low level.
>
> Python `RPCServer` with wrapper functions throwing exceptions would help
to solve this issue. But it seems to me that #1646 remains valid for
pyGRASS (and possibly others) and C code itself.
I have re-designed the RPC interface. The Python function wrapper now
returns an exception object together with the result of the function call,
so that the RPC server interface that provides the {{{call()}}} functions
can raise the exception in the parent process (an exception raised in the
subprocess would kill the subprocess and would not be caught in the parent
process). Hence, the Python wrapper functions transform the C-function
return values into meaningful exceptions that are raised in the parent
process.
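A minimal sketch of this idea, not the actual code of the RPC server
script: the wrapper running in the subprocess returns an
(exception, result) pair, and {{{call()}}} in the parent raises the
exception if there is one. The names `fake_open_map`, `open_map_wrapper`
and `RPCClient` are made up for illustration, the C function is replaced by
a pure Python stand-in, and the "fork" start method is assumed (Unix only).

```python
import multiprocessing as mp


def fake_open_map(name):
    """Stand-in for a C function that returns a negative status on failure."""
    return 0 if name == "elevation" else -1


def open_map_wrapper(name):
    """Turn the C-style status into an (exception, result) pair.

    The exception is returned, not raised, so the subprocess stays alive.
    """
    status = fake_open_map(name)
    if status < 0:
        return LookupError("cannot open map %r" % name), None
    return None, status


FUNCTIONS = {"open_map": open_map_wrapper}


def server(conn):
    """Subprocess loop: receive (name, args), send back (exception, result)."""
    while True:
        request = conn.recv()
        if request is None:  # sentinel sent by RPCClient.stop()
            break
        func_name, args = request
        conn.send(FUNCTIONS[func_name](*args))
    conn.close()


class RPCClient:
    """Hypothetical parent-side interface with a blocking call()."""

    def __init__(self):
        ctx = mp.get_context("fork")  # assumption: Unix-like platform
        self.conn, child_conn = ctx.Pipe()
        self.proc = ctx.Process(target=server, args=(child_conn,), daemon=True)
        self.proc.start()
        child_conn.close()  # the parent keeps only its own end of the pipe

    def call(self, func_name, *args):
        self.conn.send((func_name, args))
        exc, result = self.conn.recv()  # wait for the subprocess to answer
        if exc is not None:
            raise exc  # raised here, in the parent process
        return result

    def stop(self):
        self.conn.send(None)
        self.proc.join()
```

With this layout a failing call surfaces as a normal Python exception in the
caller, while the subprocess keeps serving further requests.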
While re-designing, I concluded that a no-wait function call
{{{call_no_wait()}}} is not meaningful when mixed with calls that wait to
receive data. Only a limited number of C-functions neither return values
nor return states. It is better to wait for a function call to finish than
to risk a race condition in case a fatal error occurred meanwhile. An
exception is the messaging interface, which should stay as is.
However, maybe two RPC interfaces are meaningful: one that waits for
functions to return (expecting return values including exceptions) and one
that does not wait?
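To illustrate why waiting helps, here is a rough sketch (again not the real
implementation) in which a blocking call notices that the subprocess died
and restarts it before the next call is sent. `fatal_error()` simulates a C
function that ends in `G_fatal_error` by exiting the process; the class name
`RestartingClient` and the "fork" start method are assumptions.

```python
import multiprocessing as mp
import os


def fatal_error():
    """Stand-in for a C function that calls G_fatal_error(): the process exits."""
    os._exit(1)


def ping():
    return "pong"


FUNCTIONS = {"fatal_error": fatal_error, "ping": ping}


def server(conn):
    """Subprocess loop: receive (name, args) and send back the result."""
    while True:
        try:
            name, args = conn.recv()
        except EOFError:
            break
        conn.send(FUNCTIONS[name](*args))


class RestartingClient:
    """Hypothetical client that restarts the server after a fatal error."""

    def __init__(self):
        self.ctx = mp.get_context("fork")  # assumption: Unix-like platform
        self.restarts = 0
        self._start()

    def _start(self):
        self.conn, child_conn = self.ctx.Pipe()
        self.proc = self.ctx.Process(target=server, args=(child_conn,), daemon=True)
        self.proc.start()
        child_conn.close()  # needed so that recv() sees EOF if the child dies

    def call(self, name, *args):
        self.conn.send((name, args))
        try:
            return self.conn.recv()  # blocks until the answer arrives
        except (EOFError, ConnectionResetError):
            # The subprocess exited while handling this very call.
            self.proc.join()
            self.conn.close()
            self.restarts += 1
            self._start()
            raise RuntimeError("fatal error in %s(), server restarted" % name)

    def stop(self):
        self.proc.terminate()
        self.proc.join()
```

Because every call blocks until it gets an answer or the pipe is closed, the
failure is always attributed to the call that caused it. With a no-wait call
the dead server would only be noticed by some later, unrelated call.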
> Replying to [comment:7 huhabla]:
> > In my opinion the RPC approach is only meaningful for persistent
applications that need fast access to C-library functions, or that need
low level API access for data modification (like digitizing).
>
> And this is not only vector and raster digitizing, this is also the new
scatter plot tool and in fact the whole `g.gui.iclass`, `nviz` (which is
unfortunately more complicated) and of course everything temporal-related
(everything has started to be temporal-related).
>
> > My intention to write the RPC server was to make the temporal
framework usable in persistent applications and to be as fast as possible.
>
> I'm not sure how the speed of `RPCServer` compares to a module call, but
speed is not the only advantage. Fine control of what is called and a
smoother interface (possibly, depending on the wrappers) are further
advantages. Calling a subprocess from the GUI for every single task and
parsing its output is cumbersome.
I have added benchmark runs to the RPC server script, to get an idea of
the performance loss and gain of the RPC interface:
{{{
GRASS 7.0.svn (Test XY):~ > python c_library_interface.py
##############################################################
TESTS
ERROR: A fatal error
WARNING:root:Needed to restart the rpc server
ERROR: A fatal error
WARNING:root:Needed to restart the rpc server
##############################################################
##############################################################
Raster map exists benchmark
Time to call 1000 functions directly: 0.017043s
Time to call 1000 functions via RPC: 0.178600s
Time to perform 1000 g.findfile module runs: 30.343877s
##############################################################
##############################################################
Raster map info benchmark
Time to call 10000 functions directly: 0.856104s
Time to call 10000 functions via RPC: 7.189188s
Time to perform 10000 r.info module runs: 120.261683s
##############################################################
}}}
As you can see, for the two tested functions the RPC interface is about 8
to 10 times slower than the direct Python function calls that wrap the
GRASS C-functions. But the RPC interface is about 17 to 170 times faster
than using the grass.script interface that calls GRASS modules (g.findfile
and r.info).
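The ratios can be recomputed from the timings printed in the benchmark
output above. The snippet below only does the arithmetic; the numbers are
copied from that single run, and the dictionary keys are just labels chosen
for this example.

```python
# Timings in seconds, copied from the benchmark output of one run.
timings = {
    "raster map exists (1000 calls)": {
        "direct": 0.017043, "rpc": 0.178600, "module": 30.343877,
    },
    "raster map info (10000 calls)": {
        "direct": 0.856104, "rpc": 7.189188, "module": 120.261683,
    },
}


def ratios(t):
    """Return (RPC slowdown vs. direct call, RPC speedup vs. module run)."""
    return t["rpc"] / t["direct"], t["module"] / t["rpc"]


results = {name: ratios(t) for name, t in timings.items()}

for name, (slowdown, speedup) in results.items():
    print("%s: RPC is %.1fx slower than direct calls, "
          "%.1fx faster than module runs" % (name, slowdown, speedup))
```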
> > The pyGRASS interface is well designed for module programming, not for
persistent applications. Otherwise each C-function call would have to be
handled via RPC. From my point of view, and according to some tests that I
made, the RPC approach slows the processing down significantly.
>
> So, in the next step, we need a pyGRASS-like interface which is fail-safe
and a temporal library which is faster?
So you want all C-function calls in PyGRASS to be wrapped using the RPC
interface?
> > Running the script will show that calling functions from a dict is 50
times faster than using a pipe with a subprocess.
>
> It seems that this is something we need to take care of. And this is
something that my factory pattern suggestion is trying to address.
>
> We would need to create two sets of classes with an identical interface:
one using `RPCServer` (safe), the other calling ctypes directly (fast).
Objects should be created by a factory, so that the factory puts the
`RPCServer` into the objects and the user does not have to take care of it.
Maybe in Python we can go beyond the classic factory pattern and also
create the `RPCServer`-dependent classes from the classes calling ctypes
directly.
>
> I realize that ''some code'' would be appreciated but I cannot dive into
this more now.
I am not sure if I understand your approach, so code examples would be
very helpful here. :)
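For the sake of discussion, one possible reading of the factory suggestion
could look like the sketch below. This is only a guess at what is meant:
all names (`DirectRaster`, `make_safe_class`, `RasterFactory`,
`InProcessRPC`) are hypothetical, `raster_exists` stands in for a ctypes
call, and the RPC object is a placeholder that runs in the same process
instead of using a pipe.

```python
def raster_exists(name):
    """Stand-in for a ctypes call into the C library (hypothetical)."""
    return name in ("elevation", "slope")


class DirectRaster:
    """Fast variant: calls the library function in the current process."""

    def __init__(self, name):
        self.name = name

    def _invoke(self, func, *args):
        return func(*args)

    def exists(self):
        return self._invoke(raster_exists, self.name)


class InProcessRPC:
    """Placeholder for a real RPC server; it only records what was called."""

    def __init__(self):
        self.calls = []

    def call(self, func, *args):
        self.calls.append(func.__name__)
        return func(*args)


def make_safe_class(direct_cls):
    """Derive the RPC-backed class from the direct one by overriding _invoke."""

    def _invoke(self, func, *args):
        return self.rpc.call(func, *args)

    return type("Safe" + direct_cls.__name__, (direct_cls,), {"_invoke": _invoke})


class RasterFactory:
    """Creates fast objects without an RPC server, safe ones with it."""

    def __init__(self, rpc=None):
        self.rpc = rpc
        self._safe_cls = make_safe_class(DirectRaster)

    def create(self, name):
        if self.rpc is None:
            return DirectRaster(name)
        obj = self._safe_cls(name)
        obj.rpc = self.rpc  # the factory puts the RPC server into the object
        return obj
```

Both kinds of objects expose the same `exists()` method, so calling code does
not need to know which variant it received.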
--
Ticket URL: <http://trac.osgeo.org/grass/ticket/2134#comment:10>
GRASS GIS <http://grass.osgeo.org>