[GRASS-dev] native WinGRASS and attribute data deadlock, next try

Glynn Clements glynn at gclements.plus.com
Wed Sep 5 12:04:36 EDT 2007


Moritz Lennert wrote:

> >  >> So, how are we going to go ahead?
> >  >
> >
> > Glynn answered:
> >
> >  > Figure out how to debug the processes. If you can't get gdb to work, I
> >  > can only suggest logging every significant event at the lowest level,
> >  > i.e. log every read/write operation: the arguments, the return code,
> >  > and the complete data (i.e. the buffer contents before read and after
> >  > write). This is all done in the RPC/XDR library, in xdr_stdio.c. It
> >  > will probably help to also log the beginning/end of each procedure
> >  > call (i.e. lib/db/dbmi_base/xdrprocedure.c).
> >
> > I would really like this to be solved, so I am willing to try to find
> > some time to do the logging effort. Benjamin, have you advanced on this?
> >
> > I will need some time understanding the xdr logic and code, but hope to
> > be able to help with this.
> >
> 
> Ok, a very first simple debugging effort seems to confirm a timing issue.
> Here's what I did:
> 
> diff -u dbmi_base dbmi_base_debug/
> Common subdirectories: dbmi_base/CVS and dbmi_base_debug/CVS
> Only in dbmi_base_debug/: OBJ.i686-pc-mingw32
> diff -u dbmi_base/xdrint.c dbmi_base_debug/xdrint.c
> --- dbmi_base/xdrint.c  Thu Oct  5 06:13:28 2006
> +++ dbmi_base_debug/xdrint.c    Mon Sep  3 20:17:35 2007
> @@ -10,10 +10,12 @@
> 
>      stat = DB_OK;
> 
> +    G_debug(1, "xdrint.c: Begin send");
>      xdr_begin_send (&xdrs);
>      if(!xdr_int (&xdrs, &n))
>         stat = DB_PROTOCOL_ERR;
>      xdr_end_send (&xdrs);
> +    G_debug(1, "xdrint.c: End send");
> 
>      if (stat == DB_PROTOCOL_ERR)
>         db_protocol_error();
> diff -u dbmi_base/xdrprocedure.c dbmi_base_debug/xdrprocedure.c
> --- dbmi_base/xdrprocedure.c    Thu Oct  5 06:13:28 2006
> +++ dbmi_base_debug/xdrprocedure.c      Mon Sep  3 20:17:35 2007
> @@ -40,10 +40,12 @@
> 
>      stat = DB_OK;
> 
> +    G_debug(1, "xdrprocedure.c: Begin receive");
>      xdr_begin_recv (&xdrs);
>      if(!xdr_int (&xdrs, n))
>         stat = DB_EOF;
>      xdr_end_recv (&xdrs);
> +    G_debug(1, "xdrprocedure.c: End receive");
> 
>      return stat;
>  }
> 
> and now, after setting 'g.gisenv set=DEBUG=1', I cannot reproduce the
> deadlock anymore, using Benjamin's test data, except when I do other
> things on the machine (open other windows, type an email, etc). When I
> just run the command and stare at the screen I get no deadlock. With
> DEBUG=0 I get the same irregular deadlock.
> 
> I'll dig into xdr_stdio.c now.

If it is a timing issue, then you'll need to log the fread/fwrite
return values, along with the actual data, for both ends (client and
server).
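
As a rough illustration of that kind of logging, here is a minimal sketch of
wrappers around fread/fwrite. The names logged_fread, logged_fwrite and log_io
are hypothetical (they are not part of GRASS or the XDR library), and the log
stream is assumed to be a separate file per process, opened by the caller:

```c
#include <stdio.h>
#include <errno.h>

/* Hypothetical helper: write one I/O event to the log as a single line
   containing the requested size, the actual count, errno, and a hex dump
   of the bytes actually transferred. */
static void log_io(FILE *log, const char *op, const void *buf,
                   size_t want, size_t got, int err)
{
    const unsigned char *p = buf;
    size_t i;

    /* %lu with a cast rather than %zu, which older MSVCRT lacks */
    fprintf(log, "%s want=%lu got=%lu errno=%d data=", op,
            (unsigned long) want, (unsigned long) got, err);
    for (i = 0; i < got; i++)
        fprintf(log, "%02x", (unsigned int) p[i]);
    fputc('\n', log);
    fflush(log);                /* keep the log usable if the process hangs */
}

/* Hypothetical wrapper: fwrite n bytes, logging the result. */
size_t logged_fwrite(const void *buf, size_t n, FILE *fp, FILE *log)
{
    size_t got;

    errno = 0;
    got = fwrite(buf, 1, n, fp);
    log_io(log, "write", buf, n, got, errno);
    return got;
}

/* Hypothetical wrapper: fread up to n bytes, logging what arrived. */
size_t logged_fread(void *buf, size_t n, FILE *fp, FILE *log)
{
    size_t got;

    errno = 0;
    got = fread(buf, 1, n, fp);
    log_io(log, "read", buf, n, got, errno);
    return got;
}
```

Flushing after every event costs time, so it may itself perturb a timing
problem, but it means the last line in each log shows how far that process
got before the deadlock.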

Then, find out where the two diverge (i.e. what is received isn't what
was sent).
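
One way to locate that point, sketched under the assumption that each side's
raw byte stream has been dumped to its own file (the function name
first_divergence is made up for this example):

```c
#include <stdio.h>

/* Hypothetical helper: compare two streams byte by byte.
   Returns the offset of the first byte at which they differ (including
   the case where one stream ends before the other), or -1 if both have
   identical contents up to EOF. */
long first_divergence(FILE *sent, FILE *received)
{
    long off = 0;

    for (;;) {
        int a = fgetc(sent);
        int b = fgetc(received);

        if (a != b)
            return off;         /* mismatch, or one side ended early */
        if (a == EOF)
            return -1;          /* both ended together: no divergence */
        off++;
    }
}
```

The dump files should be opened in binary mode ("rb"/"wb"); on Windows,
text-mode stdio translates line endings, which would show up as spurious
differences.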

However, I have a strong suspicion that the eventual answer will be
"MSVCRT's stdio implementation sucks". I already know this to be true;
what I don't know is whether it's the cause of the DBMI problems and,
if so, how much stuff we will need to re-write.

-- 
Glynn Clements <glynn at gclements.plus.com>



