[GRASS-dev] GRASS 6.3 on Cygwin

Glynn Clements glynn at gclements.plus.com
Sun Jul 15 20:30:08 EDT 2007


benjamin.ducke at ufg.uni-kiel.de wrote:

> Glynn and all of you interested in getting rid of the Win32 DBMI deadlock
> problem,

This message didn't make it to the list; I suspect that it exceeds the
upper limit on the size of a message.

> I have spent some time adding debugging statements to the parts that Glynn
> suggested. The result is that the problem is definitely in the XDR I/O
> handling. Basically, the problem is a premature EOF in DBMI driver/client
> communication via piped I/O streams.
> 
> There are just two situations where the program hangs and both of
> them occur in function xdrstdio_getlong() in xdr_stdio.c (part of the SUN
> RPC library). This is one of four functions in xdr_stdio.c which read and
> write data from/to the DBMI client and driver streams (called STREAM1 and
> STREAM2 in the debugging output) connected via the pipelines. I have added
> verbose debugging statements to all four functions showing you what is
> read/written, to what address in memory and the contents of that address
> before and after the I/O operation.

Some suggestions:

0. Separate the client/driver output; see below.

1. Omit the addresses; they don't tell you anything useful, and they
enlarge the output.

2. Write one record per line, with the output formatted to aligned
columns; this will make it easier to read. In particular, it should
make it easier to compare the client and driver logs.

3. For the get/put bytes operations, output the actual data (hex will
probably be simplest and most useful).

4. Swap the 2nd and 3rd arguments (size+count) of the fread/fwrite
calls; i.e. rather than reading a long as 1 4-byte record, read it as
4 1-byte records. This has the advantage that in the event of a short
count, you'll find out exactly how many bytes were read.
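
As a rough sketch of suggestions 1-4 combined (the names log_record()
and logged_get() are made up for illustration; they are not part of
xdr_stdio.c or GRASS), something along these lines would give one
aligned record per line, with hex data and no addresses:

```c
#include <stdio.h>

/* Hypothetical helper: write one record per line in aligned columns:
 * operation, bytes wanted, bytes obtained, then the data in hex.
 * No memory addresses are logged. */
static void log_record(FILE *log, const char *op, size_t want, size_t got,
                       const unsigned char *buf)
{
    size_t i;

    fprintf(log, "%-8s want=%3lu got=%3lu data=",
            op, (unsigned long) want, (unsigned long) got);
    for (i = 0; i < got; i++)
        fprintf(log, "%02x", buf[i]);
    fputc('\n', log);
}

/* Hypothetical wrapper: read `want` bytes as `want` 1-byte records
 * (size = 1, count = want), so that the return value is the number of
 * bytes actually read, even when the read comes up short. */
static size_t logged_get(FILE *in, FILE *log, const char *op,
                         unsigned char *buf, size_t want)
{
    size_t got = fread(buf, 1, want, in);

    log_record(log, op, want, got, buf);
    return got;
}
```

With only 3 bytes left in the stream, logged_get(in, log, "GETLONG",
buf, 4) returns 3 and logs those three bytes, whereas fread(buf, 4, 1,
in) would just return 0.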

> If you take a look at the 10 logfiles that I have attached to this email,
> you will see the two deadlock contexts. The more common one looks like
> this:
> 
> GETBYTES: STREAM 2 at addr (2013505168).
>          addr (3654160) length = 3.
>          stream count was: -19281 Bytes.
>          --> OK.
>          stream count is now: -19284 Bytes.
> GETLONG: STREAM 2 at addr (2013505168).
>          put to addr (2293516); current val = 2013505200.
>          stream count was: -19284 Bytes.
>          --> FAILED: count = 0. EOF.
> dbmi: Protocol error
> 
> (the "dbmi: Protocol error" occurs after aborting with CTRL+C).
> 
> It is always the same: some caller tries to get a long value from stream 2
> but finds the stream empty (EOF).

Actually, this isn't necessarily EOF. Because you're always reading 1
record (of however many bytes), a return value of zero could mean that
there were too few bytes rather than none at all.
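
To tell the cases apart, something like the following might do. This
is only a sketch: read_exact() and the read_status values are invented
names, not anything in the RPC library. It relies on fread() with
size 1 returning the byte count, plus feof()/ferror() semantics:

```c
#include <stdio.h>

/* Hypothetical classification of a read attempt. */
enum read_status {
    READ_OK,     /* all requested bytes were read          */
    READ_EOF,    /* no bytes at all: genuine end-of-file   */
    READ_SHORT,  /* some bytes, but fewer than requested   */
    READ_ERROR   /* the stream's error indicator is set    */
};

/* Read `want` bytes as 1-byte records and report how many arrived.
 * fread() only returns fewer items than requested on EOF or error,
 * so a short count without an error means the stream ended early. */
static enum read_status read_exact(FILE *fp, void *buf, size_t want,
                                   size_t *got)
{
    *got = fread(buf, 1, want, fp);

    if (*got == want)
        return READ_OK;
    if (ferror(fp))
        return READ_ERROR;
    return (*got == 0) ? READ_EOF : READ_SHORT;
}
```

A READ_SHORT result on a GETLONG would indicate that the other side
wrote (or the pipe delivered) a partial value, which is a different
problem from the stream simply being empty.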

> The function xdrstdio_getlong() returns
> a FALSE value in this case. But this does not seem to be handled in any
> way other than sending the program to sleep.
> 
> The other, much less frequent one looks like this:
> 
> PUTLONG: STREAM 1 at addr (2013505200).
>          get from addr (2293456); current val = 0.
>          stream count was: 33852 Bytes.
>          --> OK.
>          new val = 2293332 written to stream.
>          stream count is now: 33856 Bytes.
> GETLONG: STREAM 2 at addr (2013505168).
>          put to addr (2293568); current val = 206.
>          stream count was: -7448 Bytes.
>          --> FAILED: count = 0. EOF.
> 
> Very similar (see log-03.txt).
> 
> Now, I have added some debugging code to xdrprocedure.c (in GRASS
> lib/db/dbmibase)
> to see what the caller contexts are, but it's hard to make sense of the
> output as it does not seem to be in sync with XDR I/O.

It looks as if both the client and driver are logging to the same log
file. AFAIK, the driver will inherit the client's stderr, so you'll
end up with the client's and server's output interleaved.

I suggest explicitly opening a separate log file with a unique name. 
Use setbuf(fp, NULL) to disable buffering, to ensure that you don't
lose the tail end of the file if the process terminates abnormally.
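
For example (a sketch only; the file name pattern and the function
names are my own invention, and I'm assuming getpid() is available, as
it is under Cygwin):

```c
#include <stdio.h>
#include <unistd.h>

/* Hypothetical helper: build a per-process log file name such as
 * "dbmi-client-1234.log". The role string ("client" or "driver") and
 * the PID together keep the two processes' logs apart. */
static void debug_log_name(char *buf, size_t size, const char *role)
{
    snprintf(buf, size, "dbmi-%s-%ld.log", role, (long) getpid());
}

/* Hypothetical helper: open the log and disable stdio buffering, so
 * that every record reaches the file as soon as it is written, even
 * if the process is later killed. Returns NULL if fopen() fails. */
static FILE *open_debug_log(const char *role)
{
    char name[256];
    FILE *fp;

    debug_log_name(name, sizeof(name), role);
    fp = fopen(name, "w");
    if (fp != NULL)
        setbuf(fp, NULL);
    return fp;
}
```

The client would call open_debug_log("client") once and the driver
open_debug_log("driver"), with the XDR debugging output going to the
returned stream rather than to stderr.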

In order to understand what's going on, it's necessary to be able to
match the client's writes with the driver's reads, and vice-versa.

> However, if you check
> the full log output, you will always find the values "203" and "206"
> in the streams which indicate that the latest callers were (see GRASS
> include/dbmi.h):
> 
> DB_PROC_FETCH			203
> DB_PROC_OPEN_SELECT_CURSOR	206
> 
> This corresponds to the debugging output from xdrprocedure.c.
> So I also added some debug statements to c_fetch.c (in GRASS
> lib/db/dbmi_client).

I don't see anything obviously wrong with those procedures (at least
in the case where no errors occur; if there's an error, you cannot
recover from it).

> Re. my debugging code: you will notice that the count for the stream size
> sometimes goes < 0. I don't know if that's because I am missing some
> writes or adding wrong size counts.

The get operations only ever subtract from the count, while the put
operations only ever add to it. IOW, the count will always be <= 0 for
a read stream and >= 0 for a write stream. Each process (client,
driver) has one read stream and one write stream; there are no
bi-directional streams.
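
In other words, assuming the debugging code keeps one signed counter
per stream (the struct and function names below are illustrative, not
taken from the attached sources), the bookkeeping amounts to:

```c
/* Hypothetical per-stream byte counter, starting at zero. */
struct stream_count {
    long bytes;
};

/* A get (read) of n bytes only ever subtracts from the count. */
static void count_get(struct stream_count *c, long n)
{
    c->bytes -= n;
}

/* A put (write) of n bytes only ever adds to the count. */
static void count_put(struct stream_count *c, long n)
{
    c->bytes += n;
}
```

So the 3-byte GETBYTES in the log takes the read stream from -19281 to
-19284, and the 4-byte PUTLONG takes the write stream from 33852 to
33856. A negative count on a read stream is expected, not a sign of
missing writes.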

> Anyway, I have attached the sources with my debugging codes so you can judge
> for yourselves if the output makes any sense.
> 
> Any ideas how to proceed from here?

Separate the client and driver output, and re-format it as described
above, so that it's straightforward to match the client's writes with
the driver's reads, and vice-versa.

-- 
Glynn Clements <glynn at gclements.plus.com>



