[GRASS-dev] Some progress on Win32 attribute deadlock problem

Benjamin Ducke benjamin.ducke at ufg.uni-kiel.de
Mon Jul 9 13:01:43 EDT 2007


OK, I tried some of the suggestions Glynn made below:
> 
> I'm referring to the situation where read/write return a short count,
> e.g. "write(fd, buf, count) < count". But, I've remembered that XDR
> uses stdio rather than POSIX I/O, and I don't think that fread/fwrite
> can return a short count (except for EOF).
> 
> According to the MSVCRT documentation, the O_NOINHERIT flag should be
> used when using _dup2(), e.g.:
> 
> 	_pipe(p1, 250000, _O_BINARY|_O_NOINHERIT)
> 
> Another suggestion: try changing the size passed to the _pipe()
> function in dbmi_client/start.c. If that affects the tendency to
> deadlock, it strongly suggests that the issue is related to the way
> that a full pipe is handled.
> 
> Beyond that, the only thing which I can suggest is to instrument the
> XDR code with debug code to log all I/O operations (including the data
> which is read/written).
> 

After hundreds of test runs with different Windows versions, these
are my conclusions:

The problem has to do with the pipe mechanism in Windows.
I tried changing the pipe size as suggested, using extremely small (25)
and extremely large (250000000) values. On Windows 2000, with
the very small value, no module run makes it past 33 percent. So there
is a clear correlation. As soon as I set it to some "sane" value
(at least 25000), I get the same situation as before: ca. 4-6 out of
50 runs complete. Increasing the value from there makes no difference;
the variations are always within measurement noise.

This is no surprise, since the comment in dbmi_client/start.c states
that the pipe buffer value is not directly related to the pipe size.
Apparently, Windows chooses some fixed value as soon as the size
is greater than some threshold. The same thing happens when I set
the size to "0".

However, the fact that I can effectively block the piping with
very small values leads me to believe that this is, as Glynn
suggests, the source of the trouble:
A full pipe gets stuck and no process ever takes anything out of
it to make some room, so the next bit of data cannot be pushed into
it. Puller waits for pusher, pusher never pushes, because nothing
gets pulled = deadlock. (I think...)
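
To make the "full pipe" condition concrete, here is a minimal sketch.
It is an assumption-laden illustration, not the GRASS code: it uses
POSIX pipe() instead of the Windows _pipe() call in
dbmi_client/start.c, and the helper name fill_pipe_until_full() is
made up for this example. The write end is set to non-blocking so
that the sketch reports the stall instead of hanging the way a
blocking writer would.

```c
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper (not part of GRASS): write into a pipe that nobody
 * reads from and return how many bytes fit before the pipe is full.
 * Returns -1 on any unexpected error. */
long fill_pipe_until_full(void)
{
    int fds[2];
    char chunk[512] = {0};
    long total = 0;
    int flags;

    if (pipe(fds) != 0)
        return -1;

    /* Non-blocking write end: a full pipe yields EAGAIN instead of a hang. */
    flags = fcntl(fds[1], F_GETFL);
    if (flags == -1 || fcntl(fds[1], F_SETFL, flags | O_NONBLOCK) == -1) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }

    for (;;) {
        ssize_t n = write(fds[1], chunk, sizeof chunk);
        if (n > 0) {
            total += n;          /* data accepted, pipe not yet full */
            continue;
        }
        if (n == -1 && errno == EINTR)
            continue;            /* interrupted, simply retry */
        if (!(n == -1 && errno == EAGAIN))
            total = -1;          /* something other than "pipe full" */
        break;                   /* EAGAIN: a blocking writer would wait here */
    }

    close(fds[0]);
    close(fds[1]);
    return total;
}
```

The returned byte count is the point at which a blocking writer would
stop making progress until the other side reads something. If both
processes reach that point at the same time, each waiting for the
other, neither can continue.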

Another thing makes me believe that Windows itself is the culprit
here: I tested the same stuff on a Windows XP SP2 system, clean
install from scratch. On this system, almost all the runs (97%)
finished cleanly!

Apparently MS made some improvements to inter-process communication
in that release ...

Setting the _O_NOINHERIT flag makes no difference.

So, how do we proceed from here?

Best,

Benjamin


-- 
Benjamin Ducke, M.A.
Archäoinformatik
(Archaeoinformation Science)
Institut für Ur- und Frühgeschichte
(Inst. of Prehistoric and Historic Archaeology)
Christian-Albrechts-Universität zu Kiel
Johanna-Mestorf-Straße 2-6
D 24098 Kiel
Germany

Tel.: ++49 (0)431 880-3378 / -3379
Fax : ++49 (0)431 880-7300
www.uni-kiel.de/ufg



