[GRASS-dev] Re: what if: a new GRASS directory layout?

Glynn Clements glynn at gclements.plus.com
Sat Apr 12 16:48:39 EDT 2008


Ivan Shmakov wrote:

>  > If you use NFS, either you ensure that locking works, or you
>  > essentially give up the ability to safely perform concurrent file
>  > access; this problem isn't limited to GRASS.
> 
> 	As I've noted before, the inventory scheme allows lock-less
> 	read-only access over NFS in the presence of concurrent
> 	writes, but without GC.
> 
> 	Unfortunately, enabling the locking protocol seems to impact
> 	reliability.  In particular, I wonder whether there could be a
> 	way for the server to know whether the client that acquired the
> 	lock is still alive.  This seems to reduce to the Two Generals'
> 	Problem.

By "client", do you mean the host or the process? When the process
terminates, any files which it has open will be closed, and any locks
will be released. This is handled by the kernel.
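
A minimal sketch of that behaviour, in Python for brevity (its
fcntl.lockf() is a wrapper around the fcntl() locking calls). The
function name is hypothetical and this is not GRASS code; it assumes a
local POSIX filesystem. A child takes an exclusive lock and exits
without unlocking, and the parent can then take the lock immediately:

```python
import fcntl
import os


def lock_released_after_exit(path):
    """Return True if a lock left held by an exited child can be re-acquired."""
    pid = os.fork()
    if pid == 0:
        # Child: take an exclusive lock, then exit without unlocking or closing.
        try:
            fd = os.open(path, os.O_RDWR)
            fcntl.lockf(fd, fcntl.LOCK_EX)
        finally:
            os._exit(0)
    os.waitpid(pid, 0)  # the child has terminated by the time this returns

    fd = os.open(path, os.O_RDWR)
    try:
        # Non-blocking attempt: succeeds only if the child's lock is gone.
        fcntl.lockf(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
        return True
    except OSError:
        return False
    finally:
        os.close(fd)  # closing the descriptor also drops our own lock
```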

If a host which is an NFS client dies or loses network connectivity,
the server won't know until a timeout occurs. That should be a rare
occurrence, though.

>  >>> The inventory system can prevent inconsistencies without any
>  >>> locking. However, there's still the potential for updates to be
>  >>> discarded if you don't have locking. I.e. if two processes
>  >>> attempt to update a module concurrently, the update which
>  >>> completes first will be lost.
> 
>  >> Agreed.  Though I'm not sure that the only reasonable way to prevent
>  >> this is to make any concurrent write access fail.
> 
>  > I'm not suggesting having concurrent access fail, just waiting until
>  > the operation is safe (e.g. F_SETLKW rather than F_SETLK).
> 
>  > If modules are structured correctly (i.e. all support files are read
>  > in a burst before starting to read the data, or written in a burst
>  > after writing the data), both the probability of contention and the
>  > resulting delay will be minimal.
> 
>  > In the case of multiple concurrent writes, you're bound to lose one
>  > of the two writes whichever mechanism is used.
> 
> 	I don't understand how the schemes being discussed are different
> 	with respect to this point?

They aren't.

The only way that this issue can be prevented is if each module is
treated as a single transaction. In practice, it shouldn't be that
much of a problem; if two processes are trying to replace the same map
with competing versions, that's really an organisational issue, not a
technical issue.

> 	The only way not to lose the data
> 	just computed will be to make the attempt to open the target
> 	GRASS object for write fail in the presence of a concurrent
> 	process.

Yes; IOW, the transaction covers the entire process, not just the
final step where the existing map is replaced with an updated version.

Whether or not this is even desirable is open to debate. And if you do
want to prevent this, do you want to wait until the first process
completes (which could be a long time if the process is e.g. r.flow),
or just make the second process fail with a "busy" error?
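
With POSIX advisory locks, the choice between waiting and failing is
just a flag. A rough sketch in Python, where fcntl.lockf() wraps the
fcntl() locking calls (blocking corresponds to F_SETLKW, LOCK_NB to
F_SETLK). The helper names are hypothetical and nothing here is
existing GRASS code:

```python
import errno
import fcntl


def lock_or_wait(fd):
    """Block until the exclusive lock is granted (the F_SETLKW behaviour)."""
    fcntl.lockf(fd, fcntl.LOCK_EX)


def lock_or_fail(fd):
    """Try once; return False if another process holds the lock ("busy")."""
    try:
        fcntl.lockf(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
        return True
    except OSError as e:
        # POSIX allows either errno when the lock is held elsewhere.
        if e.errno in (errno.EACCES, errno.EAGAIN):
            return False
        raise


def unlock(fd):
    """Release the lock explicitly; closing fd would also release it."""
    fcntl.lockf(fd, fcntl.LOCK_UN)
```

For the whole process to be one transaction, the lock would have to be
taken before the computation starts and held until the map has been
replaced, so lock_or_wait() could block for the full run time of the
first process.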

I'm only interested in making individual read and write operations
into atomic transactions, so that a reader which opens a map while it
is being replaced won't see an intermediate state.
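
One common way to get that property for a single file, sketched in
Python under the assumption of a local POSIX filesystem where rename()
is atomic: write the new version to a temporary file in the same
directory, then rename it over the old one. The function name is
hypothetical, and this is neither the inventory scheme nor what GRASS
currently does:

```python
import os
import tempfile


def replace_atomically(path, data):
    """Replace the file at `path` so readers see the old or the new bytes."""
    directory = os.path.dirname(os.path.abspath(path))
    # The temporary file must be on the same filesystem as the target.
    fd, tmp = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(data)
            f.flush()
            os.fsync(f.fileno())  # push the data to disk before the rename
        os.replace(tmp, path)  # rename(): the switch-over is atomic
    except BaseException:
        # Clean up the temporary file if anything failed before the rename.
        try:
            os.unlink(tmp)
        except FileNotFoundError:
            pass
        raise
```

A reader which already has the old file open keeps reading the old
contents; one which opens the path after the rename gets the new
contents. This covers one file only; a map made of several files needs
something more.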

>  >> FWIW, the appropriate locking to a writable mapset could be
>  >> implemented without fcntl ().
> 
>  > There are other ways to do it (e.g. lock files), but the OS' locking
>  > primitives are the simplest to use (no chance of stale lock files, no
>  > hacks to deal with NFS' non-atomicity, built-in deadlock detection,
>  > etc).
> 
> 	Yes.
> 
> 	From the discussion, I could conclude that the inventory scheme
> 	doesn't necessarily imply any better concurrency (other than by
> 	allowing certain ways of lock-less access), nor does it prevent
> 	concurrency from being improved (by means of OS locking).
> 
> 	Still, I believe the inventory scheme to be more flexible and to
> 	allow for both a cleaner, more extensible API and a cleaner
> 	implementation.

The main disadvantage is that it makes it harder for the user to
analyse or modify the database manually (which is sometimes necessary).

-- 
Glynn Clements <glynn at gclements.plus.com>

