[GRASS-dev] Re: what if: a new GRASS directory layout?

Ivan Shmakov ivan at theory.asu.ru
Sat Apr 12 13:09:22 EDT 2008


>>>>> Glynn Clements <glynn at gclements.plus.com> writes:

 >>> That can be handled by locking, and it can probably be done without
 >>> significant changes to the API.

 >> Oh, I see that you mean fcntl () locks here.  But weren't there
 >> issues with them on certain OSes (W32) and filesystems (NFS, in
 >> particular with -o nolock)?

 > Win32 has locking primitives.

	ACK.

 > And NFS loses for more reasons than just locking, e.g. rename() isn't
 > guaranteed to be atomic. It is for good reason that NFS is widely
 > believed to stand for "Not a File System" (as it violates a
 > significant portion of POSIX).

 > If you use NFS, either you ensure that locking works, or you
 > essentially give up the ability to safely perform concurrent file
 > access; this problem isn't limited to GRASS.
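
	For reference, the atomicity of rename () mentioned above is
	what the usual write-then-rename idiom relies upon.  Below is
	a minimal sketch, assuming a local POSIX filesystem; the
	function name atomic_write is hypothetical, and over NFS the
	guarantee may not hold, as noted.

```python
import os
import tempfile


def atomic_write(path, data):
    """Replace `path` with `data` so that readers see either the old
    or the new contents, never a partially written file.

    Assumption: the temporary file lives in the same directory (and
    hence the same filesystem) as the target, so rename() applies.
    """
    dirname = os.path.dirname(os.path.abspath(path))
    fd, tmp = tempfile.mkstemp(dir=dirname)
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(data)
            f.flush()
            os.fsync(f.fileno())   # push the data to disk before renaming
        os.replace(tmp, path)      # rename(2) underneath on POSIX
    except BaseException:
        # The rename did not happen; do not leave the temporary behind.
        try:
            os.unlink(tmp)
        except FileNotFoundError:
            pass
        raise
```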

	As I've noted before, the inventory scheme allows lock-less
	read-only access over NFS in the presence of concurrent
	writes, but without GC.

	Unfortunately, enabling the locking protocol seems to hurt
	reliability.  In particular, I wonder whether there is any way
	for the server to know that the client that acquired a lock is
	still alive.  This seems to reduce to the two generals'
	problem.

 >>> The inventory system can prevent inconsistencies without any
 >>> locking. However, there's still the potential for updates to be
 >>> discarded if you don't have locking. I.e. if two processes attempt to
 >>> update a module concurrently, the update which completes first will
 >>> be lost.

 >> Agreed.  Though I'm not sure that the only reasonable way to prevent
 >> this is to make any concurrent write access to fail.

 > I'm not suggesting having concurrent access fail, just waiting until
 > the operation is safe (e.g. F_SETLKW rather than F_SETLK).
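
	To make the F_SETLKW / F_SETLK distinction concrete, here is a
	minimal sketch using Python's fcntl.lockf (), which wraps the
	fcntl () record locks.  The function name update_locked and
	its wait parameter are hypothetical; Unix only.

```python
import errno
import fcntl


def update_locked(path, data, wait=True):
    """Append `data` to `path` under an exclusive fcntl() lock.

    wait=True  blocks until the lock is granted (like F_SETLKW);
    wait=False fails at once if the lock is held (like F_SETLK).
    Returns True if the data was written, False if the lock was busy.
    """
    flags = fcntl.LOCK_EX if wait else fcntl.LOCK_EX | fcntl.LOCK_NB
    with open(path, "ab") as f:
        try:
            fcntl.lockf(f, flags)
        except OSError as e:
            # Only "lock is busy" in non-blocking mode maps to False.
            if not wait and e.errno in (errno.EACCES, errno.EAGAIN):
                return False
            raise
        try:
            f.write(data)
            f.flush()
        finally:
            fcntl.lockf(f, fcntl.LOCK_UN)
    return True
```

	These locks are held per process, so contention shows up only
	between different processes, not between two calls made from
	the same one.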

 > If modules are structured correctly (i.e. all support files are read
 > in a burst before starting to read the data, or written in a burst
 > after writing the data), both the probability of contention and the
 > resulting delay will be minimal.

 > In the case of multiple concurrent writes, you're bound to lose one
 > of the two writes whichever mechanism is used.

	I don't see how the schemes being discussed differ in this
	respect.  The only way not to lose the data just computed
	would be to make the attempt to open the target GRASS object
	for writing fail in the presence of a concurrent process.

	OTOH, a mechanism could be designed that would allow
	arbitrary code to be run when certain operations are requested
	on the database (creation, update and removal of maps, for
	example.)  With such a mechanism it may become possible to
	alter the name of the GRASS object being written instead of
	overwriting one that appeared in the middle of processing
	under the same name.

	Solving such a hard-to-imagine problem alone, though, would
	hardly justify the effort.
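
	Just to show the shape of the idea, such a hook mechanism
	could be sketched as below.  Everything here (the event name
	"create", the functions on and run_hooks, the numeric suffix
	convention) is made up for the illustration; no such API
	exists in GRASS.

```python
from collections import defaultdict

# Hypothetical registry: event name -> list of hook functions.
_hooks = defaultdict(list)


def on(event):
    """Decorator registering a hook for the given database event."""
    def register(fn):
        _hooks[event].append(fn)
        return fn
    return register


def run_hooks(event, name, existing):
    """Pass the requested object name through every hook in turn.

    `existing` is the set of object names already in the database;
    each hook returns the (possibly altered) name to use.
    """
    for fn in _hooks[event]:
        name = fn(name, existing)
    return name


@on("create")
def avoid_overwrite(name, existing):
    """Pick a fresh name instead of clobbering an existing object."""
    if name not in existing:
        return name
    n = 1
    while "%s.%d" % (name, n) in existing:
        n += 1
    return "%s.%d" % (name, n)
```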

[...]

 > IOW, ensuring individual read and write operations behave as an
 > atomic transaction. If you want a complete read-process-write
 > operation to behave as an atomic transaction, delays (or "busy"
 > errors) cannot be eliminated (and there's also the potential for
 > deadlock).

	Forking a ``parallel existence'' at the beginning of the
	processing apparently eliminates the possibility of deadlocks,
	though there certainly will be some problems with merging it
	back.  E.g., in PostgreSQL (at the serializable isolation
	level), a transaction is discarded with an error if it
	conflicts with changes already merged from other finished
	transactions by the time of the merge.
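
	The ``fork, then fail at merge time'' behaviour can be
	sketched as optimistic concurrency with per-key version
	counters.  This is a single-threaded toy with hypothetical
	names (Store, Txn, ConflictError), not how PostgreSQL
	implements it; a real commit would also have to be atomic in
	itself.

```python
class ConflictError(Exception):
    """Raised when a transaction conflicts with an already merged one."""


class Store:
    def __init__(self):
        self.data = {}      # key -> value
        self.version = {}   # key -> number of committed writes

    def begin(self):
        return Txn(self)


class Txn:
    def __init__(self, store):
        self.store = store
        self.seen = {}      # key -> version first observed by this txn
        self.writes = {}    # private, not yet merged changes

    def _observe(self, key):
        self.seen.setdefault(key, self.store.version.get(key, 0))

    def read(self, key):
        self._observe(key)
        return self.writes.get(key, self.store.data.get(key))

    def write(self, key, value):
        self._observe(key)
        self.writes[key] = value

    def commit(self):
        # Validate: nothing we touched may have changed in the meantime.
        for key, version in self.seen.items():
            if self.store.version.get(key, 0) != version:
                raise ConflictError(key)
        # Merge the private changes back.
        for key, value in self.writes.items():
            self.store.data[key] = value
            self.store.version[key] = self.store.version.get(key, 0) + 1
```

	No transaction ever waits for another one, hence no deadlock;
	the price is that the later of two conflicting transactions
	has to be redone.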

 >> FWIW, the appropriate locking to a writable mapset could be
 >> implemented without fcntl ().

 > There are other ways to do it (e.g. lock files), but the OS' locking
 > primitives are the simplest to use (no chance of stale lock files, no
 > hacks to deal with NFS' non-atomicity, built-in deadlock detection,
 > etc).
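
	For comparison, the lock file approach is usually built on
	exclusive creation (O_CREAT | O_EXCL).  A minimal sketch,
	with hypothetical function names try_lock and unlock, which
	also shows where the stale lock problem comes from:

```python
import os


def try_lock(path):
    """Try to take the lock by creating `path` exclusively.

    Returns True on success, False if the lock file already exists.
    If the holder crashes before calling unlock(), the file stays
    behind and every later attempt fails: a stale lock.
    """
    try:
        fd = os.open(path, os.O_CREAT | os.O_EXCL | os.O_WRONLY, 0o644)
    except FileExistsError:
        return False
    with os.fdopen(fd, "w") as f:
        f.write(str(os.getpid()))   # record the holder, for diagnostics
    return True


def unlock(path):
    """Release the lock by removing the lock file."""
    os.unlink(path)
```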

	Yes.

	From the discussion, I conclude that the inventory scheme
	doesn't necessarily imply any better concurrency (other than
	by allowing certain kinds of lock-less access), nor does it
	prevent concurrency from being improved (by means of OS
	locking.)

	Still, I believe the inventory scheme to be more flexible and
	to allow for both a cleaner, more extensible API and a cleaner
	implementation.

	Hopefully, I'll have the time to experiment with it.


