[GRASS-dev] Re: what if: a new GRASS directory layout?

Ivan Shmakov ivan at theory.asu.ru
Tue Apr 8 13:53:58 EDT 2008


>>>>> Glynn Clements <glynn at gclements.plus.com> writes:

 >>> Full transaction support would be nice, but I don't know if it's
 >>> worth the substantial effort involved in implementing it.

 >> BTW, what do you mean by full transaction support here?

 > Actually, I was referring specifically to atomicity, i.e. being able
 > to replace a single map atomically, with no interval where the map
 > doesn't exist or is in an inconsistent state.

	ACK.
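
	For the single-map case, a minimal sketch of such a replacement,
	assuming the map's inventory is an ordinary file on a POSIX
	filesystem.  The helper name replace_inventory is hypothetical
	and not part of GRASS.

```python
import os
import tempfile


def replace_inventory(path, data):
    """Atomically replace the file at `path` with `data` (bytes)."""
    directory = os.path.dirname(os.path.abspath(path))
    # The temporary file must live on the same filesystem as the target,
    # otherwise the final rename cannot be atomic.
    fd, tmp = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(data)
            f.flush()
            os.fsync(f.fileno())
        # rename(2) semantics on POSIX: readers see either the old file
        # or the new one, never a missing or half-written file.
        os.replace(tmp, path)
    except BaseException:
        try:
            os.unlink(tmp)
        except FileNotFoundError:
            pass
        raise
```

	This only covers one file; updating several maps at once would
	still need the mapset-wide inventory mentioned below.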

 > I suppose that there are cases where it might be useful to be able to
 > update multiple maps atomically, but that's even more work (you would
 > need an inventory for the entire mapset, not just for individual
 > maps).

	Since the scheme I've suggested reduces a raster to a structured
	inventory and a set of uniquely-named binary objects, it looks
	RDBMS-friendly as well.  I wish I had the time to explore
	putting some GRASS rasters into a PostgreSQL database!

 >>> In any case, all of these mechanisms would require substantial
 >>> changes to a large number of modules.

 >> I'll try to check whether I could make the new layout available
 >> under the old API as well.

 > The main problem is that a module reads a map via several calls. All
 > of those calls must see the same map.

 > I initially thought that you would need some form of locking, to
 > prevent the map from being replaced in the middle of the sequence.
 > However, you could achieve the same result by caching the inventory
 > within the module, but the code which garbage-collects unreferenced
 > elements would need to allow for this.

	Yes.  The need for GC seems to be the weakest point of this
	scheme.

	On Unix, as a first approximation, I'd just open () every binary
	object that's referenced by the inventory being processed.  This
	way, even if the file loses its name, it would be available to
	the program.
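
	A rough sketch of that first approximation, assuming a Unix
	system and that the inventory has already been parsed into a
	list of object paths.  The names open_objects and read_object
	are hypothetical.

```python
import os


def open_objects(paths):
    """Open every binary object up front; return a {path: fd} mapping."""
    fds = {}
    try:
        for p in paths:
            fds[p] = os.open(p, os.O_RDONLY)
    except OSError:
        # Do not leak descriptors if one of the objects is already gone.
        for fd in fds.values():
            os.close(fd)
        raise
    return fds


def read_object(fd, size, offset=0):
    """Read from an already opened object, even if it has been unlinked."""
    return os.pread(fd, size, offset)
```

	Once open_objects () returns, a concurrent GC may unlink the
	files; the data stays readable through the descriptors until
	they are closed.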

	In general, every binary object would need a list of references.
	Maintaining a list of names of referencing rasters shouldn't be
	too hard to implement.  By contrast, a list of PIDs (to
	allow for a raster to be referenced by a process) looks a bit
	fragile.
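
	The name-based variant could look roughly like the following
	in-memory sketch.  The class ObjectStore and its methods are
	hypothetical, and a real implementation would have to keep the
	lists on disk and update them together with the inventory.

```python
class ObjectStore:
    """Tracks, for each binary object, the rasters that reference it."""

    def __init__(self):
        self.refs = {}  # object id -> set of referencing raster names

    def set_inventory(self, raster, object_ids):
        """Record that `raster` now references exactly `object_ids`."""
        new = set(object_ids)
        # Drop the raster from objects it no longer references.
        for oid, names in self.refs.items():
            if oid not in new:
                names.discard(raster)
        # Add it to every object in the new inventory.
        for oid in new:
            self.refs.setdefault(oid, set()).add(raster)

    def collect(self):
        """Forget and return the ids of objects with no referencing raster."""
        dead = sorted(oid for oid, names in self.refs.items() if not names)
        for oid in dead:
            del self.refs[oid]
        return dead
```

	Note that this says nothing about running processes, which is
	exactly the part that a list of PIDs would have to cover.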

 > Even there, you could run into problems where a module invokes
 > another module; the child would need to use the same version of the
 > map as the parent.

	Agreed.

	Actually, the only proper solution to this problem that I know
	is moving the whole computation chain into a ``parallel
	existence'' -- forking a separate copy-on-write location at the
	beginning of a ``transaction block'', and merging it back when
	it's done.  And while I hope that something like this will
	eventually be available in GRASS, I probably wouldn't say that
	the current code base is anywhere near that.


