[GRASS5] Some news: KerGIS

Thierry Laronde tlaronde at polynum.com
Mon Jan 26 09:30:11 EST 2004


On Mon, Jan 26, 2004 at 11:02:45AM +0100, Radim Blazek wrote:
> On Saturday 24 January 2004 15:08, Thierry Laronde wrote:
> > In the mean time (since the work is mainly boring, I need to have some
> > more exciting coding stuff) I will start developping new applications
> > in the DB field (where I keep strictly nothing).
> 
> Can you concretize 'DB field'?

I will only sketch them, since not everything is carved in stone at the
moment. This is also a demonstration of the need for an overview (the
big picture).

In the following, when I speak about DB, it is meant in the sense common
nowadays, i.e. "data handled via RDBMSs". The "GIS Data Base" will be
called simply the GIS Data Repository: GDR.

I will take vector data as the example, since this is what I commonly
use.

At the moment, the embryo of the DB is, in fact, the attribute files.
The link between geometrical data and textual data was made via the
attributes (labels), these attributes being translated to text via
categories if somebody wanted.

So the DB was a one-record, one-field DB.

A typical use now, I think, is to put a distinct label on every
element in a layer (incrementally, for example) and to link this label
with a field in a DB.

Consequences and future directions:
	- The attributes are doomed to disappear, as well as the categories.

	- Since the index used tends (and it's logical) to identify an
	element uniquely, the future vector format MUST directly support an
	identifier, and the format must be made so that one can specify the
	size of the identifiers as a power-of-two number of tetrabytes (I
	follow Knuth's terminology: tetrabyte == 4 bytes, and a byte is
	8 bits), so that the format will be able to handle tremendous
	amounts of data [NOTE: not all the bits in the identifiers will be
	used for the identification since, for example, if the vector format
	is a topological one, then some bits in the most significant
	byte will be used to code the orientation of the element].

	- KerGIS/GRASS is not a vectorial package per se, so the attributes
	commonly found in CADs (vectorial formats coding inside the
	attributes) will not be supported. The vectorial format will support
	the geometrical definitions and incorporate the topological
	information now found in the dig_plus (in fact I'm thinking of a
	special topological format, but I will not say anything more for the
	moment since it is not the priority), the purpose of the
	"support/build" operation being precisely to generate a fast
	search index between geometrical elements and textual attributes
	(in this view, if someone wants to specify particular visual aspects
	(color and so on) of elements, these will be placed in a table in a
	DB).
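
The identifier layout described in the list above can be sketched as
follows. This is only an illustration, not a specification: the single
orientation bit, its position as the topmost bit, and the names id_width,
pack_id and unpack_id are all assumptions made for the example.

```python
TETRABYTE_BITS = 32      # Knuth: one tetrabyte = 4 bytes of 8 bits
ORIENTATION_BITS = 1     # assumption: a single flag, stored in the top bit


def id_width(power):
    """Width in bits of an identifier made of 2**power tetrabytes."""
    if power < 0:
        raise ValueError("power must be >= 0")
    return TETRABYTE_BITS << power


def pack_id(index, reversed_, power=0):
    """Store the orientation flag in the most significant bit."""
    payload_bits = id_width(power) - ORIENTATION_BITS
    if not 0 <= index < (1 << payload_bits):
        raise ValueError("index does not fit in the identifier")
    return (int(bool(reversed_)) << payload_bits) | index


def unpack_id(ident, power=0):
    """Return (index, reversed) from a packed identifier."""
    payload_bits = id_width(power) - ORIENTATION_BITS
    index = ident & ((1 << payload_bits) - 1)
    return index, bool(ident >> payload_bits)


ident = pack_id(5, True)             # one tetrabyte, reversed orientation
print(hex(ident), unpack_id(ident))  # 0x80000005 (5, True)
```

With power=1 the same functions give 64-bit identifiers, with power=2
128-bit ones, which is where the "tremendous amounts of data" would fit.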

	=> Geometrical operations don't have to deal with attributes (you
	SELECT elements on an attribute basis XOR on a geometrical basis [the
	combination of the two is the combination of two distinct
	selections]), and in the DB you need only the identifier of the
	geometrical element, NOT its geometrical definition (the geometrical
	extensions of some RDBMSs will not be used since they are, in my
	mind, both from an efficiency and a logical point of view, an
	original sin).
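
A minimal sketch of that separation, under invented data: the attribute
side and the geometrical side each return a set of identifiers, and a
combined query is just the intersection of the two sets. The tables, the
field name "kind" and both function names are hypothetical.

```python
# Geometrical side: identifier -> point coordinates (no attributes here).
geometry = {1: (0.0, 0.0), 2: (5.0, 5.0), 3: (9.0, 1.0)}

# DB side: identifier -> row (no geometry here).
attributes = {1: {"kind": "well"}, 2: {"kind": "road"}, 3: {"kind": "well"}}


def select_by_attribute(table, field, value):
    """Attribute selection: returns only identifiers."""
    return {ident for ident, row in table.items() if row.get(field) == value}


def select_by_bbox(geom, xmin, ymin, xmax, ymax):
    """Geometrical selection: returns only identifiers."""
    return {
        ident
        for ident, (x, y) in geom.items()
        if xmin <= x <= xmax and ymin <= y <= ymax
    }


wells = select_by_attribute(attributes, "kind", "well")
in_box = select_by_bbox(geometry, 0, 0, 6, 6)
wells_in_box = wells & in_box    # combination of two distinct selections
print(sorted(wells_in_box))      # [1]
```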

KerGIS shall support multiple distinct DB sources for a single layer.
These sources will be specified, in a URI style
(method:location_of_table#name_of_field_holding_the_identifiers), in a
newly created file in the GDR/$LOCATION/$MAPSET.
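
To make the URI style concrete, here is a small parsing sketch. The exact
syntax is not fixed anywhere above, so this assumes that the method ends
at the first ':' and that the field name follows the last '#'. The
DbSource type, the parse_source function and the "pg" method name are
invented for the illustration.

```python
from typing import NamedTuple


class DbSource(NamedTuple):
    method: str      # how to reach the table (hypothetical values: "pg", ...)
    location: str    # location of the table
    id_field: str    # name of the field holding the identifiers


def parse_source(spec):
    """Split 'method:location_of_table#id_field' into its three parts."""
    method, sep, rest = spec.partition(":")
    if not sep or not method:
        raise ValueError("missing method in %r" % spec)
    location, sep, id_field = rest.rpartition("#")
    if not sep or not location or not id_field:
        raise ValueError("missing location or field in %r" % spec)
    return DbSource(method, location, id_field)


# Several sources for a single layer, one specification per line.
specs = ["pg://host/landuse/parcels#gid", "dbf:/data/owners.dbf#PARCEL_ID"]
for source in map(parse_source, specs):
    print(source.method, source.location, source.id_field)
```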

To be able to import data from different sources, a TEXT file format will
be defined, with ASCII commands (the language will be specified via a
LANG=<ISO_DEFINITION> so that we can handle languages in 7 bits, 8 bits,
or wyde chars for the text fields) and with the following characteristics:

	- No loss of precision is allowed: the format will handle (since
	it's text, it's easy) unlimited precision (to the extent of the space
	on mass storage devices...): the 1000 digits of precision allowed by
	PostgreSQL will still be available;
	- No loss of features is allowed: functions will be described too;
	- The format must be easy to read and easy to scan.
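
As an illustration of the "no loss of precision" point, the sketch below
round-trips a 1000-digit value through text. It uses Python's decimal
module as a stand-in for whatever the real reader would be; the sample
value is made up.

```python
from decimal import Decimal

# A made-up value with 1000 fractional digits, as it would sit in the file.
text = "0." + "1234567890" * 100

exact = Decimal(text)      # built from the string, so every digit is kept
print(str(exact) == text)  # True: the text survives the round trip

binary = float(text)       # a fixed-size binary float cannot hold it
print(str(binary) == text) # False: digits were lost
```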

	=> The reasons for the TEXT format are:
		- The import/export modules will not have n x n possibilities,
		but 2n: FOO->KerGIS_DB_FORMAT, KerGIS_DB_FORMAT->BLA
		-> hence the "no loss" constraints [this is not the case
		at the moment (say, for example, pg.in.dbf, but others are
		affected)];
		- The use of CVS will be eased, since it's easier with a text
		format and more efficient (for diffs);
		- It allows simple editing, debugging, and text manipulation.
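
The 2n argument can be sketched with a pivot representation: each format
needs one importer and one exporter, and any conversion goes through the
pivot. Everything here is a toy assumption (the pivot is a list of rows,
and the "csv" and "tsv" handlers are naive splitters with no quoting),
not the actual KerGIS DB format.

```python
# One importer and one exporter per format: 2n functions for n formats.
importers = {
    "csv": lambda text: [line.split(",") for line in text.splitlines()],
    "tsv": lambda text: [line.split("\t") for line in text.splitlines()],
}
exporters = {
    "csv": lambda rows: "\n".join(",".join(row) for row in rows),
    "tsv": lambda rows: "\n".join("\t".join(row) for row in rows),
}


def convert(text, src, dst):
    """FOO -> pivot -> BLA, for any pair of registered formats."""
    return exporters[dst](importers[src](text))


def direct_converters(n):
    """Dedicated converters needed for every ordered pair of n formats."""
    return n * (n - 1)


def pivot_converters(n):
    """Importer plus exporter for each of n formats."""
    return 2 * n


print(repr(convert("a,b\n1,2", "csv", "tsv")))
print(direct_converters(10), pivot_converters(10))  # 90 20
```

The pivot only pays off if nothing is lost on the way in or out, which
is why the "no loss" constraints come with it.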

A corresponding binary format will be designed, which will be accessed
via the common scheme (the KerGIS DB Format will simply be another type).

=> This native format will not have to be tremendously customizable or
efficient. It will be a fallback for simple applications or for people
who have no other RDBMS at their disposal. The "functions" will not be
supported, at least at the beginning.

A $GDR/$LOCATION/$MAPSET/db directory will be created.

> 
> > 4. The name of the game
> >
> > KerGIS
> 
> Much better than YAG, but what does 'Ker' mean?

Ker <-> core: the essentials

Kernel GIS, i.e. it must allow the development of applications linking
directly to "system calls" from the "graphical kernel".

Ker in a "mathematical" sense, meaning that for every GIS application
ker(application) = KerGIS, i.e. where others have one billion different
functions, this can be done with a single well-designed system
call in KerGIS (the mathematical correctness of the description is not
meant to be questioned too far; it's word play).


> 
> And from your original plans :
> > 6) Scalability: YAG must be able to run on clusters
> >         -> All non strictly interactive tools MUST run in batch mode 
> >         -> Client/server architecture
> >         -> Multithreading (not a priority, but must be thought about)
> 
> Do you think that to make the code thread save may be postponed?

As long as the base is not orthogonal, with the different parts
(analysis, device access, presentation layer and so on) not
clearly identified and isolated, this must not be attempted.

Once the reorganization is achieved AND the big picture is gained, this
will also allow efforts to be parallelized.
-- 
Thierry Laronde (Alceste) <tlaronde at polynum.org>
Key fingerprint = 0FF7 E906 FBAF FE95 FD89  250D 52B1 AE95 6006 F40C



