[postgis-users] Dilemma: is gid really useful in a geoDB?

Fri Mar 2 01:44:03 PST 2007

Hi, Marc,

Marc Compte <mcompte at sigte.udg.es> wrote:

> I find it surprising to see that "there's no need for a gid, from the 
> database point of view" and that primary keys are only needed because 
> "most applications want some way of identifying specific rows", 

You misinterpreted my statement.

As you wrote yourself, the means of identifying a row may be an
arbitrary combination of columns. It might even be all columns
together. For some applications, even the geometry itself might
serve as the primary key.

The relational model is designed to work on sets of rows; the
primary key is simply a means of constraining such a set to a single
row. Whether and where you need that depends on your application and
data model.

So, e.g., for a simple map painting application, you use queries that
fetch all data from a specific area (via the && operator and a GiST
index), possibly filtered by some criteria (route_type to only get
motorways when zoomed out). There is no need for primary keys in that
case.
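
A rough sketch of such a map-painting query, for illustration only.
It uses Python's built-in sqlite3 module so that it runs anywhere;
SQLite has neither the && operator nor GiST, so the bounding-box
overlap that && performs in PostGIS is spelled out by hand on four
plain columns. The table "roads", its columns and the sample rows
are all made up for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Deliberately no primary key: the viewer only ever asks for sets of rows.
# xmin/ymin/xmax/ymax are a hypothetical stand-in for a geometry's
# bounding box.
conn.execute("""
    CREATE TABLE roads (
        route_type TEXT,
        xmin REAL, ymin REAL, xmax REAL, ymax REAL
    )
""")
conn.executemany(
    "INSERT INTO roads VALUES (?, ?, ?, ?, ?)",
    [
        ("motorway", 0.0, 0.0, 5.0, 1.0),
        ("residential", 1.0, 1.0, 2.0, 2.0),
        ("motorway", 50.0, 50.0, 60.0, 51.0),
    ],
)


def rows_in_view(conn, view, route_type=None):
    """Return all rows whose bounding box overlaps the view rectangle.

    view is (xmin, ymin, xmax, ymax). The WHERE clause is the manual
    equivalent of a bounding-box overlap test.
    """
    vxmin, vymin, vxmax, vymax = view
    sql = (
        "SELECT route_type, xmin, ymin, xmax, ymax FROM roads "
        "WHERE xmin <= ? AND xmax >= ? AND ymin <= ? AND ymax >= ?"
    )
    params = [vxmax, vxmin, vymax, vymin]
    if route_type is not None:
        # Optional attribute filter, e.g. motorways only when zoomed out.
        sql += " AND route_type = ?"
        params.append(route_type)
    return conn.execute(sql, params).fetchall()


zoomed_out = rows_in_view(conn, (0.0, 0.0, 10.0, 10.0), "motorway")
everything = rows_in_view(conn, (0.0, 0.0, 10.0, 10.0))
```

Nothing in this query ever needs to address a single row, which is
why the table gets by without any key.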

> Every table needs to have a key, not only to conform with the relational 
> model's theory, but for pragmatical purposes as well. Even (or 
> specially) from the database point of view, no key means there is no way 
> to update a unique field, there is no way to extract the information 
> about one individual record, there is no way to cross-reference tables 
> ... in short, no key means something like there is no relational 
> database, only a flat list of data ...

And in some of our applications, we need just such flat lists of data.
There are no updates, only additions, and the data is processed
in batches (via time-range queries, geometric area queries, etc.,
possibly with aggregates).
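
To make that concrete, here is a small sketch of such an
append-only table, again with Python's sqlite3 module as a
stand-in for the real database. The table "positions", its columns
and the sample values are invented for illustration; the only
index is a plain non-unique one on the timestamp.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Append-only log without any key; rows are never updated or deleted.
conn.execute("CREATE TABLE positions (vehicle TEXT, ts INTEGER, speed REAL)")
# A non-unique index is enough to make time-range scans cheap.
conn.execute("CREATE INDEX positions_ts ON positions (ts)")

conn.executemany(
    "INSERT INTO positions VALUES (?, ?, ?)",
    [
        ("truck_a", 100, 60.0),
        ("truck_a", 160, 80.0),
        ("truck_b", 130, 50.0),
        ("truck_a", 400, 90.0),
    ],
)


def speed_summary(conn, t_from, t_to):
    """Aggregate one batch: all rows with t_from <= ts < t_to."""
    return conn.execute(
        "SELECT vehicle, COUNT(*), AVG(speed) FROM positions "
        "WHERE ts >= ? AND ts < ? "
        "GROUP BY vehicle ORDER BY vehicle",
        (t_from, t_to),
    ).fetchall()


batch = speed_summary(conn, 100, 200)
```

The batch is selected by a range and reduced by an aggregate; no
individual row is ever looked up.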

Those applications are not the typical OLTP ones.

But it remains true that the database itself does not need any keys
per se to work; keys are a requirement of specific applications and
data models. And, of course, it's important that a database provides
ways to enforce uniqueness and referential integrity. (Hello, MySQL? :-)

But what I really wanted to complain about is the brain-dead way some
GIS viewers blindly assume that there is a primary key column named
"gid" of type "int4", with no way to convince them that:

- there's a unique "rowid" of type "int8" they can use
- the unique indexed tuple (gid int4, area_id int4) is a pk they can
  use, as "gid" is only unique per shapefile (= area).
- they should not segfault when "gid" is of type "int8"
- for pure viewing purposes, no PK is needed at all.
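
As an illustration of the composite-key case, the following sketch
shows a table where "gid" alone is ambiguous but (gid, area_id)
identifies exactly one row. It uses Python's sqlite3 module rather
than PostgreSQL, so the int4/int8 distinction does not apply as
such (SQLite integers are up to 64 bits wide); the table "features"
and its contents are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# gid is only unique within one area, so the key is the pair.
conn.execute("""
    CREATE TABLE features (
        gid INTEGER NOT NULL,
        area_id INTEGER NOT NULL,
        name TEXT,
        PRIMARY KEY (gid, area_id)
    )
""")
conn.executemany(
    "INSERT INTO features VALUES (?, ?, ?)",
    [
        (1, 10, "lake"),
        (1, 20, "forest"),         # same gid, different area: allowed
        (2**31, 10, "large gid"),  # would not fit a signed 32-bit int
    ],
)


def insert_is_rejected(conn, row):
    """Return True if the database refuses the row as a duplicate key."""
    try:
        conn.execute("INSERT INTO features VALUES (?, ?, ?)", row)
    except sqlite3.IntegrityError:
        return True
    return False


# Looking up by gid alone returns more than one row ...
same_gid = conn.execute(
    "SELECT name FROM features WHERE gid = 1 ORDER BY area_id"
).fetchall()

# ... while the full tuple pins down exactly one.
one_row = conn.execute(
    "SELECT name FROM features WHERE gid = ? AND area_id = ?", (1, 20)
).fetchall()

duplicate_rejected = insert_is_rejected(conn, (1, 10, "duplicate"))
```

A viewer that hard-codes "WHERE gid = ..." would pick up both the
lake and the forest here, which is exactly the problem.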

Regards,
Markus

-- 
Markus Schaber | Logical Tracking&Tracing International AG
Dipl. Inf.     | Software Development GIS

Fight against software patents in Europe! www.ffii.org
www.nosoftwarepatents.org