[postgis-users] Dilemma: is gid really useful in a geoDB?

Antonello antonello.monetsen at gmail.com
Fri Feb 23 01:17:01 PST 2007


Thanks a lot to all of you!

This discussion revolve all my doubt about the use of gid, Thanks! I 
start the modeling of my database with the purpose to build a standards 
geoDB which I could use with several free, and eventually, no-free 
application. So I was scared that maybe gid was a standards for identify 
the feature in a layer. But after yours help I have a clear idea about it.

Thank you a lot, bye Antonello.

Mark Millman wrote:
> The problem is that the original SHAPE file doesn't have any attributes.  It
> has attribution only by a one-to-one row correlation with an associated DBF
> file.  Therefore the only way to associated the geometries found in the
> SHAPE file with the non-spatial attribution found in the DBF file is by the
> row number.  Blame ESRI and 30 years of history on this one.
>
> The argument of using a automatic number or sequence as opposed to "natural"
> data as a primary key is an old one and both sides of the argument have
> validity.  Simple numbers index faster, join faster, and are found in almost
> every database I've ever seen; however they are intrinsically meaningless.
> But there are a lot of "John Smith"s in the world so we invent things like
> Social Security Numbers and GIDs.
>
> Historically GIS has supported many data sets that have no attribution,
> other than their feature name; or they do have a unique "natural" attribute.
> There are a lot of edges to the Mississippi River and I'd hate to have to
> come up with a meaningful naming convention to keep them all unique.  One
> might be tempted to use a distance criteria; until one remembers that rivers
> move and the next elbow that is cut off would result in changes that would
> be very difficult to manage.  So the only opportunity for a primary key is
> the something like a GID.
>
> So something like GID is important if you want to associate records that do
> not have "natural" keys or if you want to avoid seriously long primary keys
> like "USA,WA,Whatcom,Bellingham,Mill Ave,1340" for a primary key.
>
> There is another argument for the use of a GID-like key and that is the
> separation of spatial data from attribution, in a manner similar to that
> used by some CAD based GIS like Bentley's Geographics and Intergraph's old
> MGE.  These systems allowed for a single spatial instance to serve as a
> topological boundary to many themes.  A single polyline could be the
> boundary of a River, Municipality, County, and State; all at the same time.
> Adjusting the line changed the all the associated boundaries simultaneously.
> Geographics created topology dynamically based on the boundaries and
> requested themes.  Themes had display priority so that the line was always
> rendered appropriately for the senior most theme in the display list. 
>
> There are reasonable arguments in favor of such model in a purely database
> solution too.  Each of the themes could have a foreign key to a common
> spatial data table rather than embedding the spatial data directly in the
> "theme" record.  This would avoid duplication of data, avoid double
> plotting, and allow applications to manage shared boundaries consistently.
>
> Mark Millman
> Mizar, LLC
> mark.millman at mizar.com
> www.mizar.com
> 589 S Beach Rd.
> Point Roberts, WA 98281
> (360) 945-2643
>
>
>
>
> -----Original Message-----
> From: postgis-users-bounces at postgis.refractions.net
> [mailto:postgis-users-bounces at postgis.refractions.net] On Behalf Of Havard
> Tveite
> Sent: Thursday, February 22, 2007 3:07 AM
> To: PostGIS Users Discussion
> Subject: Re: [postgis-users] Dilemma: is gid really useful in a geoDB?
>
> In the relational model, the primary key can be a combination
> of several attributes.  If you have such a combination of
> attributes, there is (from the relational model's point of
> view) no need for a separate gid.
> Also, if geometry is considered fist class data type of the
> (extended) relational model, location itself can, in principle,
> act as (or be a part of) a primary key (as long as you can
> guarantee that there is only one object per location).
>
> As pointed out in several other postings, the "problem" is that
> a lot of GIS software tend to rely on having a single attribute
> primary key, and that this key has to be of a specific type.
>
> The system (PostgreSQL) generated "oid" can also be used as a
> primary key.
>
> Håvard
>
> Marc Compte wrote:
>   
>> Hi,
>>
>> Ermm ... wasn't one of Codd's rules "Each and every datum (atomic value) 
>> in a relational data base is guaranteed to be logically accessible by 
>> resorting to a combination of table name, primary key value and column 
>> name."?
>>
>> I find it surprising to see that "there's no need for a gid, from the 
>> database point of view" and that primary keys are only needed because 
>> "most applications want some way of identifying specific rows", as the 
>> very core definition of the relational model clearly identifies this as 
>> a crucial part of the model. Wanting to identify specific rows is not an 
>> application oddity, it's part of the very definition of the model.
>>
>> Every table needs to have a key, not only to conform with the relational 
>> model's theory, but for pragmatical purposes as well. Even (or 
>> specially) from the database point of view, no key means there is no way 
>> to update a unique field, there is no way to extract the information 
>> about one individual record, there is no way to cross-reference tables 
>> ... in short, no key means something like there is no relational 
>> database, only a flat list of data ... and PostgreSQL, even though it 
>> incorportes OO elements, has still a big part of relational, right?
>>
>> Whether you have to be using that particular field (gid) or not depends 
>> on your data. I don't think naming your key field "gid" is defined in 
>> any standard (it shouldn't be anyway). As other people said, shp2pgsql 
>> automatically generates this one for you, but I assume that is just to 
>> make your life easier and make sure there is at least one way to 
>> uniquely identify records.
>>
>> The only important thing to do should be to make sure that there's at 
>> least 1 field with values that can be used to uniquely identify any of 
>> the records of your table. If you are sure there's another field (or 
>> combination of fields) that serves the purpose right, you wouldn't 
>> actually need to have that particular gid field. You should be able to 
>> delete it and still be able to use on that table all of the features of 
>> the database and any other surrounding application by using your own 
>> self-made primary key.
>>
>> If an application has a problem with any of this (multiple field keys, 
>> using a different name for the key field other than gid, ...) then you 
>> could probably say it is not well designed and you might want to switch 
>> to something else.
>>
>> Marc
>>
>> En/na Markus Schaber ha escrit:
>>     
>>> Hi, Antonello,
>>>
>>> Antonello <antonello.monetsen at gmail.com> wrote:
>>>
>>>  
>>>       
>>>> I have a dilemma: is gid really useful in a geoDB?
>>>> I am building a geodb for several harbor. When I create a layer(table 
>>>> with geometry) in postgis, I need a gid like a primary key, or I can 
>>>> use a different one? Can I  use several field of from the table like 
>>>> a primary key?
>>>> I am not new in the gis, but why in almost all geoDB I saw there is a 
>>>> gid?! Is it really necessary? Do it come from shapefile standards?
>>>>     
>>>>         
>>> Technically, there's no need for a gid, from the database point of
>>> view, as SQL deals with sets of rows.
>>>
>>> However, most applications want some way of identifying specific rows,
>>> especially when updating / editing data.
>>>
>>> GID is generated by shp2pgsql.
>>>
>>>  
>>>       
>>>> For example QuantumGIS doesn't work if the layer has only one primary 
>>>> key, there are other application that don't work like qGis?
>>>>     
>>>>         
>>> QuantumGIS has the additional oddity that the primary key has to be of
>>> type int4 - int8 or serial (which is optimal for primary keys) is not
>>> accepted, AFAIK.
>>>
>>> For read-only access, you can try MezoGIS which makes very little
>>> assumptions about your database, and you can specify layers via
>>> arbitrary queries.
>>>
>>>
>>> Regards,
>>> Markus
>>>
>>>   
>>>       
>> _______________________________________________
>> postgis-users mailing list
>> postgis-users at postgis.refractions.net
>> http://postgis.refractions.net/mailman/listinfo/postgis-users
>>
>>     
>
>   




More information about the postgis-users mailing list