[GRASS-user] Re: [GRASS-user] Help: Completely confused about multi-layered vectors trying to import TIGER/Line files

Tom Russo russo at bogodyn.org
Thu Feb 28 14:08:52 EST 2008


On Thu, Feb 28, 2008 at 10:38:00AM -0700, we recorded a bogon-computron collision of the <michael.barton at asu.edu> flavor, containing:
> 
> On Feb 28, 2008, at 8:57 AM, grass-user-request at lists.osgeo.org wrote:
> 
>> Date: Thu, 28 Feb 2008 08:39:29 -0700
>> From: Tom Russo <russo at bogodyn.org>
>> Subject: [GRASS-user] Help: Completely confused about multi-layered
>> 	vectors	trying to import TIGER/Line files
>> To: grass-user at lists.osgeo.org
>> Message-ID: <20080228153929.GA37583 at bogodyn.org>
>> Content-Type: text/plain; charset=us-ascii
>> 
>> I have been trying to wrap my brain around "multi-layered" GRASS vectors
>> and have only succeeded in wrapping my brain into knots.  Perhaps someone
>> here with a solid understanding of this stuff can help me.
>>
>> I'm trying to figure out how to import TIGER/Line data and actually get
>> the attributes of areas pulled in.  This is trouble.
>> 

Michael:

Thank you for answering, but your answer has either highlighted how
poorly I expressed my question, or thrown into sharper relief how
confused I am about this.  Some of what you say below was already
clear to me, but there's a big gap between "Each vector file (and
object) can have more than one key field to link it to an attribute
table," (which I knew), "Each key (AKA 'cat in layer #') can link to
a line/record in an attribute table (which also must have an 
identical integer key field, that doesn't HAVE to be called "cat", but
often is)." (which I also knew), and the thing I really want to know --- and
it is the latter that I think I haven't explained well.

> The 'layers' you mention here are 2 very different beasts.
> 
> First OGR. The underlying concept is that some data (e.g., CAD) come in a 
> file that has multiple 'layers' of vectors that may (or may not) have 
> different associated data. I don't know TIGER files, so I don't know if 
> they come this way or not. 

I'll clarify, then, because that's not exactly how TIGER is laid out.
There are a number of vectors, and each is related to one or more
tables of attributes, but OGR doesn't make the connection itself --- there 
are simply common attributes between tables that one is left to associate
oneself.

The TIGER data comes in a number of files, each containing a series of
records.  Each file has a different record type.  There is a record
type that defines nodes in "Complete Chains", a record type for "shape
points" that define the vertices (between the nodes) of the chains, a
record type for Polygon Internal Points (centroids), a record for
polygon attributes, a record for linking chains to polygons (with
left/right polygon ids) etc.

When unpacked into a directory, OGR views the collection as a set of 
"layers" (I HATE that this word is used in so many different ways).  A quick
"ogrinfo" shows:

INFO: Open of `/users/russo/TIGER/BC_TGR'
      using driver `TIGER' successful.

Layer name: CompleteChain
Geometry: Line String
Feature Count: 58942
Extent: (-107.196170, 34.869024) - (-106.149575, 35.219639)
Layer SRS WKT: [...]
MODULE: String (8.0)
TLID: Integer (10.0)     <- This is a Line ID to link to other tables
[... tons more attributes for linear features...]

Layer name: AltName         <--- table of alternate feature names in addition
                                 to the one in CompleteChain 
Geometry: None
Feature Count: 6026
Layer SRS WKT:[...]
MODULE: String (8.0)
TLID: Integer (10.0)          <--- this one could be used to relate the
                                   alternate names back to linear features
RTSQ: Integer (3.0)
FEAT: IntegerList (8.0)       <--- and this one links to the next table,
                                   which actually has the names

Layer name: FeatureIds
Geometry: None
Feature Count: 10235
Layer SRS WKT: [...]
MODULE: String (8.0)
FILE: Integer (5.0)
FEAT: Integer (8.0)           <--- linking column for AltName table
FEDIRP: String (2.0)
FENAME: String (30.0)
FETYPE: String (4.0)
FEDIRS: String (2.0)

Layer name: ZipCodes
Geometry: None
Feature Count: 1827
Layer SRS WKT:[...]
MODULE: String (8.0)
TLID: Integer (10.0)          <---- links back to CompleteChain
RTSQ: Integer (3.0)
[...]

Layer name: Landmarks
Geometry: Point
Feature Count: 448
Extent: (-107.119811, 34.889113) - (-106.232580, 35.205106)
Layer SRS WKT:
GEOGCS["NAD83",
    DATUM["North_American_Datum_1983",
        SPHEROID["GRS 1980",6378137,298.257222101]],
    PRIMEM["Greenwich",0],
    UNIT["degree",0.0174532925199433]]
MODULE: String (8.0)
FILE: Integer (5.0)
LAND: Integer (10.0)         <------ linking column to AreaLandmarks
SOURCE: String (1.0)
CFCC: String (3.0)
LANAME: String (30.0)
LALONG: Integer (10.0)
LALAT: Integer (9.0)
FILLER: String (1.0)

Layer name: AreaLandmarks
Geometry: None
Feature Count: 1292
Layer SRS WKT:
GEOGCS["NAD83",
    DATUM["North_American_Datum_1983",
        SPHEROID["GRS 1980",6378137,298.257222101]],
    PRIMEM["Greenwich",0],
    UNIT["degree",0.0174532925199433]]
MODULE: String (8.0)
FILE: String (5.0)
STATE: Integer (2.0)
COUNTY: Integer (3.0)
CENID: String (5.0)
POLYID: Integer (10.0)      <----- Linking column to PIP
LAND: Integer (10.0)        <----- Linking column to Landmarks

Layer name: Polygon
Geometry: None
Feature Count: 18597
Layer SRS WKT:
GEOGCS["NAD83",
    DATUM["North_American_Datum_1983",
        SPHEROID["GRS 1980",6378137,298.257222101]],
    PRIMEM["Greenwich",0],
    UNIT["degree",0.0174532925199433]]
MODULE: String (8.0)
FILE: Integer (5.0)
CENID: String (5.0)
POLYID: Integer (10.0)     <------ Linking column to PIP
[tons more attributes]

[... a whole lot more "Geometry: none" tables irrelevant to the point...]

Layer name: PIP
Geometry: Point
Feature Count: 18597
Extent: (-107.188495, 34.870089) - (-106.149778, 35.218201)
Layer SRS WKT:
GEOGCS["NAD83",
    DATUM["North_American_Datum_1983",
        SPHEROID["GRS 1980",6378137,298.257222101]],
    PRIMEM["Greenwich",0],
    UNIT["degree",0.0174532925199433]]
MODULE: String (8.0)
FILE: Integer (5.0)
CENID: String (5.0)
POLYID: Integer (10.0)          <---- linking column to a bunch of others.
POLYLONG: Integer (10.0)
POLYLAT: Integer (9.0)
WATER: Integer (1.0)

This is an intertwined MESS of data, and none of the intertwining is done
through OGR.

By issuing the original v.in.ogr command:

  v.in.ogr  dsn=~/TIGER/BC_TGR layer=CompleteChain,PIP output=t56015_all \
                     type=boundary,centroid snap=-1

(as taken directly from the v.in.ogr man page) I pulled in the linear
features (CompleteChain, which includes all the boundaries and
non-boundary features) and centroids (PolygonInternalPoint, PIP) with
their associated attributes *from their own tables*.  But as I
mentioned, TIGER is more of a database in normal form, so there are
all sorts of interlinked tables with common keys.  v.in.ogr (and OGR
itself) does not follow the links, so it's up to me to get them linked
up somehow.
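
A minimal sketch, outside GRASS, of the linking columns annotated in
the ogrinfo listing above, written down as data so that a chain of
keys between two tables can be looked up.  Only the links shown in
that listing are included (the elided tables are not), and the
join_path helper is a hypothetical name used for illustration, not
part of OGR or GRASS.

```python
from collections import deque

# (table_a, shared_column, table_b), read off the ogrinfo annotations above.
LINKS = [
    ("AltName", "TLID", "CompleteChain"),
    ("AltName", "FEAT", "FeatureIds"),
    ("ZipCodes", "TLID", "CompleteChain"),
    ("AreaLandmarks", "LAND", "Landmarks"),
    ("AreaLandmarks", "POLYID", "PIP"),
    ("Polygon", "POLYID", "PIP"),
]


def join_path(start, goal):
    """Breadth-first search for a chain of shared key columns.

    Returns a list of (from_table, column, to_table) hops, or None if
    the listed links do not connect the two tables.
    """
    # Links are usable in both directions, so index them both ways.
    neighbours = {}
    for a, col, b in LINKS:
        neighbours.setdefault(a, []).append((col, b))
        neighbours.setdefault(b, []).append((col, a))

    queue = deque([(start, [])])
    seen = {start}
    while queue:
        table, hops = queue.popleft()
        if table == goal:
            return hops
        for col, other in neighbours.get(table, []):
            if other not in seen:
                seen.add(other)
                queue.append((other, hops + [(table, col, other)]))
    return None


print(join_path("Landmarks", "PIP"))
```

For Landmarks and PIP this reports two hops: LAND into AreaLandmarks,
then POLYID into PIP.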

> Now GRASS layers. A disclaimer from me: I think that "layer" is a confusing 
> term to use here. 

No argument here.  I hate that the word "layer" is used in about three
incompatible ways: to denote a vector coverage (as it's used in most
GIS literature), as one of a set of tables linked to a vector coverage
(in GRASS), and as either a table or a vector element of a collection
of tables and vectors (in OGR).

> Each vector file (and 
> object) can have more than one key field to link it to an attribute table. 
> These key fields are called "cat" (short for category) and are always 
> integer. So, a vector can have different integer keys attached to a single 
> object. But instead of calling these cat1, cat2, etc, they are called
> 'cat in layer 1', 'cat in layer 2', etc. Each key (AKA 'cat in layer #') can
> link to a line/record in an attribute table (which also must have an 
> identical integer key field, that doesn't HAVE to be called "cat", but 
> often is).

I understand that part.  What I am not understanding is how to get the right
categories to attach to the right elements of these extra database columns.

Here's a concrete example.  The TIGER/Line file for this can be
downloaded (sometime before 2 days are up) from this temporary FTP
site: ftp://ftp.swcp.com/pub/tmp/russo/TGR35001.ZIP.  The file unzips
to all the various records files, and if unpacked into its own
directory can be imported into a latitude/longitude GRASS location
with the sort of v.in.ogr command I gave above.

This TIGER/Line collection has a table with no associated geometry,
Landmarks, that has an entry (from ogrinfo -al output):

OGRFeature(Landmarks):15
  MODULE (String) = TGR35001
  FILE (Integer) = 35001
  LAND (Integer) = 15
  SOURCE (String) = J
  CFCC (String) = D10
  LANAME (String) = Kirtland Air Force Base
  LALONG (Integer) = (null)
  LALAT (Integer) = (null)
  FILLER (String) = (null)

There are a number of rows in the AreaLandmarks table that relate back to
this single record through the LAND attribute:

OGRFeature(AreaLandmarks):154
  MODULE (String) = TGR35001
  FILE (String) = 35001
  STATE (Integer) = 35
  COUNTY (Integer) = 1
  CENID (String) = c4588
  POLYID (Integer) = 18750
  LAND (Integer) = 15

OGRFeature(AreaLandmarks):155
  MODULE (String) = TGR35001
  FILE (String) = 35001
  STATE (Integer) = 35
  COUNTY (Integer) = 1
  CENID (String) = c4588
  POLYID (Integer) = 18749
  LAND (Integer) = 15
[lots more]

that relate back to PIP records through the POLYID field.  Those PIP records
are:

OGRFeature(PIP):18594
  MODULE (String) = TGR35001
  FILE (Integer) = 35001
  CENID (String) = c4588
  POLYID (Integer) = 18750
  POLYLONG (Integer) = -106551831
  POLYLAT (Integer) = 35060558
  WATER (Integer) = (null)
  POINT (-106.551831000000007 35.060558)

OGRFeature(PIP):18593
  MODULE (String) = TGR35001
  FILE (Integer) = 35001
  CENID (String) = c4588
  POLYID (Integer) = 18749
  POLYLONG (Integer) = -106546870
  POLYLAT (Integer) = 35049120
  WATER (Integer) = (null)
  POINT (-106.546869999999998 35.049120000000002)

[etc.]

and these PIP records are properly attached to centroids in my GRASS vector:

 > v.info -c layer=2 map=t35001_all
Displaying column types/names for database connection of layer 2:
INTEGER|cat
TEXT|MODULE
INTEGER|FILE
TEXT|CENID
INTEGER|POLYID
INTEGER|POLYLONG
INTEGER|POLYLAT
INTEGER|WATER

so somewhere there is a centroid with some category number that has
POLYID 18749, which ultimately could be associated with AreaLandmarks
feature 155 and thence (through LAND attribute 15) to Landmarks feature 15 and
the name "Kirtland Air Force Base".
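
The chain just described can be sketched as a plain SQL join.  This
is only an illustration in an in-memory SQLite database, not
something run against the GRASS attribute tables: the POLYID, LAND,
CFCC and LANAME values are the ones from the records above, while the
cat values (101, 102, 103) and the extra POLYID 12345 are made up,
since the real categories are whatever v.in.ogr assigned.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE Landmarks (LAND INTEGER, CFCC TEXT, LANAME TEXT);
CREATE TABLE AreaLandmarks (POLYID INTEGER, LAND INTEGER);
CREATE TABLE pip (cat INTEGER, POLYID INTEGER);

INSERT INTO Landmarks VALUES (15, 'D10', 'Kirtland Air Force Base');
INSERT INTO AreaLandmarks VALUES (18750, 15);
INSERT INTO AreaLandmarks VALUES (18749, 15);

-- Hypothetical cat values; 12345 is a made-up polygon with no landmark.
INSERT INTO pip VALUES (101, 18750);
INSERT INTO pip VALUES (102, 18749);
INSERT INTO pip VALUES (103, 12345);
""")

# Centroid category -> POLYID -> AreaLandmarks row -> Landmarks name.
rows = con.execute("""
    SELECT pip.cat, pip.POLYID, Landmarks.LANAME
    FROM pip
    JOIN AreaLandmarks ON AreaLandmarks.POLYID = pip.POLYID
    JOIN Landmarks ON Landmarks.LAND = AreaLandmarks.LAND
    ORDER BY pip.cat
""").fetchall()

for cat, polyid, name in rows:
    print(cat, polyid, name)
```

Only the two centroids whose POLYID appears in AreaLandmarks come
back, each carrying the landmark name.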

What I *want* to accomplish is to produce something that I can display
and query that represents the collection of AreaLandmarks, which is a
subset of the areas initially imported.  I should be able to do a
"d.vect somevector layer=somelayer" and see only those polygons that
have AreaLandmarks attributes, and be able to use d.what.vect to click
on those polygons and get the attributes (presumably I'd do a table
join between the AreaLandmarks table and Landmarks table so that
things like the landmark's name and feature type are all in one table
not two).

My assumption is that the key concept I am missing is that there must
be a way to select, based on records of AreaLandmarks, a subset of
vector elements from the full imported collection of areas (whose
POLYID attribute is already stored in the table attached to Layer 2 of the
vector), assign them new categories for a layer 3, relate those new
categories to rows of the AreaLandmarks table, and finally attach the
AreaLandmarks table to the new layer through its category values.
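
To make that assumption concrete, here is a rough sketch in plain
Python of the bookkeeping involved, with no GRASS calls at all.  The
function name build_layer3 and the layer-2 cat values are
hypothetical; it hands out one new category per AreaLandmarks row
whose POLYID is found in the layer-2 table, and how those categories
would actually be written onto the centroids in GRASS is exactly the
part that remains unanswered.

```python
def build_layer3(layer2_rows, area_landmarks):
    """Assign new categories to the centroids named in AreaLandmarks.

    layer2_rows:    (cat, POLYID) pairs from the table on layer 2.
    area_landmarks: (POLYID, LAND) pairs from the AreaLandmarks table.

    Returns (table3, cat_map):
      table3  - rows (new_cat, POLYID, LAND) for a would-be layer-3 table
      cat_map - (layer2_cat, new_cat) pairs saying which centroid gets
                which new category
    """
    cat_by_polyid = {polyid: cat for cat, polyid in layer2_rows}
    table3, cat_map = [], []
    new_cat = 1
    for polyid, land in area_landmarks:
        if polyid not in cat_by_polyid:
            continue  # no imported centroid carries this POLYID
        table3.append((new_cat, polyid, land))
        cat_map.append((cat_by_polyid[polyid], new_cat))
        new_cat += 1
    return table3, cat_map


# Hypothetical layer-2 categories; POLYID and LAND values are from the post.
layer2 = [(101, 18750), (102, 18749), (103, 12345)]
landmarks = [(18750, 15), (18749, 15)]
table3, cat_map = build_layer3(layer2, landmarks)
print(table3)
print(cat_map)
```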

So my question is how do I do that?


I imagine there's some way to do an extraction with v.extract and a
where clause to create a vector of only those areas with POLYID
attributes that appear in the AreaLandmarks table... I hadn't thought
about that yet.  I'm not sure I can craft the WHERE clause for
v.extract that would reference a table that isn't attached to the
vector yet, though.
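
One possible way around the unattached table, sketched here and
untested against GRASS: read the POLYID values out of AreaLandmarks
separately and inline them as literals, so that the WHERE clause only
mentions the POLYID column already present in the layer-2 table.
The helper name polyid_where is hypothetical, and whether the
attribute database driver in use accepts "IN (...)" (or a list this
long) is an assumption that would need checking.

```python
def polyid_where(polyids):
    """Build a WHERE clause selecting the given POLYID values.

    Values are forced to int and de-duplicated, so the clause contains
    nothing but integer literals.
    """
    ids = sorted({int(p) for p in polyids})
    if not ids:
        raise ValueError("no POLYID values given")
    return "POLYID IN (" + ", ".join(str(i) for i in ids) + ")"


# The two POLYID values from the AreaLandmarks records quoted above.
print(polyid_where([18750, 18749]))
```

The resulting string is what would be handed to a where-style option;
the extraction itself is not shown.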

> However, once you get the data into GRASS, it is
> possible to "upload" data from one attribute table (linked to layer 2,
> for example) into another attribute table (linked to layer 1, for
> example).

I'm sure it's possible, but I still don't understand how to do it in this case.

-- 
Tom Russo    KM5VY   SAR502   DM64ux          http://www.swcp.com/~russo/
Tijeras, NM  QRPL#1592 K2#398  SOC#236 AHTB#1 http://kevan.org/brain.cgi?DDTNM
"And, isn't sanity really just a one-trick pony anyway? I mean all you get is
 one trick, rational thinking, but when you're good and crazy, oooh, oooh,
 oooh, the sky is the limit!"  --- The Tick

