Eureka, sort of. Re: [GRASS-user] Help: Completely confused about multi-layered vectors trying to import TIGER/Line files

Tom Russo russo at bogodyn.org
Fri Feb 29 12:28:26 EST 2008


On Fri, Feb 29, 2008 at 08:45:42AM -0700, we recorded a bogon-computron collision of the <russo at bogodyn.org> flavor, containing:
[line after line of prose detailing problems importing TIGER polygons]

> It seems that the only thing to do is a table join, which requires the
> use of something other than the DBF driver, as Michael points out (I have
> been using sqlite for that reason).  Perhaps doing a table join of 
> AreaLandmarks to PIP using the POLYID key, then selecting with v.extract only 
> those features that have non-null values in AreaLandmarks attributes will get 
> me what I want.  I will try that next.


Yes, this worked, after a fashion.

Turns out that v.db.join wouldn't work for me --- there is a bug (I just added
it to the tracker) when trying to use that script on a vector with more than
one layer.  And even with the bug fixed it doesn't work on the TIGER data
because in fact there are MODULE and FILE and CENID fields that are shared
between tables that conflict with a simple table join.  v.db.join complains
angrily when it encounters duplicate fields that aren't the one on which the
join is being done, and exits without doing the job.

HOWEVER, I was able to force things to work in a ham-fisted way, by
cribbing off of the steps that v.db.join is taking --- it's just a
shell script that does a bunch of v.db.addcol and SQL update queries.
Here's how I did it for the sake of completing the record:

#Import the tiger lines and centroids:
v.in.ogr dsn=~/TIGER/BC_TGR layer=CompleteChain,PIP type=boundary,centroid output=t35001_all snap=-1

#Import the AreaLandmarks table:
db.in.ogr dsn=~/TIGER/BC_TGR/ db_table=AreaLandmarks output=t35001_AreaLandmarks
#And the landmarks:
db.in.ogr dsn=~/TIGER/BC_TGR/ db_table=Landmarks output=t35001_Landmarks

#Copy the file so I don't have to do that import again when I mess up:
g.copy vect=t35001_all,test35001_all

# Manually join the tables, copying the technique from the v.db.join script:
#First get the column names and types:
COLLIST=`db.describe  -c table=t35001_AreaLandmarks | grep '^Column ' | cut -d':' -f2`
COLTYPES="`db.describe -c table=t35001_AreaLandmarks | grep '^Column ' | cut -d':' -f3 | tr -s ' ' '_'`"
#Now do v.db.addcol and SQL UPDATE commands to join the tables, skipping
# common columns we already have in the original table:
for col in $COLLIST 
do
  if [ $col != "FILE" -a $col != "MODULE" -a $col != "CENID" -a $col != POLYID ]
  then
    v.db.addcol map=test35001_all layer=2 col="$col `echo $COLTYPES | cut -d' ' -f$i | tr -s '_' ' '`"
    echo "UPDATE test35001_all_2 SET $col=(SELECT $col FROM t35001_AreaLandmarks WHERE test35001_all_2.POLYID=t35001_AreaLandmarks.POLYID)" | db.execute
  fi
  i=`echo $i+1|bc`
done

# Now extract only those areas that have non-null LAND values:
v.extract input=test35001_all layer=2 output=t35001_areal where="LAND IS NOT NULL" type=area


# and join the Landmarks table the same way:
COLLIST=`db.describe  -c table=t35001_Landmarks | grep '^Column ' | cut -d':' -f2`
COLTYPES="`db.describe -c table=t35001_Landmarks | grep '^Column ' | cut -d':' -f3 | tr -s ' ' '_'`"
for col in $COLLIST 
do
  if [ $col != "FILE" -a $col != "MODULE" -a $col != "LAND"]
  then
    v.db.addcol map=t35001_areal layer=2 col="$col `echo $COLTYPES | cut -d' ' -f$i | tr -s '_' ' '`"
    echo "UPDATE t35001_areal SET $col=(SELECT $col FROM t35001_Landmarks WHERE t35001_areal.LAND=t35001_Landmarks.LAND)" | db.execute
  fi
  i=`echo $i+1|bc`
done

#Display it:
d.vect t35001_areal fcolor=red layer=2
# Query it:
d.what.vect

This shows me only those areas that have landmark attributes, and gives me
the names and CFCC (census feature class codes) for them.

Next step is the comparatively simple act of doing the table join between
the centroids PIP table and the Polygon table in the original vector layer
so that those attributes are attached.  To do this I probably want to script
the join in a more general way --- v.db.join won't cut it because it'll 
die every time it encounters identical fields between tables that aren't the
ones being merged on.

In fact, to be robust, my script for handling TIGER files should probably 
make sure that MODULE, FILE, CENID etc. actually match when merging on POLYID
instead of just assuming so.

The fact that v.db.join actually just does a bunch of v.db.addcol and
"UPDATE table SET column=(SELECT....)" SQL queries in a "for" loop
makes me wonder why it's not legal to use it on DBF files.  I'm pretty
sure that both operations are valid on DBF.  No SQL JOIN operation is
really being applied.

None of this really helps me understand GRASS layers better --- my
approach to getting the TIGER area landmarks here skips the whole
layer issue and just makes a new vector, thereby copying geometry and
adding tables instead of just re-using it in a clever way.  I'd still
like to know how to do that, but I'll take what I've gotten.

-- 
Tom Russo    KM5VY   SAR502   DM64ux          http://www.swcp.com/~russo/
Tijeras, NM  QRPL#1592 K2#398  SOC#236 AHTB#1 http://kevan.org/brain.cgi?DDTNM
"And, isn't sanity really just a one-trick pony anyway? I mean all you get is
 one trick, rational thinking, but when you're good and crazy, oooh, oooh,
 oooh, the sky is the limit!"  --- The Tick


More information about the grass-user mailing list