[Geodata] TIGER/Line 2010 problem
John P. Linderman
jpl at research.att.com
Wed Dec 7 08:02:57 EST 2011
If you aren't interested in the 2010 US Census TIGER/Line data,
move on; these aren't the droids you're looking for.
Someone here asked for a list of adjacent US counties, and I figured
the county-oriented TIGER/Line data would be a good source. I started
by running through each county, recording all the face IDs for polygons
in the county, then running through all the edges in the county, ignoring
all those where both the left and right face IDs were local to the county.
For those edges with a non-local face ID, I wrote the edge ID (TLID)
and county number. The plan was to sort these, expecting to find a pair
of entries for each TLID, one for each county in which it was part of the
boundary. Write the county pairs out in both orders, run them through
sort/unique, and you have a list of adjacent counties.
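The plan above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the edge records are taken to be already loaded as (TLID, TFIDL, TFIDR) tuples, the function names border_entries and adjacent_pairs are made up here, and the county and face numbers in the usage lines are toy values, not real TIGER/Line data.

```python
from collections import defaultdict


def border_entries(county, face_ids, edges):
    """Yield (tlid, county) for each edge with a face ID not local to the county.

    county   -- county identifier (any hashable value; hypothetical input)
    face_ids -- set of face IDs belonging to polygons in this county
    edges    -- iterable of (tlid, tfidl, tfidr) tuples for this county
    """
    for tlid, tfidl, tfidr in edges:
        if tfidl in face_ids and tfidr in face_ids:
            continue  # both sides are inside the county: not a boundary edge
        yield tlid, county


def adjacent_pairs(entries):
    """Turn (tlid, county) entries into a sorted list of ordered county pairs."""
    counties_by_tlid = defaultdict(set)
    for tlid, county in entries:
        counties_by_tlid[tlid].add(county)

    pairs = set()  # the set plays the role of sort/unique
    for counties in counties_by_tlid.values():
        for a in counties:
            for b in counties:
                if a != b:
                    pairs.add((a, b))  # emitted in both orders
    return sorted(pairs)


# Toy usage: two counties sharing edge 101.
entries = list(border_entries("22011", {1, 2}, [(100, 1, 2), (101, 1, 3)]))
entries += list(border_entries("48351", {3, 4}, [(101, 1, 3), (102, 3, 4)]))
print(adjacent_pairs(entries))  # [('22011', '48351'), ('48351', '22011')]
```

A TLID that shows up for only one county contributes no pairs, which is exactly the symptom described next.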
But there were entries that made only one appearance. When an edge
appears at a national boundary, it has an empty face ID as well as
the local face ID. So I left out entries with empty face IDs.
But there were still entries making only one appearance. After
a lot of poking around, and going back and forth with the support
folks at geo.tiger at census.gov, I tracked this down to some edges
having a blank face ID even though they were adjacent to another
county. Here's part of the explanation I received:
    I looked at this in detail, and there is no problem evident
    other than the one that we already discussed.

    Apparently, the Edge shapefiles for state 22 (Louisiana)
    were created before the national Benchmark/PDB [product
    database] creation was complete, whereas the shapefiles
    for state 48 (Texas) were created later, after both
    Louisiana and Texas had been loaded into the PDB. [As I
    had mentioned in a previous email, this refers to the order
    in which we produced files for the states. The first few
    states for which we created files, including Louisiana,
    are where you will find the problems, as they were created
    before adjacent state data were available in the database
    from which we created the files. So, the TFIDs in the
    adjoining states were not available, and show as 0.
    The other states that should be affected are Virginia,
    Mississippi, and New Jersey.]

    The Edge in question in the Beauregard, LA shapefile (22011)
    shows only the LA-side face id:
        TFIDL 202498890
        TFIDR 0
    The Edge in the Newton, TX shapefile (48351) shows both face ids:
        TFIDL 202498890
        TFIDR 211513365

    The user refers to the Face IDs as being "different"
    and "inconsistent". Though these descriptions are
    mathematically correct (0 is certainly not equal
    to 211513365), they are conceptually misleading.
    Conceptually speaking, where the Face is populated
    (a non-zero value), the values in the two counties
    are equal and consistent.
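The mismatch in the reply above (TFIDR 0 in 22011 against TFIDR 211513365 in 48351) is the kind of thing that can be found mechanically. Here is a rough sketch, with several assumptions: edges are already loaded into a dict keyed by county and then by TLID, a face ID counts as empty if it is 0, None, or blank, the left/right orientation of an edge is the same in both county files (as it is in the example above), and the TLID 999 is a placeholder because the real one was not given. The names is_empty and proposed_fixes are invented for this illustration.

```python
from collections import defaultdict


def is_empty(face_id):
    """Treat 0, "0", None, and blank as 'no face ID' (an assumption)."""
    return face_id in (0, "0", None, "")


def proposed_fixes(edges_by_county):
    """Return sorted (county, tlid, field, replacement_face_id) tuples.

    edges_by_county -- {county: {tlid: (tfidl, tfidr)}}
    A fix is proposed when one county has an empty face ID on an edge
    and another county has a non-empty face ID on the same side.
    """
    records_by_tlid = defaultdict(list)
    for county, edges in edges_by_county.items():
        for tlid, (tfidl, tfidr) in edges.items():
            records_by_tlid[tlid].append((county, tfidl, tfidr))

    fixes = set()
    for tlid, records in records_by_tlid.items():
        for county, tfidl, tfidr in records:
            for other, other_l, other_r in records:
                if other == county:
                    continue
                if is_empty(tfidl) and not is_empty(other_l):
                    fixes.add((county, tlid, "TFIDL", other_l))
                if is_empty(tfidr) and not is_empty(other_r):
                    fixes.add((county, tlid, "TFIDR", other_r))
    return sorted(fixes)


# Face IDs from the quoted example; TLID 999 is a placeholder.
data = {
    "22011": {999: (202498890, 0)},
    "48351": {999: (202498890, 211513365)},
}
print(proposed_fixes(data))  # [('22011', 999, 'TFIDR', 211513365)]
```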
I offered to send them a list of the edges that are adjacent to
another county yet have an empty face ID, together with the face ID
that should replace the empty one. After several weeks, I have
heard nothing, so I guess they are sticking to the
"no problem evident" story.
For those of us wishing to identify county boundaries, this
is an ugly complication. For example, if you only want to
look at Louisiana, but want to distinguish boundaries on the Gulf
from boundaries with other states, you cannot just download the
Louisiana data. You'll also have to download Texas and Mississippi,
because an edge with an empty face ID in Louisiana may be on the
Gulf, or it may abut one of the other states. You can only be sure
an edge is on the Gulf by looking through all the border edges in
the other states, and verifying that the Louisiana edge doesn't
appear among them. Sad, particularly because it appears to be so
easy to correct, but there we are. With luck, some future release
will replace the empty face IDs on state boundaries with the
non-empty face ID from the adjacent state, but, for now, we'll
just have to deal with it. -- jpl
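For anyone who has to do the Louisiana-style cross-check described above, here is a minimal sketch. It assumes the border edges of the state of interest and of each neighboring state are already loaded as dicts keyed by TLID, and that an empty face ID may appear as 0, None, or blank. The function names and the labels "state boundary" and "open boundary" are made up for illustration, and the data in the usage lines are toy values.

```python
def is_empty(face_id):
    """Treat 0, "0", None, and blank as 'no face ID' (an assumption)."""
    return face_id in (0, "0", None, "")


def classify_empty_edges(state_edges, neighbor_edge_sets):
    """Label each edge of the state that has an empty face ID.

    state_edges        -- {tlid: (tfidl, tfidr)} for the state of interest
    neighbor_edge_sets -- iterable of {tlid: (tfidl, tfidr)} dicts, one per
                          neighboring state that had to be downloaded
    Returns {tlid: label}: "state boundary" if the TLID also appears in a
    neighboring state, "open boundary" (e.g. the Gulf) otherwise.
    """
    neighbor_tlids = set()
    for edges in neighbor_edge_sets:
        neighbor_tlids.update(edges)  # updating from a dict adds its keys

    labels = {}
    for tlid, (tfidl, tfidr) in state_edges.items():
        if not (is_empty(tfidl) or is_empty(tfidr)):
            continue  # both faces populated: nothing to resolve
        if tlid in neighbor_tlids:
            labels[tlid] = "state boundary"
        else:
            labels[tlid] = "open boundary"
    return labels


# Toy usage: edge 1 is shared with a neighbor, edge 2 is not.
louisiana = {1: (10, 0), 2: (11, 0), 3: (10, 11)}
texas = {1: (10, 20)}
mississippi = {}
print(classify_empty_edges(louisiana, [texas, mississippi]))
# {1: 'state boundary', 2: 'open boundary'}
```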