[postgis-tickets] [PostGIS] #4655: TIGER 'addrfeat' table definition vs. current 2019 shapefiles

PostGIS trac at osgeo.org
Sun Mar 29 10:03:00 PDT 2020


#4655: TIGER 'addrfeat' table definition vs. current 2019 shapefiles
----------------------+---------------------
 Reporter:  jlrobins  |      Owner:  pramsey
     Type:  defect    |     Status:  new
 Priority:  medium    |  Milestone:
Component:  postgis   |    Version:  3.0.x
 Keywords:            |
----------------------+---------------------
 I'm bootstrapping a PostGIS 3.0.1, PostgreSQL 12.2, TIGER 2019 database
 from scratch against all possible layers for the southeast US. I've met an
 ounce of resistance when trying to load the 'addrfeat' layer for FL, which
 happened to be my first state. The following section of the results of
 {{{
     SELECT loader_generate_script(ARRAY['FL','NC', 'SC', 'GA'],'sh');`
 }}}

 ...

 {{{
 ${PSQL} -c "CREATE TABLE tiger_data.FL_addrfeat(CONSTRAINT pk_FL_addrfeat
 PRIMARY KEY (gid)) INHERITS(tiger.addrfeat);ALTER TABLE
 tiger_data.FL_addrfeat ALTER COLUMN statefp SET DEFAULT '12';"
 for z in *addrfeat*.dbf; do
         ${SHP2PGSQL} -D  -S -D -s 4269 -g the_geom -W "latin1" $z
 tiger_staging.FL_addrfeat | ${PSQL}
         ${PSQL} -c "SELECT loader_load_staged_data(lower('FL_addrfeat'),
 lower('FL_addrfeat'));"
 done
 }}}

 ends up croaking within the `loader_load_staged_data()` call due to a
 mismatch in columns from the shapefile vs. columns in base table
 `addrfeat`:

 {{{
 ALTER TABLE
 Shapefile type: Arc
 Postgis type: LINESTRING[2]
 SET
 SET
 BEGIN
 CREATE TABLE
 ALTER TABLE
                           addgeometrycolumn
 ----------------------------------------------------------------------
  tiger_staging.fl_addrfeat.the_geom SRID:4269 TYPE:LINESTRING DIMS:2
 (1 row)

 COPY 32505
 COMMIT
 ANALYZE
 NOTICE:  INSERT INTO
 tiger_data.fl_addrfeat(aridl,aridr,edge_mtfcc,fullname,lfromhn,lfromtyp,linearid,ltohn,ltotyp,offsetl,offsetr,parityl,parityr,plus4l,plus4r,rfromhn,rfromtyp,rtohn,rtotyp,the_geom,tlid,zipl,zipr)
 SELECT
 aridl,aridr,edge_mtfcc,fullname,lfromhn,lfromtyp,linearid,ltohn,ltotyp,offsetl,offsetr,parityl,parityr,plus4l,plus4r,rfromhn,rfromtyp,road_mtfcc,rtohn,rtotyp,tfidl,tfidr,the_geom,tlid,zipl,zipr
 FROM tiger_staging.fl_addrfeat;
 ERROR:  INSERT has more expressions than target columns
 LINE 1: ...mtyp,road_mtfcc,rtohn,rtotyp,tfidl,tfidr,the_geom,tlid,zipl,...
                                                              ^
 QUERY:  INSERT INTO
 tiger_data.fl_addrfeat(aridl,aridr,edge_mtfcc,fullname,lfromhn,lfromtyp,linearid,ltohn,ltotyp,offsetl,offsetr,parityl,parityr,plus4l,plus4r,rfromhn,rfromtyp,rtohn,rtotyp,the_geom,tlid,zipl,zipr)
 SELECT
 aridl,aridr,edge_mtfcc,fullname,lfromhn,lfromtyp,linearid,ltohn,ltotyp,offsetl,offsetr,parityl,parityr,plus4l,plus4r,rfromhn,rfromtyp,road_mtfcc,rtohn,rtotyp,tfidl,tfidr,the_geom,tlid,zipl,zipr
 FROM tiger_staging.fl_addrfeat;
 CONTEXT:  PL/pgSQL function loader_load_staged_data(text,text,text[]) line
 26 at EXECUTE
 SQL function "loader_load_staged_data" statement 1
 }}}

 shp2pgsql from the shapefile within
 `https://www2.census.gov/geo/tiger/TIGER2019/ADDRFEAT/tl_2019_13001_addrfeat.zip`
 has 3 columns not present in base table `addrfeat`:

 {{{
     CREATE TABLE "tl_2019_12001_addrfeat" (
     ...
     "tfidl" int8,
     "tfidr" int8,
     ...
     "road_mtfcc" varchar(5),
     ...
 }}}

 When I adjust tiger_geocoder.sql.in's definition of `addrfeat` with how I
 see `tfidl`, `tfidr`, and similar concept `edge_mtfcc` modeled:

 {{{
     CREATE TABLE addrfeat
     (
       ...
       tfidl numeric(10),
       tfidr numeric(10),
       ...
       road_mtfcc character varying(5),
       ...
     );
 }}}

 Then initial population proceeds past this point.

 I've not yet explored what an 'mftcc' is, but there are at least some
 differences in road_ vs. edge_ values in FL:

 {{{
 tiger=# select (edge_mtfcc = road_mtfcc) as same_mtfcc_value, count(*)
 from tiger_data.fl_addrfeat group by 1 order by 2 desc;
  same_mtfcc_value |  count
 ------------------+---------
  t                | 1268547
  f                |    2615
 (2 rows)


 tiger=# select gid, edge_mtfcc, road_mtfcc from tiger_data.fl_addrfeat
 where edge_mtfcc != road_mtfcc order by 1, 2, 3 limit 5;
   gid  | edge_mtfcc | road_mtfcc
 -------+------------+------------
   1180 | P0001      | S1400
  16533 | P0001      | S1200
  16534 | P0001      | S1200
  36457 | P0001      | S1200
  36500 | P0001      | S1400
 (5 rows)
 }}}

-- 
Ticket URL: <https://trac.osgeo.org/postgis/ticket/4655>
PostGIS <http://trac.osgeo.org/postgis/>
The PostGIS Trac is used for bug, enhancement & task tracking, a user and developer wiki, and a view into the subversion code repository of PostGIS project.


More information about the postgis-tickets mailing list