[postgis-devel] PAGC Address Standardizer some thoughts on how to organize

maplabs at light42.com maplabs at light42.com
Wed Jul 2 20:22:54 PDT 2014

On Wed, 2 Jul 2014 22:42:48 -0400, Paragon Corporation <lr at pcorp.us> wrote:
> I just forked the PAGC address standardizer into PostGIS trunk for release
> as part of PostGIS 2.2

  An svn checkout just now did get the code, so that's good  :-)

> 1) Create folder in extensions of our repo and move the address_standardizer
> extension files there.
> I'd still like it to be able to be built separately if people wish (similar
> to how we have liblwgeom I think) and my only reservation with breaking out
> like this is that it makes it less compact. 

  Details aside, being able to build this separately: BIG +1!
I have definitely used this since Denver 2011 (where I attended the
PAGC talk and met various principals).
The Address Standardizer by itself, without any TIGER geocoder, is
quite valuable.
I do appreciate the effort in this library and have said so to Steve in
the past.

> 2) Beef up the documentation -- right now all we have is how to install it
> in our install section of manual (and that of course needs to be updated with
> new link now that it's part of our repo)
> http://postgis.net/docs/manual-dev/postgis_installation.html#installing_pagc_address_standardizer
> So I'm going to add an additional .xml (separate from tiger and install),
> explaining all the nuances of the lexer / rule / parser files.

  Well, this depends a lot on how the decomposition of the libs turns
out, as discussed in the sections below.

> 3) Before release, I'd like to put logic in the configure.ac so we do the
> same checks and build if all dependencies are available and flag for pcre
> library.  Right now to build I just add to my cppflags and shlib_link. 
>  This I imagine I'll need help with since the configure.ac script is pretty
> alien to me. 

  No matter what happens, PCRE and Perl's Regexp::Assemble are
definitely required for this.
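
To make the dependency point concrete, here is a rough Python sketch of
the kind of availability probe a build script would need. It is only an
illustration, not the configure.ac logic itself: the function name
probe_dependencies is made up for this example, and looking the library
up under the name "pcre" (libpcre rather than pcre2) is an assumption.

```python
import ctypes.util
import shutil
import subprocess


def probe_dependencies():
    """Return a dict mapping dependency name -> True if it was found."""
    found = {}

    # Shared-library lookup; the name "pcre" is an assumption (libpcre).
    found["pcre"] = ctypes.util.find_library("pcre") is not None

    # Regexp::Assemble is a Perl module, so perl itself must be on PATH.
    perl = shutil.which("perl")
    found["perl"] = perl is not None

    found["Regexp::Assemble"] = False
    if perl is not None:
        # perl exits non-zero when the requested module cannot be loaded.
        result = subprocess.run(
            [perl, "-MRegexp::Assemble", "-e", "1"],
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
        found["Regexp::Assemble"] = result.returncode == 0

    return found


if __name__ == "__main__":
    for name, ok in probe_dependencies().items():
        print(f"{name}: {'found' if ok else 'MISSING'}")
```

A real configure check would also need to locate the PCRE headers and
set the compile and link flags, which this sketch does not attempt.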

> 4) Build separate extensions for the custom gaz/lex/rules currently present
> and add more. Right now to run the packaged dictionaries you need to run the
> lex,gaz,rules.sql files which is cumbersome from a newbie stand-point. 
> This one I'm actually thinking just rolling the current one in the base
> extension and then having extensions for custom ones. Since at least US
> people will just use the base one or if they are using tiger geocoder the
> tiger geocoder one already packaged with tiger geocoder extension. 

  This is where things get muddy...
Like so many software projects, a broad, generalized architecture ends
up covering one common use case, and the rest either gets in the way or
collects dust as the focus narrows.
It *is* great to have a generalized address parsing engine, but how
this lib got here is that it's been difficult to modernize and to put
sufficient time into a small niche utility. Steve told me so.

A "pragmatic" move would be to tightly configure the lex/gaz/etc to the
TIGER Geocoder and ship it, but that would not use the full capacity of
the lib. On the other hand, if the generalized, multinational promise
is pursued, who is going to build it out? Where are the OSM people?
I am interested, sure, but this is dense going. There are Steve and
Regina, but are there enough hands?
No clear answers here...

> 5) this one I'm still thinking about because it'll be a major breaking
> change -- and that would be just to have current tiger geocoder require
> address_standardizer and swap out the norm_addy object with the
> address_standardize std_address one. But that requires a bit of rework and
> assurance that package maintainers can build address_standardizer without
> too much fuss. 

The TIGER Geocoder treads the line between super-useful and
super-spaghetti.
Shoot the messenger if you like, but at least I have the oomph to say it.
I am constantly amazed at robe2's relentless productivity, and the TIGER
Geocoder project is a fruit of that, warts and all. I use it and I have
written about it.

My personal take is: change the TIGER Geocoder for 2.2+ and break
compatibility, whatever is convenient. Very much unlike the overall
PostGIS project, there are few if any 'production systems' depending on
the details, and damn them anyway if they whine.

> Thoughts?

  I honestly try to contribute in small ways.. I hope this email is 

> Thanks,
> Regina

Brian M Hamlin
OSGeo California Chapter
