[postgis-users] Address_standardizer - changing the order of input values mapping to a common output value

Stephen Woodbridge stephenwoodbridge37 at gmail.com
Mon Mar 15 10:57:54 PDT 2021


On 3/15/2021 1:04 PM, Grant Orr wrote:
>
> I have an address that comes in with “11 RTE” and I want to map it in 
> the rules to STREET as “RTE 11”
>
> Is there support for this type of mapping?
>

Yes and no! In the PostGIS version of the address standardizer it is 
very difficult. I wrote Perl scripts to convert the rules to 
human-readable text and back to the rules format, and you can find them 
here:

https://github.com/woodbri/imaptools.com/blob/master/tools/scripts/rules2txt
https://github.com/woodbri/imaptools.com/blob/master/tools/scripts/txt2rules
https://github.com/woodbri/imaptools.com/blob/master/tools/scripts/pagc-data-psql

The process is a whole other thing because it is somewhat unstable. But 
here is the idea:

1. convert rules to txt
2. minimize changes in the rules - this is the unstable part
3. ideally change the lexicon instead of rules
4. convert text back to rules if changed
5. reload rules and lexicon if changed
6. RE-STANDARDIZE the whole reference data set if you made changes
7. test, test, test
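
For steps 6 and 7, one way to see what a rules or lexicon change actually 
did is to capture the standardizer output for a set of test addresses 
before and after the change and compare the two. The following is only a 
minimal sketch of that idea in Python: the function name, the dict layout, 
and the sample values are all made up for illustration and are not real 
standardizer output.

```python
# Hypothetical regression check for steps 6 and 7: compare standardizer
# output captured before and after a rules or lexicon change.
# Nothing here calls PostGIS; the inputs are plain dicts you would have
# to export yourself.


def changed_results(before, after):
    """Return {raw_address: (old, new)} for every address whose output differs.

    `before` and `after` map a raw address string to whatever the
    standardizer produced for it (here, a plain dict of field values).
    An address missing from one side shows up with None on that side.
    """
    changes = {}
    for raw in sorted(set(before) | set(after)):
        old = before.get(raw)
        new = after.get(raw)
        if old != new:
            changes[raw] = (old, new)
    return changes


# Illustrative data only; these are NOT real standardizer results.
before = {
    "11 RTE": {"house_num": "11", "name": "RTE"},
    "25 MAIN ST": {"house_num": "25", "name": "MAIN", "suftype": "ST"},
}
after = {
    "11 RTE": {"name": "RTE 11"},
    "25 MAIN ST": {"house_num": "25", "name": "MAIN", "suftype": "ST"},
}

for raw, (old, new) in changed_results(before, after).items():
    print(f"{raw!r}: {old} -> {new}")
```

Any address that shows up in the report other than the ones you meant to 
change is a sign that the edit had side effects elsewhere in the rules.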

Another option would be to look at my newer address standardizer here:

https://github.com/woodbri/address-standardizer

This was written in C++ and designed to be international and 
human-friendly. Rules and lexicons are easy to work with and change. I 
was hoping this would get integrated into PostGIS because the code is 
better documented and easier to work with. I have retired and plan to do 
some traveling, so I will not be doing the integration myself, but I 
would support someone else integrating it if there is interest.

Best regards,
   -Steve W



