[postgis-users] Address_standardizer - changing the order of input values mapping to a common output value
Stephen Woodbridge
stephenwoodbridge37 at gmail.com
Mon Mar 15 10:57:54 PDT 2021
On 3/15/2021 1:04 PM, Grant Orr wrote:
>
> I have an address that comes in with “11 RTE” and I want to map it in
> the rules to STREET as “RTE 11”
>
> Is there support for this type of mapping?
>
Yes and no! In the PostGIS version of the address standardizer it is
very difficult. I wrote Perl scripts to convert the rules to
human-readable text and back to the rules format; you can find them here:
https://github.com/woodbri/imaptools.com/blob/master/tools/scripts/rules2txt
https://github.com/woodbri/imaptools.com/blob/master/tools/scripts/txt2rules
https://github.com/woodbri/imaptools.com/blob/master/tools/scripts/pagc-data-psql
The process is a whole other matter because it is somewhat unstable, but
here is the idea:
1. convert rules to txt
2. minimize changes in the rules - this is the unstable part
3. ideally change the lexicon instead of rules
4. convert text back to rules if changed
5. reload rules and lexicon if changed
6. RE-STANDARDIZE the whole reference data set if you made changes
7. test, test, test
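If editing the rules proves too fragile, one workaround that sits outside the steps above is to reorder the tokens before the address ever reaches the standardizer. The following is only a minimal sketch in Python under stated assumptions: the function name reorder_route_number and the regular expression are hypothetical, it handles only the literal token "RTE", and it is not part of the Perl scripts or of the PostGIS extension.

```python
import re

# Hypothetical pattern: a number followed by the route token "RTE",
# but not when another number follows (e.g. "100 RTE 9", where 100 is
# presumably the house number and 9 the route number).
_NUMBER_THEN_RTE = re.compile(r"\b(\d+)\s+(RTE)\b(?!\s+\d)", re.IGNORECASE)


def reorder_route_number(address: str) -> str:
    """Rewrite '<number> RTE' as 'RTE <number>', leaving other text alone."""
    return _NUMBER_THEN_RTE.sub(r"\2 \1", address)


if __name__ == "__main__":
    # Print a few before/after pairs to inspect by eye.
    for raw in ("11 RTE", "25 11 RTE", "100 RTE 9"):
        print(f"{raw!r} -> {reorder_route_number(raw)!r}")
```

The cleaned string would then be passed to the standardizer as usual. Any such pre-processing has to be applied to the reference data set as well, for the same reason as step 6.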
Another option would be to look at my newer address standardizer here:
https://github.com/woodbri/address-standardizer
It is written in C++ and designed to be international and human
friendly. Rules and lexicons are easy to work with and change. I was
hoping it would get integrated into PostGIS because the code is better
documented and easier to work with, but I have retired and plan to do
some traveling, so I will not be doing the integration myself. I would
support someone else integrating it if there is interest.
Best regards,
-Steve W