[postgis-users] Fuzzy Address Matching - PostgreSql equivalent to FuzzyStringComparer using Python difflib module

Stephen Woodbridge stephenwoodbridge37 at gmail.com
Mon May 11 09:44:40 PDT 2020


Also, as mentioned before, this is a rewrite of the address standardizer. 
It is much easier to use and customize, and it has sample config files 
for 25 countries:

https://github.com/woodbri/address-standardizer

If you are trying to write a geocoder I have open sourced the imaptools 
geocoder which you can find here:

https://github.com/woodbri/imaptools.com

Start by reading these:
https://github.com/woodbri/imaptools.com/blob/master/README-geocoder-design.md
https://github.com/woodbri/imaptools.com/blob/master/README-geocoder.md

Sorry, things are a little chaotic because I just dumped all my code up 
here, but I have been documenting things in README files and trying to 
reorganize them to make more sense.

-Steve

On 5/11/2020 12:30 PM, Paul Ramsey wrote:
> It's not an easy problem. There is no one guaranteed magic bullet.
>
> Use the address_standardizer extension, particularly for North American addressing.
>
>    https://postgis.net/docs/postgis_installation.html#installing_pagc_address_standardizer
>
> Or use an ML-trained standardizer like this one.
>
>    https://github.com/pramsey/pgsql-postal
>
> Or gate out to a geocoding service using a web service call.
>
>    https://docs.google.com/presentation/d/1Fgc_2dzWAzT--HdMEiWj2fFLJNnpxPXmnYXx9Js3xjE/edit
>
> To handle some fuzzy stuff, use the functions in the PostgreSQL contrib module,
>
>    create extension fuzzystrmatch;
>
> The Python utility is really just using different ratios of string length and Levenshtein distance, it ain't rocket science.
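
To make the comparison above concrete, here is a minimal Python sketch that puts the two scores side by side. Strictly speaking, difflib's SequenceMatcher.ratio() is computed from matching blocks (2 * matched characters / total characters) rather than from Levenshtein distance, but both boil down to a similarity between 0 and 1. The helper names levenshtein_ratio and difflib_ratio, and the choice to scale by the longer string's length, are assumptions made for illustration; they are not taken from FME or from fuzzystrmatch.

```python
from difflib import SequenceMatcher


def levenshtein(a: str, b: str) -> int:
    """Edit distance counting single-character inserts, deletes, substitutions."""
    if len(a) < len(b):
        a, b = b, a  # keep the inner row as short as possible
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(previous[j] + 1,          # delete from a
                               current[j - 1] + 1,       # insert into a
                               previous[j - 1] + cost))  # substitute
        previous = current
    return previous[-1]


def levenshtein_ratio(a: str, b: str) -> float:
    """Hypothetical helper: scale the distance to 0..1 by the longer length."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0
    return 1.0 - levenshtein(a, b) / longest


def difflib_ratio(a: str, b: str) -> float:
    """difflib's own score: 2 * matched characters / total characters."""
    return SequenceMatcher(None, a, b).ratio()


if __name__ == "__main__":
    reference = "Great Avenue S, A City, Planet Earth"
    for candidate in ("S. Great Aveue, City A, Earth Planet",
                      "Great Avene South, A City, Earth Planet"):
        print(candidate,
              round(levenshtein_ratio(reference, candidate), 3),
              round(difflib_ratio(reference, candidate), 3))
```

On the database side, the levenshtein() function from fuzzystrmatch returns the raw distance, so the same kind of ratio can be formed there by dividing by the string length. Neither score knows that "S." and "South" mean the same thing, which is why standardizing the address first matters.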
>
> P.
>
>
>> On May 11, 2020, at 9:24 AM, Shaozhong SHI <shishaozhong at gmail.com> wrote:
>>
>> Hello,
>>
>> I got a few questions as follows:
>>
>> 1.  Which one is the best way for Fuzzy Address Matching?
>>
>> 2.  FME FuzzyStringComparer uses  Python difflib module.  Which one in Postgres is equivalent or similar to it?
>>
>> 3.  Often, addresses collected by different people may well be correct.  But, there may be typing errors, or addresses are composed not in a consistent manner.
>>
>> For instance, South Great Avenue, A City, Planet Earth may be put down as the following:
>>
>> S. Great Aveue, City A, Earth Planet
>> Great Avene South, A City, Earth Planet
>> Great Avenue S, A City, Planet Earth
>>
>> Surely, there would be solutions to deal with this problem.
>>
>> Can anyone enlighten me?
>>
>> Regards,
>>
>> Shao
>> _______________________________________________
>> postgis-users mailing list
>> postgis-users at lists.osgeo.org
>> https://lists.osgeo.org/mailman/listinfo/postgis-users


