[postgis-users] Fuzzy Address Matching - PostgreSql equivalent to FuzzyStringComparer using Python difflib module
Stephen Woodbridge
stephenwoodbridge37 at gmail.com
Mon May 11 09:44:40 PDT 2020
Also, as mentioned before, this is a rewrite of the address standardizer.
It is much easier to use and customize, and it has sample config files
for 25 countries:
https://github.com/woodbri/address-standardizer
If you are trying to write a geocoder, I have open-sourced the imaptools
geocoder, which you can find here:
https://github.com/woodbri/imaptools.com
Start by reading these:
https://github.com/woodbri/imaptools.com/blob/master/README-geocoder-design.md
https://github.com/woodbri/imaptools.com/blob/master/README-geocoder.md
Sorry things are a little chaotic because I just dumped all my code up
here, but I have been documenting stuff in README files and trying to
reorg things to make more sense.
-Steve
On 5/11/2020 12:30 PM, Paul Ramsey wrote:
> It's not an easy problem. There is no one guaranteed magic bullet.
>
> Use the address_standardizer extension, particularly for North American addressing.
>
> https://postgis.net/docs/postgis_installation.html#installing_pagc_address_standardizer
>
> Or use an ML-trained standardizer like this one.
>
> https://github.com/pramsey/pgsql-postal
>
> Or gate out to a geocoding service using a web service call.
>
> https://docs.google.com/presentation/d/1Fgc_2dzWAzT--HdMEiWj2fFLJNnpxPXmnYXx9Js3xjE/edit
>
> To handball some fuzzy stuff, use the functions in the PostgreSQL contrib module:
>
> create extension fuzzystrmatch;
>
> The Python utility is really just using different ratios of string length and Levenshtein distance; it ain't rocket science.
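
To make the comparison above concrete, here is a minimal sketch that puts
difflib's ratio next to a Levenshtein-based ratio. The helper names
levenshtein_ratio and difflib_ratio are hypothetical and chosen only for
illustration. Note that difflib's SequenceMatcher scores strings by matching
blocks rather than by edit distance, so the two numbers are similar in spirit
but not identical.

```python
from difflib import SequenceMatcher


def levenshtein(a: str, b: str) -> int:
    """Classic dynamic-programming edit distance (insert, delete, substitute)."""
    if len(a) < len(b):
        a, b = b, a  # keep the inner row as short as possible
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


def levenshtein_ratio(a: str, b: str) -> float:
    """Hypothetical helper: edit distance scaled to 0..1 by the longer length."""
    longest = max(len(a), len(b))
    return 1.0 if longest == 0 else 1.0 - levenshtein(a, b) / longest


def difflib_ratio(a: str, b: str) -> float:
    """difflib's similarity: 2 * matched characters / (len(a) + len(b))."""
    return SequenceMatcher(None, a, b).ratio()


if __name__ == "__main__":
    for left, right in [("Avenue", "Aveue"), ("Avenue", "Avene")]:
        print(left, right, levenshtein_ratio(left, right), difflib_ratio(left, right))
```

On the PostgreSQL side, the levenshtein() function from fuzzystrmatch returns
the raw distance, so a ratio like the one above would have to be computed from
it and the string lengths.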
>
> P.
>
>
>> On May 11, 2020, at 9:24 AM, Shaozhong SHI <shishaozhong at gmail.com> wrote:
>>
>> Hello,
>>
>> I got a few questions as follows:
>>
>> 1. Which one is the best way for Fuzzy Address Matching?
>>
>> 2. FME FuzzyStringComparer uses Python difflib module. Which one in Postgres is equivalent or similar to it?
>>
>> 3. Often, addresses collected by different people may well be correct, but there may be typing errors, or the addresses may not be composed in a consistent manner.
>>
>> For instance, South Great Avenue, A City, Planet Earth may be put down as the following:
>>
>> S. Great Aveue, City A, Earth Planet
>> Great Avene South, A City, Earth Planet
>> Great Avenue S, A City, Planet Earth
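
One way to compare variants like the three above is to normalize each address
before scoring it. The sketch below lowercases the text, drops punctuation,
expands a few abbreviations, and sorts the tokens so that word order no longer
matters. The ABBREVIATIONS table and the function names are assumptions made up
for this example; a real standardizer uses a far larger lexicon and proper
parsing rules.

```python
import re
from difflib import SequenceMatcher

# Hypothetical, deliberately tiny abbreviation table (an assumption for this
# sketch, not taken from any real standardizer lexicon).
ABBREVIATIONS = {
    "s": "south",
    "n": "north",
    "e": "east",
    "w": "west",
    "ave": "avenue",
}


def normalize(address: str) -> str:
    """Lowercase, strip punctuation, expand abbreviations, sort the tokens."""
    tokens = re.findall(r"[a-z0-9]+", address.lower())
    tokens = [ABBREVIATIONS.get(token, token) for token in tokens]
    return " ".join(sorted(tokens))


def similarity(a: str, b: str) -> float:
    """difflib ratio of the two normalized addresses, between 0 and 1."""
    return SequenceMatcher(None, normalize(a), normalize(b)).ratio()


if __name__ == "__main__":
    reference = "South Great Avenue, A City, Planet Earth"
    variants = [
        "S. Great Aveue, City A, Earth Planet",
        "Great Avene South, A City, Earth Planet",
        "Great Avenue S, A City, Planet Earth",
    ]
    for variant in variants:
        print(f"{similarity(reference, variant):.3f}  {variant}")
```

Sorting tokens throws away ordering information, so this only suits a rough
first pass; it would also treat two different addresses built from the same
words as identical.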
>>
>> Surely, there would be solutions to deal with this problem.
>>
>> Can anyone enlighten me?
>>
>> Regards,
>>
>> Shao
>> _______________________________________________
>> postgis-users mailing list
>> postgis-users at lists.osgeo.org
>> https://lists.osgeo.org/mailman/listinfo/postgis-users