<div dir="ltr"><div class="gmail_default" style="font-family:monospace,monospace">Hey dear list,<br></div><div class="gmail_default" style="font-family:monospace,monospace">I'm currently considering extending postgis geocoder.<br><br>I have two needs: <br> - to work with French data<br></div><div class="gmail_default" style="font-family:monospace,monospace"> - to perform temporal geocoding (you give an addresses <b>and </b>a date, you get geographic position).<br><br></div><div class="gmail_default" style="font-family:monospace,monospace">For the moment I'm trying to understand how the geocoder works with USA data.<br><br></div><div class="gmail_default" style="font-family:monospace,monospace">I imported Massachusetts for a test, and everything seems okay except that my tiger.state table is empty, which prevents geocoding.<br></div><div class="gmail_default" style="font-family:monospace,monospace"><br></div><div class="gmail_default" style="font-family:monospace,monospace">Notwithstanding the import step, and the adress normaliser, it seems to boil down to a large query in the function tiger.geocode_address().<br><br><br></div><div class="gmail_default" style="font-family:monospace,monospace">I would be delighted to have some pointers on the design of the postgis_tiger_geocoder extension.<br></div><div class="gmail_default" style="font-family:monospace,monospace">In perticular, I don't understand the adress standardizer business at all (yeah, I found some doc, no, I don't understand it well)<br></div><div class="gmail_default" style="font-family:monospace,monospace"><br><br></div><div class="gmail_default" style="font-family:monospace,monospace">Many thanks for help,<br></div><div class="gmail_default" style="font-family:monospace,monospace">Cheers,<br></div><div class="gmail_default" style="font-family:monospace,monospace">Rémi-C<br></div><div class="gmail_default" style="font-family:monospace,monospace"><br><br><br><br><br><br>-------------------------------------<br></div><div class="gmail_default" style="font-family:monospace,monospace">WITH a AS<br>(<br>         SELECT   *<br>         FROM     (<br>                             SELECT     f.*,<br>                                        ad.side,<br>                                        ad.zip,<br>                                        ad.fromhn,<br>                                        ad.tohn,<br>                                        Rank() over(ORDER BY Diff_zip(ad.zip,'01104')*2.00 +<br>                                        CASE<br>                                                   WHEN Lower(<a href="http://f.name">f.name</a>) = Lower('FRANKLIN') THEN 0<br>                                                   ELSE Levenshtein_ignore_case(<a href="http://f.name">f.name</a>, Lower('FRANKLIN') )<br>                                        END + Levenshtein_ignore_case(f.fullname, Lower('FRANKLIN'<br>                                                   || ' '<br>                                                   || Coalesce('ST','')) ) +<br>                                        CASE<br>                                                   WHEN (<br>                                                                         Greatest_hn(ad.fromhn,ad.tohn) % 2)::INTEGER = ('1' % 2)::INTEGER THEN 0<br>                                                   ELSE 1<br>                                        END +<br>                                        CASE<br>                                                   WHEN '1'::INTEGER BETWEEN Least_hn(ad.fromhn,ad.tohn) AND        Greatest_hn(ad.fromhn, ad.tohn) THEN 0<br>                                                   ELSE 4<br>                                        END +<br>                                        CASE<br>                                                   WHEN Lower('ST') = Lower(f.suftypabrv)<br>                                                   OR         Lower('ST') = Lower(f.pretypabrv) THEN 0<br>                                                   ELSE 1<br>                                        END + Rate_attributes(NULL, f.predirabrv, 'FRANKLIN', <a href="http://f.name">f.name</a> , 'ST', suftypabrv , NULL, sufdirabrv, prequalabr) ) AS rank<br>                             FROM       tiger.featnames                                                                                                   AS f<br>                             inner join tiger.addr                                                                                                        AS ad<br>                             ON         (<br>                                                   f.tlid = ad.tlid)<br>                             WHERE      '25' = f.statefp<br>                             AND        '25' = ad.statefp<br>                             AND        (<br>                                                   Lower(f.fullname) LIKE (Coalesce(NULL<br>                                                              || ' ','')<br>                                                              || Lower('FRANKLIN')<br>                                                              || '%')::text<br>                                        OR         Lower(<a href="http://f.name">f.name</a>) = Lower('FRANKLIN')<br>                                        OR         Soundex(<a href="http://f.name">f.name</a>) = Soundex('FRANKLIN') )<br>                             AND        (<br>                                                   ad.zip = ANY('{01102,01103,01104,01105,01106}'::VARCHAR[]) ) ) AS foo<br>         ORDER BY rank limit 30 )<br><br>, b AS (<br>                  SELECT     a.tlid,<br>                     a.fullname,<br>                     <a href="http://a.name">a.name</a>,<br>                     a.predirabrv,<br>                     a.suftypabrv,<br>                     a.sufdirabrv,<br>                     a.prequalabr,<br>                     a.pretypabrv,<br>                     b.the_geom,<br>                     tfidr,<br>                     tfidl,<br>                     a.side ,<br>                     a.fromhn,<br>                     a.tohn,<br>                     a.zip,<br>                     <a href="http://p.name">p.name</a> AS place<br>                  FROM       a<br>                  inner join tiger.edges AS b<br>                  ON         (<br>                            a.statefp = b.statefp<br>                     AND        a.tlid = b.tlid )<br>                  inner join tiger.faces AS f<br>                  ON         (<br>                            '25' = f.statefp<br>                     AND        ( (<br>                                      b.tfidl = f.tfid<br>                                   AND        a.side = 'L')<br>                            OR         (<br>                                      b.tfidr = f.tfid<br>                                   AND        a.side = 'R' ) ))<br>                  inner join tiger.place p<br>                  ON         (<br>                            '25' = p.statefp<br>                     AND        f.placefp = p.placefp )<br>                  WHERE      a.statefp = '25'<br>                  AND        b.statefp = '25' ) <br><br>, sub AS  (<br>    SELECT   tlid,<br>           predirabrv,<br>           coalesce(b.prequalabr<br>                || ' ','' )<br>                || <a href="http://b.name">b.name</a> AS fename,<br>           suftypabrv,<br>           sufdirabrv,<br>           fromhn,<br>           tohn,<br>           side,<br>           zip,<br>           rate_attributes(NULL, predirabrv, 'FRANKLIN', <a href="http://b.name">b.name</a> , 'ST', suftypabrv , NULL, sufdirabrv, prequalabr) +<br>           CASE<br>                WHEN '1'::INTEGER IS NULL<br>                OR       b.fromhn IS NULL THEN 20<br>                WHEN '1'::    INTEGER >= least_hn(b.fromhn, b.tohn)<br>                AND      '1'::INTEGER <= greatest_hn(b.fromhn,b.tohn)<br>                AND      (<br>                          '1'::INTEGER % 2) = (to_number(b.fromhn,'99999999') % 2)::INTEGER THEN 0<br>                WHEN '1'::    INTEGER >= least_hn(b.fromhn,b.tohn)<br>                AND      '1'::INTEGER <= greatest_hn(b.fromhn,b.tohn) THEN 2<br>                ELSE ((1.0 - (least_hn('1'::text,least_hn(b.fromhn,b.tohn)::text)::NUMERIC / (greatest(1,greatest_hn('1'::text,greatest_hn(b.fromhn,b.tohn)::text))) )) * 5)::INTEGER + 5<br>           END::INTEGER AS sub_rating,<br>           '1'::INTEGER >= least_hn(b.fromhn,b.tohn)<br>      AND      '1'::INTEGER <= greatest_hn(b.fromhn,b.tohn)<br>      AND      (<br>                '1' % 2)::NUMERIC::INTEGER = (to_number(b.fromhn,'99999999') % 2) AS exact_address,<br>           <a href="http://b.name">b.name</a>,<br>           b.prequalabr,<br>           b.pretypabrv,<br>           b.tfidr,<br>           b.tfidl,<br>           b.the_geom,<br>           b.place<br>      FROM     b<br>      ORDER BY 10 ,<br>           11 DESC limit 20 ) <br><br><br><br>--, foo AS (<br><br>    SELECT DISTINCT<br>         ON (<br>                         sub.predirabrv,sub.fename,coalesce(sub.suftypabrv, sub.pretypabrv) ,sub.sufdirabrv,sub.place,s.stusps,sub.zip) sub.predirabrv AS fedirp,<br>                 sub.fename,<br>                 coalesce(sub.suftypabrv, sub.pretypabrv) AS fetype,<br>                 sub.sufdirabrv                           AS fedirs,<br>                 sub.place ,<br>                 s.stusps                                                                                                                                             AS state,<br>                 sub.zip                                                                                                                                              AS zip,<br>                 interpolate_from_address('1', sub.fromhn, sub.tohn, sub.the_geom, sub.side)                                                                          AS address_geom,<br>                 (sub.sub_rating + least(coalesce(diff_zip('01104' , sub.zip),0), 20)*2.00 + coalesce(levenshtein_ignore_case('SPRINGFIELD', sub.place),5) )::INTEGER AS sub_rating,<br>                 sub.exact_address                                                                                                                                    AS exact_address,<br>                 sub.tohn,<br>                 sub.fromhn<br>         FROM           sub  <br>         join            tiger.state s<br>         ON              (<br>                         '25' = s.statefp)<br>         ORDER BY        1,<br>                 2,<br>                 3,<br>                 4,<br>                 5,<br>                 6,<br>                 7,<br>                 9 limit 20) <br>                                         <br>SELECT   *<br>FROM foo           <br>ORDER BY sub_rating,<br>         exact_address DESC limit 100 <br></div><div class="gmail_default" style="font-family:monospace,monospace">-------------------------------------<br></div></div>