[postgis-devel] [PostGIS] #1991: geocode really slow on PostgreSQL 9.2

PostGIS trac at osgeo.org
Tue Sep 11 04:24:32 PDT 2012


#1991: geocode really slow on PostgreSQL 9.2
----------------------------+-----------------------------------------------
 Reporter:  robe            |       Owner:  robe         
     Type:  defect          |      Status:  new          
 Priority:  medium          |   Milestone:  PostGIS 2.0.2
Component:  tiger geocoder  |     Version:  trunk        
 Keywords:                  |  
----------------------------+-----------------------------------------------
 I'm not sure if this is just something I'm missing in settings with the
 new PostgreSQL or a fundamental issue with the new PostgreSQL planner.

 I have 3 parallel windows PostgreSQL instances -- 8.4 32-bit running
 PostGIS 1.5 (on older clunky 32-bit windows 2003 box)

 on new Windows 2008 R2 same box: 9.1 64-bit running PostGIS 2.1.0 SVN, 9.2
 64-bit running PostGIS 2.1.0 SVN.

 On the 9.1 and 8.4, my MA data standard geocodes even with supplementary
 augmented parcel check completes in about 30-200ms.

 On 9.2 instance with same dataset.  It takes 8-30 seconds.  Yap.

 I compared the plans and it seems 9.2 is not using a bitmap index scan for
 this particular query I have.  I have narrowed it to this construct.


 {{{
 -- 10 seconds
  WITH a AS
         ( SELECT *
                 FROM (SELECT f.*, ad.side, ad.zip, ad.fromhn, ad.tohn,
                                         RANK() OVER(ORDER BY
 diff_zip(ad.zip,'02124') +  CASE WHEN lower(f.name) = lower('Evans') THEN
 0 ELSE levenshtein_ignore_case(f.name, lower('Evans') )  END +
 levenshtein_ignore_case(f.fullname, lower('Evans' || ' ' ||
 COALESCE('St','')) )
                                                 + CASE WHEN
 (greatest_hn(ad.fromhn,ad.tohn) % 2)::integer = ('22' % 2)::integer THEN 0
 ELSE 1 END
                                                 + CASE WHEN '22'::integer
 BETWEEN least_hn(ad.fromhn,ad.tohn) AND greatest_hn(ad.fromhn, ad.tohn)
                                                         THEN 0 ELSE 4 END
                                                         + CASE WHEN
 lower('St') = lower(f.suftypabrv) OR lower('St') = lower(f.pretypabrv)
 THEN 0 ELSE 1 END
                                                         +
 rate_attributes(NULL, f.predirabrv,    'Evans',  f.name , 'St',
 suftypabrv , NULL,    sufdirabrv, prequalabr)
                                                         )
                                                 As rank
                                 FROM featnames As f INNER JOIN addr As ad
 ON (f.tlid = ad.tlid)
                     WHERE '25' = f.statefp AND '25' = ad.statefp
                          AND lower(f.name) = lower('Evans')     AND (
 ad.zip = ANY('{02123,02124,02125}'::varchar[]) )   ) AS foo ORDER BY rank
 LIMIT 3 )
         SELECT   a.tlid, a.fullname, a.name, a.predirabrv, a.suftypabrv,
 a.sufdirabrv, a.prequalabr, a.pretypabrv,
                 b.the_geom, tfidr, tfidl,
                 a.side ,
                 a.fromhn,
                 a.tohn,
                 a.zip,
                 p.name as place

                 FROM  a INNER JOIN edges As b ON (a.statefp = b.statefp
 AND a.tlid = b.tlid  )
                     INNER JOIN faces AS f ON ('25' = f.statefp AND (
 (b.tfidl = f.tfid AND a.side = 'L') OR (b.tfidr = f.tfid AND a.side = 'R'
 ) ))
                     INNER JOIN place p ON ('25' = p.statefp AND f.placefp
 = p.placefp )
                 WHERE a.statefp = '25'  AND  b.statefp = '25' ;
 }}}

 Which can be worked around by switching to this:

 {{{
 -- 92 ms --
  WITH a AS
         ( SELECT *
                 FROM (SELECT f.*, ad.side, ad.zip, ad.fromhn, ad.tohn,
                                         RANK() OVER(ORDER BY
 diff_zip(ad.zip,'02124') +  CASE WHEN lower(f.name) = lower('Evans') THEN
 0 ELSE levenshtein_ignore_case(f.name, lower('Evans') )  END +
 levenshtein_ignore_case(f.fullname, lower('Evans' || ' ' ||
 COALESCE('St','')) )
                                                 + CASE WHEN
 (greatest_hn(ad.fromhn,ad.tohn) % 2)::integer = ('22' % 2)::integer THEN 0
 ELSE 1 END
                                                 + CASE WHEN '22'::integer
 BETWEEN least_hn(ad.fromhn,ad.tohn) AND greatest_hn(ad.fromhn, ad.tohn)
                                                         THEN 0 ELSE 4 END
                                                         + CASE WHEN
 lower('St') = lower(f.suftypabrv) OR lower('St') = lower(f.pretypabrv)
 THEN 0 ELSE 1 END
                                                         +
 rate_attributes(NULL, f.predirabrv,    'Evans',  f.name , 'St',
 suftypabrv , NULL,    sufdirabrv, prequalabr)
                                                         )
                                                 As rank
                                 FROM featnames As f INNER JOIN addr As ad
 ON (f.tlid = ad.tlid)
                     WHERE '25' = f.statefp AND '25' = ad.statefp
                          AND lower(f.name) = lower('Evans')     AND (
 ad.zip = ANY('{02123,02124,02125}'::varchar[]) )   ) AS foo ORDER BY rank
 LIMIT 3 )
         SELECT   a.tlid, a.fullname, a.name, a.predirabrv, a.suftypabrv,
 a.sufdirabrv, a.prequalabr, a.pretypabrv,
                 b.the_geom, tfidr, tfidl,
                 a.side ,
                 a.fromhn,
                 a.tohn,
                 a.zip,
                 p.name as place

                 FROM  a INNER JOIN edges As b ON (a.statefp = b.statefp
 AND a.tlid = b.tlid  )
                     INNER JOIN faces AS f ON ('25' = f.statefp AND (
 (b.tfidl = f.tfid ) OR (b.tfidr = f.tfid ) ))
                     INNER JOIN place p ON ('25' = p.statefp AND f.placefp
 = p.placefp )
                 WHERE a.statefp = '25'  AND  b.statefp = '25' AND
                 ( (b.tfidl = f.tfid AND a.side = 'L') OR (b.tfidr = f.tfid
 AND a.side = 'R' ) );
 }}}

 I'll encode this revision if I conclude it's not just something screwy
 with my setup and also report it up the chain if I conclude it is a
 PostgreSQL 9.2 issue, which I suspect it is.

 Note: there is nothing spatial in the above query, so I think PostGIS is
 off the hook.

-- 
Ticket URL: <https://trac.osgeo.org/postgis/ticket/1991>
PostGIS <http://trac.osgeo.org/postgis/>
The PostGIS Trac is used for bug, enhancement & task tracking, a user and developer wiki, and a view into the subversion code repository of PostGIS project.


More information about the postgis-devel mailing list