[postgis-devel] [PostGIS] #1991: geocode really slow on PostgreSQL 9.2
PostGIS
trac at osgeo.org
Tue Sep 11 04:24:32 PDT 2012
#1991: geocode really slow on PostgreSQL 9.2
----------------------------+-----------------------------------------------
Reporter: robe | Owner: robe
Type: defect | Status: new
Priority: medium | Milestone: PostGIS 2.0.2
Component: tiger geocoder | Version: trunk
Keywords: |
----------------------------+-----------------------------------------------
I'm not sure if this is just something I'm missing in settings with the
new PostgreSQL or a fundamental issue with the new PostgreSQL planner.
I have 3 parallel windows PostgreSQL instances -- 8.4 32-bit running
PostGIS 1.5 (on older clunky 32-bit windows 2003 box)
on new Windows 2008 R2 same box: 9.1 64-bit running PostGIS 2.1.0 SVN, 9.2
64-bit running PostGIS 2.1.0 SVN.
On the 9.1 and 8.4, my MA data standard geocodes even with supplementary
augmented parcel check completes in about 30-200ms.
On 9.2 instance with same dataset. It takes 8-30 seconds. Yap.
I compared the plans and it seems 9.2 is not using a bitmap index scan for
this particular query I have. I have narrowed it to this construct.
{{{
-- 10 seconds
WITH a AS
( SELECT *
FROM (SELECT f.*, ad.side, ad.zip, ad.fromhn, ad.tohn,
RANK() OVER(ORDER BY
diff_zip(ad.zip,'02124') + CASE WHEN lower(f.name) = lower('Evans') THEN
0 ELSE levenshtein_ignore_case(f.name, lower('Evans') ) END +
levenshtein_ignore_case(f.fullname, lower('Evans' || ' ' ||
COALESCE('St','')) )
+ CASE WHEN
(greatest_hn(ad.fromhn,ad.tohn) % 2)::integer = ('22' % 2)::integer THEN 0
ELSE 1 END
+ CASE WHEN '22'::integer
BETWEEN least_hn(ad.fromhn,ad.tohn) AND greatest_hn(ad.fromhn, ad.tohn)
THEN 0 ELSE 4 END
+ CASE WHEN
lower('St') = lower(f.suftypabrv) OR lower('St') = lower(f.pretypabrv)
THEN 0 ELSE 1 END
+
rate_attributes(NULL, f.predirabrv, 'Evans', f.name , 'St',
suftypabrv , NULL, sufdirabrv, prequalabr)
)
As rank
FROM featnames As f INNER JOIN addr As ad
ON (f.tlid = ad.tlid)
WHERE '25' = f.statefp AND '25' = ad.statefp
AND lower(f.name) = lower('Evans') AND (
ad.zip = ANY('{02123,02124,02125}'::varchar[]) ) ) AS foo ORDER BY rank
LIMIT 3 )
SELECT a.tlid, a.fullname, a.name, a.predirabrv, a.suftypabrv,
a.sufdirabrv, a.prequalabr, a.pretypabrv,
b.the_geom, tfidr, tfidl,
a.side ,
a.fromhn,
a.tohn,
a.zip,
p.name as place
FROM a INNER JOIN edges As b ON (a.statefp = b.statefp
AND a.tlid = b.tlid )
INNER JOIN faces AS f ON ('25' = f.statefp AND (
(b.tfidl = f.tfid AND a.side = 'L') OR (b.tfidr = f.tfid AND a.side = 'R'
) ))
INNER JOIN place p ON ('25' = p.statefp AND f.placefp
= p.placefp )
WHERE a.statefp = '25' AND b.statefp = '25' ;
}}}
Which can be worked around by switching to this:
{{{
-- 92 ms --
WITH a AS
( SELECT *
FROM (SELECT f.*, ad.side, ad.zip, ad.fromhn, ad.tohn,
RANK() OVER(ORDER BY
diff_zip(ad.zip,'02124') + CASE WHEN lower(f.name) = lower('Evans') THEN
0 ELSE levenshtein_ignore_case(f.name, lower('Evans') ) END +
levenshtein_ignore_case(f.fullname, lower('Evans' || ' ' ||
COALESCE('St','')) )
+ CASE WHEN
(greatest_hn(ad.fromhn,ad.tohn) % 2)::integer = ('22' % 2)::integer THEN 0
ELSE 1 END
+ CASE WHEN '22'::integer
BETWEEN least_hn(ad.fromhn,ad.tohn) AND greatest_hn(ad.fromhn, ad.tohn)
THEN 0 ELSE 4 END
+ CASE WHEN
lower('St') = lower(f.suftypabrv) OR lower('St') = lower(f.pretypabrv)
THEN 0 ELSE 1 END
+
rate_attributes(NULL, f.predirabrv, 'Evans', f.name , 'St',
suftypabrv , NULL, sufdirabrv, prequalabr)
)
As rank
FROM featnames As f INNER JOIN addr As ad
ON (f.tlid = ad.tlid)
WHERE '25' = f.statefp AND '25' = ad.statefp
AND lower(f.name) = lower('Evans') AND (
ad.zip = ANY('{02123,02124,02125}'::varchar[]) ) ) AS foo ORDER BY rank
LIMIT 3 )
SELECT a.tlid, a.fullname, a.name, a.predirabrv, a.suftypabrv,
a.sufdirabrv, a.prequalabr, a.pretypabrv,
b.the_geom, tfidr, tfidl,
a.side ,
a.fromhn,
a.tohn,
a.zip,
p.name as place
FROM a INNER JOIN edges As b ON (a.statefp = b.statefp
AND a.tlid = b.tlid )
INNER JOIN faces AS f ON ('25' = f.statefp AND (
(b.tfidl = f.tfid ) OR (b.tfidr = f.tfid ) ))
INNER JOIN place p ON ('25' = p.statefp AND f.placefp
= p.placefp )
WHERE a.statefp = '25' AND b.statefp = '25' AND
( (b.tfidl = f.tfid AND a.side = 'L') OR (b.tfidr = f.tfid
AND a.side = 'R' ) );
}}}
I'll encode this revision if I conclude it's not just something screwy
with my setup and also report it up the chain if I conclude it is a
PostgreSQL 9.2 issue, which I suspect it is.
Note: there is nothing spatial in the above query, so I think PostGIS is
off the hook.
--
Ticket URL: <https://trac.osgeo.org/postgis/ticket/1991>
PostGIS <http://trac.osgeo.org/postgis/>
The PostGIS Trac is used for bug, enhancement & task tracking, a user and developer wiki, and a view into the subversion code repository of PostGIS project.
More information about the postgis-devel
mailing list