[postgis-users] Optimizer choosing slower strategy (Bitmap Heap vs Index)?

Wed Jul 26 03:31:07 PDT 2017

Hi Dan,

2017-07-25 18:47 GMT+02:00 Dan Lyke <danlyke at flutterby.com>:

> I'm trying to figure out a performance issue between two quarters of
> map data. Both are using PosgreSQL 9.5.4 on Amazon RDS. The current
> quarter's data appears to be much slower than the old quarter's data.
>
> VACUUM ANALYZE has been run.
>
> EXPLAIN ... on the old data is doing a:
>
> ->  Bitmap Heap Scan on foo_line (cost=50.85..5380.25 rows=1346
> width=306)
>     Recheck Cond: (way && '...'::geometry)
>     Filter: (highway IS NOT NULL)
>     ->  Bitmap Index Scan on foo_index  (cost=0.00..50.52 rows=1346
> width=0)
>         Index Cond: (way && '...'::geometry)
>
>
> And on the new data:
>
> ->  Index Scan using foo_index on foo_line  (cost=0.55..1346.36
> rows=332 width=304)
>     Index Cond: (way &&  '...'::geometry)
>     Filter: (highway IS NOT NULL)
>
> Which looks to me like it's doing the filter before the Bitmap Index
> Scan?
>

Actually, the planner is choosing a different strategy for query execution.

On old dataset, the index is scanned through the bitmap: there are two
phases, in the first the index is used to build a bitmap, then the query
bases to find matching entries through the bitmap itself. This strategy
is generally chosen if the index scan would involve several accesses
on same blocks of the indexes.

On new dataset, a simple index scan is executed, avoiding the bitmap
index/heap scan. This is generally faster if index blocks are singularly
accessed.

Basing on planner estimations attached in yours EXPLAIN outputs,
this could be indeed the case: on old datasets, the planner estimates
an higher numbers of involved records by the query execution through
the index scan (1346 vs. 332), and an higher cost in terms of inspected
index blocks (5380 vs. 1346), so the bitmap strategy is chosen.

> In either case, the new database is substantially slower than the old
> database, and I need to fix that, and I'm down to asking strangers on
> the Internet while I continue to Google around for solutions.

Could you provide further details in what could be changed between
old and new datasets? Is the content significantly changed (bounds of
geometries, etc.)?

Furthermore, it could be useful also to have attached the output of the
EXPLAIN ANALYSE command, in order to compare what the planner
estimates with what it actually do.

Hope this helps,
Giuseppe.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.osgeo.org/pipermail/postgis-users/attachments/20170726/988bf65f/attachment.html>