[postgis-users] Massive Performance Issues

Gregory Williamson Gregory.Williamson at digitalglobe.com
Tue Jul 17 18:40:17 PDT 2007


Also -- probably you've already done it -- make sure you've run an ANALYZE against all of these tables so the planner "knows" what is in them. If there are skewed distributions increasing the stats target for the effected tables/columns (see ALTER TABLE for details) as the default might not be sufficient.


Greg Williamson
Senior DBA
GlobeXplorer LLC, a DigitalGlobe company

Confidentiality Notice: This e-mail message, including any attachments, is for the sole use of the intended recipient(s) and may contain confidential and privileged information and must be protected in accordance with those provisions. Any unauthorized review, use, disclosure or distribution is prohibited. If you are not the intended recipient, please contact the sender by reply e-mail and destroy all copies of the original message.

(My corporate masters made me say this.)

-----Original Message-----
From: postgis-users-bounces at postgis.refractions.net on behalf of Alan Cunnane
Sent: Tue 7/17/2007 7:26 PM
To: PostGIS Users Discussion
Subject: Re: [postgis-users] Massive Performance Issues
 
Ah I am sorry here is the eplain analyze requested. The explain analyze query is shown below:

 QUERY PLAN

---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
---------------------------------------------------
 Limit  (cost=150061918.96..150061918.97 rows=1 width=166) (actual time=71135.765..71135.765 rows=0 loops=1)
   ->  Sort  (cost=150061918.96..150407602.50 rows=138273415 width=166) (actual time=71135.758..71135.758 rows=0 loops=1)
         Sort Key: (e.arrival_time - g.depart_time)
         ->  Merge Join  (cost=665925.37..32125314.72 rows=138273415 width=166) (actual time=71135.734..71135.734 rows=0 loops=1)
               Merge Cond: ("outer"."?column8?" = "inner"."?column7?")
               Join Filter: (("outer".depart_time >= ("inner".depart_time + '00:03:00'::interval)) AND ("outer".depart_time <= ("inner".depart_time + '00:13:00'::interval)))
               ->  Sort  (cost=441268.22..442374.55 rows=442532 width=115) (actual time=70772.175..70772.175 rows=1 loops=1)
                     Sort Key: (c.stop_reference)::text
                     ->  Hash Join  (cost=49458.47..324978.85 rows=442532 width=115) (actual time=23959.230..55506.743 rows=194484 loops=1)
                           Hash Cond: ("outer".service_id = "inner".service_id)
                           Join Filter: ("outer".stop_order < "inner".stop_order)
                           ->  Seq Scan on service1 c  (cost=0.00..190676.16 rows=1436516 width=58) (actual time=20936.503..47681.549 rows=1436516 loops=1)
                           ->  Hash  (cost=49458.01..49458.01 rows=185 width=77) (actual time=1250.002..1250.002 rows=3168 loops=1)
                                 ->  Nested Loop  (cost=54.14..49458.01 rows=185 width=77) (actual time=352.784..1231.899 rows=3168 loops=1)
                                       Join Filter: (("outer".route_number)::text <> ("inner".route_number)::text)
                                       ->  Index Scan using routes1_pkey on routes1 h  (cost=0.00..6.00 rows=1 width=24) (actual time=47.059..47.064 rows=1 loops=1)
                                             Index Cond: (6867 = service_id)
                                       ->  Nested Loop  (cost=54.14..49449.68 rows=186 width=53) (actual time=305.686..1160.476 rows=3168 loops=1)
                                             ->  Nested Loop  (cost=54.14..48330.43 rows=186 width=29) (actual time=88.261..798.839 rows=3168 loops=1)
                                                   ->  Index Scan using stops_distance1 on bus_stops1 d  (cost=0.00..21.94 rows=2 width=13) (actual time=24.860..40.809 rows=3 lo
ops=1)
                                                         Index Cond: (east_north && '0103000020346C0000010000000500000000000000D42E1441000000001C82244100000000D42E1441000000005C
88244100000000543B1441000000005C88244100000000543B1441000000001C82244100000000D42E1441000000001C822441'::geometry)
                                                         Filter: ((east_north && '0103000020346C0000010000000500000000000000D42E1441000000001C82244100000000D42E1441000000005C882
44100000000543B1441000000005C88244100000000543B1441000000001C82244100000000D42E1441000000001C822441'::geometry) AND (distance('0101000020346C00000000000014351441000000003C852441
'::geometry, east_north) < 200::double precision))
                                                   ->  Bitmap Heap Scan on service1 e  (cost=54.14..24064.46 rows=7183 width=58) (actual time=35.794..244.812 rows=1056 loops=3)
                                                         Recheck Cond: (("outer".stop_reference)::text = (e.stop_reference)::text)
                                                         ->  Bitmap Index Scan on service1_stop_reference  (cost=0.00..54.14 rows=7183 width=0) (actual time=16.058..16.058 rows=
1056 loops=3)
                                                               Index Cond: (("outer".stop_reference)::text = (e.stop_reference)::text)
                                             ->  Index Scan using routes1_pkey on routes1 i  (cost=0.00..6.00 rows=1 width=24) (actual time=0.095..0.100 rows=1 loops=3168)
                                                   Index Cond: (i.service_id = "outer".service_id)
               ->  Sort  (cost=224657.14..226063.21 rows=562427 width=117) (actual time=363.545..363.545 rows=0 loops=1)
                     Sort Key: (b.stop_b)::text
                     ->  Merge Join  (cost=12136.48..23011.69 rows=562427 width=117) (actual time=363.523..363.523 rows=0 loops=1)
                           Merge Cond: (("outer".stop_a)::text = "inner"."?column6?")
                           ->  Index Scan using link_stop_a on stop_link b  (cost=0.00..2312.36 rows=50578 width=84) (actual time=24.077..60.377 rows=7213 loops=1)
                           ->  Sort  (cost=12136.48..12142.04 rows=2224 width=75) (actual time=271.807..271.822 rows=4 loops=1)
                                 Sort Key: (a.stop_reference)::text
                                 ->  Nested Loop  (cost=2864.85..12012.84 rows=2224 width=75) (actual time=242.716..271.730 rows=4 loops=1)
                                       ->  Nested Loop  (cost=96.53..501.68 rows=1 width=29) (actual time=147.867..147.927 rows=2 loops=1)
                                             ->  Index Scan using stops_distance1 on bus_stops1 f  (cost=0.00..21.94 rows=2 width=13) (actual time=23.857..34.496 rows=3 loops=1)
                                                   Index Cond: (east_north && '0103000020346C0000010000000500000000000000EC69134100000000588E244100000000EC6913410000000098942441
000000006C7613410000000098942441000000006C76134100000000588E244100000000EC69134100000000588E2441'::geometry)
 Filter: ((east_north && '0103000020346C0000010000000500000000000000EC69134100000000588E244100000000EC6913410000000098942441000
000006C7613410000000098942441000000006C76134100000000588E244100000000EC69134100000000588E2441'::geometry) AND (distance('0101000020346C0000000000002C7013410000000078912441'::geo
metry, east_north) < 200::double precision))
                                             ->  Bitmap Heap Scan on service1 g  (cost=96.53..239.42 rows=36 width=58) (actual time=37.784..37.787 rows=1 loops=3)
                                                   Recheck Cond: ((6867 = g.service_id) AND (("outer".stop_reference)::text = (g.stop_reference)::text))
                                                   ->  BitmapAnd  (cost=96.53..96.53 rows=36 width=0) (actual time=37.766..37.766 rows=0 loops=3)
                                                         ->  Bitmap Index Scan on service1_service_id  (cost=0.00..42.14 rows=7183 width=0) (actual time=21.129..21.129 rows=130
loops=3)
                                                               Index Cond: (6867 = service_id)
                                                         ->  Bitmap Index Scan on service1_stop_reference  (cost=0.00..54.14 rows=7183 width=0) (actual time=16.621..16.621 rows=
426 loops=3)
                                                               Index Cond: (("outer".stop_reference)::text = (g.stop_reference)::text)
                                       ->  Bitmap Heap Scan on service1 a  (cost=2768.32..11481.24 rows=2394 width=58) (actual time=61.841..61.864 rows=2 loops=2)
                                             Recheck Cond: ((a.service_id = 6867) AND ("outer".stop_order < a.stop_order))
                                             ->  BitmapAnd  (cost=2768.32..2768.32 rows=2394 width=0) (actual time=61.828..61.828 rows=0 loops=2)
                                                   ->  Bitmap Index Scan on service1_service_id  (cost=0.00..42.14 rows=7183 width=0) (actual time=0.055..0.055 rows=130 loops=2)
                                                         Index Cond: (service_id = 6867)
                                                   ->  Bitmap Index Scan on service1_stop_order  (cost=0.00..2725.94 rows=478839 width=0) (actual time=61.669..61.669 rows=79104
loops=2)
                                                         Index Cond: ("outer".stop_order < a.stop_order)
 Total runtime: 71155.967 ms
(55 rows)

At the moment it is not returning any rows, however this could be because there are no two routes which intersect at this time. That can be worked out later but the performance of the query still needs to improve. The bounding box seems to make a massive difference and I am now wondering can I use it elsewhere too. For example to create the stop_link table I need to create a table from a query which selects all stops on different routes within 200m of each other. Can I use a bounding box to do this? At the moment this is my query which takes a long time to complete:

CREATE table stops_link (stop_a, stop_b)
AS SELECT DISTINCT a.stop_reference, b.stop_reference 
FROM bus_stops1 a, bus_stops1 b, service1 c, service1 d
WHERE distance(a.east_north, b.east_north) < 200
AND expand(a.east_north, 200) && b.east_north
AND a.stop_reference = c.stop_reference
AND b.stop_reference = d.stop_reference
AND c.service_id != d.service_id
AND a.stop_reference != b.stop_reference;

Sorry if it seems im bombarding you with questions now its just you have been so helpful so far! 


----- Original Message ----
From: Stephen Frost <sfrost at snowman.net>
To: PostGIS Users Discussion <postgis-users at postgis.refractions.net>
Sent: Wednesday, 18 July, 2007 2:04:17 AM
Subject: Re: [postgis-users] Massive Performance Issues

* Alan Cunnane (alan_cunnane at yahoo.co.uk) wrote:
> Already the bounding box seems to be making a big difference as you can see. This is th eentire EXPLAIN output:
[...]
> Can you see anymore clues in here which would help as the performance still really needs to be a lot better than this? I really appreciate your help

Can you run the query with an 'explain analyze' so we can see where the
time is actually being spent?

Is the query generating the correct result?

    Thanks,

        Stephen

> ----- Original Message ----
> From: Stephen Frost <sfrost at snowman.net>
> To: PostGIS Users Discussion <postgis-users at postgis.refractions.net>
> Sent: Wednesday, 18 July, 2007 1:33:36 AM
> Subject: Re: [postgis-users] Massive Performance Issues
> 
> * Alan Cunnane (alan_cunnane at yahoo.co.uk) wrote:
> > Hi Steven as requested here are the table definitions and constraints:
> [...]
> > Perhaps you can decipher what would be causing such a large performace problem from these constraints? Any help would be much appreciated as im at the end of my tether at this stage and really dont know what to do
> 
> Have you tried the bounding box addition I suggested?  Also, what about
> the whole explain analyze of the query?  Perferrably with the bounding
> box included?
> 
>     Thanks,
> 
>         Stephen
> 
> _______________________________________________
> postgis-users mailing list
> postgis-users at postgis.refractions.net
> http://postgis.refractions.net/mailman/listinfo/postgis-users
> 
> 
> 
> 
> 
> 
> 
>       ___________________________________________________________
> Yahoo! Answers - Got a question? Someone out there knows the answer. Try it
> now.
> http://uk.answers.yahoo.com/ 
> _______________________________________________
> postgis-users mailing list
> postgis-users at postgis.refractions.net
> http://postgis.refractions.net/mailman/listinfo/postgis-users


_______________________________________________
postgis-users mailing list
postgis-users at postgis.refractions.net
http://postgis.refractions.net/mailman/listinfo/postgis-users







      ___________________________________________________________ 
Yahoo! Mail is the world's favourite email. Don't settle for less, sign up for
your free account today http://uk.rd.yahoo.com/evt=44106/*http://uk.docs.yahoo.com/mail/winter07.html 

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.osgeo.org/pipermail/postgis-users/attachments/20070717/3fe1141c/attachment.html>


More information about the postgis-users mailing list