[postgis-users] Query crashed

Stephen Woodbridge woodbri at swoodbridge.com
Mon Nov 21 18:29:38 PST 2005


Mark,

Here are the EXPLAIN ANALYZE plans for the three query variation. I 
would love to have your insight on these and how you analyze them as it 
will probably help me in the future.

Thank you,
   -Steve W.

canada=# EXPLAIN ANALYZE select count(*) from roadseg r, streets s where 
r.the_geom && s.the_geom;
                                                                  QUERY 
PLAN
---------------------------------------------------------------------------------------------------------------------------------------------
  Aggregate  (cost=70651393.10..70651393.10 rows=1 width=0) (actual 
time=5037110.049..5037110.050 rows=1 loops=1)
    ->  Nested Loop  (cost=0.00..70633249.98 rows=7257247 width=0) 
(actual time=196.615..5027698.511 rows=4795163 loops=1)
          ->  Seq Scan on roadseg r  (cost=0.00..139617.69 rows=1843769 
width=143) (actual time=19.554..13494.148 rows=1843769 loops=1)
          ->  Index Scan using streets_gidx on streets s 
(cost=0.00..38.12 rows=9 width=152) (actual time=1.321..2.705 rows=3 
loops=1843769)
                Index Cond: ("outer".the_geom && s.the_geom)
  Total runtime: 5037110.212 ms
(6 rows)

Time: 5037273.857 ms
canada=# EXPLAIN ANALYZE select count(*) from roadseg r, streets s where 
expand(r.the_geom,
canada(# 0.00007) && s.the_geom and
canada-#   within(s.the_geom, buffer(r.the_geom, 0.00007));
NOTICE:  LWGEOM_gist_joinsel called with arguments that are not column 
references
                                                                  QUERY 
PLAN
---------------------------------------------------------------------------------------------------------------------------------------------
  Aggregate  (cost=70776133.61..70776133.61 rows=1 width=0) (actual 
time=8948375.130..8948375.131 rows=1 loops=1)
    ->  Nested Loop  (cost=0.00..70762313.81 rows=5527917 width=0) 
(actual time=27496.750..8946372.737 rows=352411 loops=1)
          Join Filter: within("inner".the_geom, buffer("outer".the_geom, 
7e-05::double precision))
          ->  Seq Scan on roadseg r  (cost=0.00..139617.69 rows=1843769 
width=143) (actual time=12.002..17298.887 rows=1843769 loops=1)
          ->  Index Scan using streets_gidx on streets s 
(cost=0.00..38.12 rows=9 width=152) (actual time=1.136..2.508 rows=3 
loops=1843769)
                Index Cond: (expand("outer".the_geom, 7e-05::double 
precision) && s.the_geom)
  Total runtime: 8948375.254 ms
(7 rows)

Time: 8948649.029 ms

canada=# EXPLAIN ANALYZE select count(*) from roadseg r, streets s where 
r.the_geom
canada-# && s.the_geom and
canada-#   within(s.the_geom, buffer(r.the_geom, 0.00007));
                                                                  QUERY 
PLAN
---------------------------------------------------------------------------------------------------------------------------------------------
  Aggregate  (cost=70722267.29..70722267.29 rows=1 width=0) (actual 
time=8595043.383..8595043.384 rows=1 loops=1)
    ->  Nested Loop  (cost=0.00..70716219.59 rows=2419082 width=0) 
(actual time=27221.956..8593129.512 rows=337306 loops=1)
          Join Filter: within("inner".the_geom, buffer("outer".the_geom, 
7e-05::double precision))
          ->  Seq Scan on roadseg r  (cost=0.00..139617.69 rows=1843769 
width=143) (actual time=25.931..17149.285 rows=1843769 loops=1)
          ->  Index Scan using streets_gidx on streets s 
(cost=0.00..38.12 rows=9 width=152) (actual time=1.305..2.690 rows=3 
loops=1843769)
                Index Cond: ("outer".the_geom && s.the_geom)
  Total runtime: 8595043.505 ms
(7 rows)

Time: 8595044.999 ms


Mark Cave-Ayland wrote:
>>-----Original Message-----
>>From: postgis-users-bounces at postgis.refractions.net [mailto:postgis-users-
>>bounces at postgis.refractions.net] On Behalf Of Paul Ramsey
>>Sent: 20 November 2005 05:40
>>To: PostGIS Users Discussion
>>Subject: Re: [postgis-users] Query crashed
>>
>>We'll wait until Mark gets back on the list and comments, probably a
>>problem with the function(geometry) && geometry construction and the
>>index selectivity calculation.  Mark was working on spatial join
>>selectivity a couple months ago, but this might be a case he did not
>>anticipate.
>>
>>P
> 
> 
> 
> Hi Paul,
> 
> You're exactly right in that the issue is because the query uses F(geom)
> rather than just geom. The NOTICE that is emitted in this case is just to
> point out that when we asked for the cost of the F(geom), it didn't match
> any constants and so we return the default cost which is extremely low and
> looks very favourable to the planner. This wouldn't cause the query to crash
> though, so as confirmed by Steve's latest mail, it is more than likely to be
> a GEOS issue.
> 
> The reason we do this is because F(geom), where F() is any arbitrary
> function, could transform() the geometry in such a way so that it bears no
> relation to the original version in the column and hence we could only
> return the default.
> 
> The only potential speedup I can think of is for the join selectivity code
> to make a note of which functions return only linear transformations to the
> bounding box such as expand() - we could then evaluate the function and
> apply its bounding box result against the stored column histograms instead
> of simply returning the default.
> 
> However, this would need quite a bit of time doing investigative work so I'd
> want to check the Steve's queries would actually benefit from this before
> jumping into any coding. The quick way to check this would be to post back
> the EXPLAIN ANALYZE results of the following queries:
> 
> 
> Steve's working query using just the spatial indices:
> 
> select count(*) from roadseg r, streets s where r.the_geom && s.the_geom;
> 
> 
> Steve's working query:
> 
> select count(*) from roadseg r, streets s where expand(r.the_geom,
> 0.00007) && s.the_geom and
>   within(s.the_geom, buffer(r.the_geom, 0.00007));
> 
> 
> Steve's working query without expand()
> 
> select count(*) from roadseg r, streets s where r.the_geom
> && s.the_geom and
>   within(s.the_geom, buffer(r.the_geom, 0.00007));
> 
> 
> This will tell us how accurate the cost estimates are which are used to plan
> the query, and also where the majority of the time is being spent during
> query processing.
> 
> 
> Kind regards,
> 
> Mark.
> 
> ------------------------
> WebBased Ltd
> 17 Research Way
> Plymouth
> PL6 8BT
> 
> T: +44 (0)1752 797131
> F: +44 (0)1752 791023
> 
> http://www.webbased.co.uk   
> http://www.infomapper.com
> http://www.swtc.co.uk  
> 
> This email and any attachments are confidential to the intended recipient
> and may also be privileged. If you are not the intended recipient please
> delete it from your system and notify the sender. You should not copy it or
> use it for any purpose nor disclose or distribute its contents to any other
> person.
> 
> 
> _______________________________________________
> postgis-users mailing list
> postgis-users at postgis.refractions.net
> http://postgis.refractions.net/mailman/listinfo/postgis-users
> 




More information about the postgis-users mailing list