[postgis-users] Why does this take almost 3 hours?

Ethan Alpert ealpert at digitalglobe.com
Mon Aug 29 14:14:20 PDT 2005


Bill yes my nga_reg is 11 complex shapes. 

What I don't understand is why queryByShape() using QIX files is SO much
better.

-e

-----Original Message-----
From: postgis-users-bounces at postgis.refractions.net
[mailto:postgis-users-bounces at postgis.refractions.net] On Behalf Of Bill
Binko
Sent: Monday, August 29, 2005 3:22 PM
To: PostGIS Users Discussion
Subject: Re: [postgis-users] Why does this take almost 3 hours?


On Mon, 29 Aug 2005, Ethan Alpert wrote:
...
> 
> nga_reg is basically the world admin divided up into 11 distinct 
> regions. My new_snaps table has 13543 records.
> 
> 2 hours and 39 minutes later pgsql2shp of the above query finished
> 
> The above query's explain analyze is:
> 
> spatialdb_sil0=> explain analyze  SELECT a.catalogid, b.cii_region, 
> a.the_geom  FROM new_snaps a, nga_reg b WHERE a.the_geom && b.the_geom

> AND intersects(a.the_geom, b.the_geom);
>
QUERY
> PLAN

> ----------------------------------------------------------------------
> --
>
-----------------------------------------------------------------------
>  Nested Loop  (cost=0.00..67.41 rows=4966 width=273) (actual
> time=11.040..2397944.885 rows=13416 loops=1)
>    Join Filter: intersects("inner".the_geom, "outer".the_geom)
>    ->  Seq Scan on nga_reg b  (cost=0.00..1.11 rows=11 width=37)
(actual
> time=0.004..147.586 rows=11 loops=1)
>    ->  Index Scan using new_snaps_spatial on new_snaps a
> (cost=0.00..6.01 rows=1 width=268) (actual time=68.748..9599.432
> rows=3892 loops=11)
>          Index Cond: (a.the_geom && "outer".the_geom)
>          Filter: (a.the_geom && "outer".the_geom)
>  Total runtime: 2397998.833 ms
> 

How many shapes are in nga_reg?  When you say "basically the world admin

divided into 11 districts", do you mean there are only 11 shapes?  If
so, 
how complex are the shapes?

There are two possibilities I can see where PostGIS would fail you in
this 
way.

One is that the nga_reg table is huge (lots of shapes). If that's the 
case, then PostGIS should be driving the other direction (doing a seq
scan 
on new_snaps and using the index on nga_reg).  If it's not, it is a poor

query plan and perhaps the PostGIS developers can help track that down.

An easy way to test this is to remove the index on new_snaps and redo
your 
explain (and your timing test).

The alternative is that you have 11 (or some other small number of) very

large, very complicated shapes.  If that's the case, your indexes are
not 
going to help you since you'll end up doing an expensive intersect() 
operation for each pair, and most new_snaps's bouding boxes will fall 
within the large shape's bounding boxes.  If the shapes are Collections,

you might try "exploding" them so that you can get better use of the 
bounding boxes.

I hope this helps
Bill
_______________________________________________
postgis-users mailing list postgis-users at postgis.refractions.net
http://postgis.refractions.net/mailman/listinfo/postgis-users



More information about the postgis-users mailing list