[postgis-users] Why does this take almost 3 hours?
Ethan Alpert
ealpert at digitalglobe.com
Mon Aug 29 13:57:05 PDT 2005
I have two shapefiles I want to interesect. I shp2pgsql'ed them, loaded
them into the database and created spatial indexes for them (vacuum
analyze'ed for good measure):
Table "public.nga_reg"
Column | Type | Modifiers
------------+-------------------+---------------------------------------
-------------------
gid | integer | not null default
nextval('public.nga_reg_gid_seq'::text)
cii_region | character varying |
the_geom | geometry |
Indexes:
"nga_reg_pkey" primary key, btree (gid)
"nga_reg_spatial" gist (the_geom)
Check constraints:
"enforce_srid_the_geom" CHECK (srid(the_geom) = 4326)
"enforce_geotype_the_geom" CHECK (geometrytype(the_geom) =
'MULTIPOLYGON'::text OR the_geom IS NULL)
Table "public.new_snaps"
Column | Type | Modifiers
------------+-------------------+---------------------------------------
---------------------
gid | integer | not null default
nextval('public.new_snaps_gid_seq'::text)
catalogid | character varying |
the_geom | geometry |
Indexes:
"new_snaps_pkey" primary key, btree (gid)
"new_snaps_spatial" gist (the_geom)
Check constraints:
"enforce_srid_the_geom" CHECK (srid(the_geom) = 4326)
"enforce_geotype_the_geom" CHECK (geometrytype(the_geom) =
'MULTIPOLYGON'::text OR the_geom IS NULL)
Then called the following query:
SELECT a.catalogid, b.cii_region, a.the_geom
FROM new_snaps a, nga_reg b
WHERE a.the_geom && b.the_geom AND intersects(a.the_geom, b.the_geom);
nga_reg is basically the world admin divided up into 11 distinct
regions. My new_snaps table has 13543 records.
2 hours and 39 minutes later pgsql2shp of the above query finished
The above query's explain analyze is:
spatialdb_sil0=> explain analyze SELECT a.catalogid, b.cii_region,
a.the_geom FROM new_snaps a, nga_reg b WHERE a.the_geom && b.the_geom
AND intersects(a.the_geom, b.the_geom);
QUERY
PLAN
------------------------------------------------------------------------
-----------------------------------------------------------------------
Nested Loop (cost=0.00..67.41 rows=4966 width=273) (actual
time=11.040..2397944.885 rows=13416 loops=1)
Join Filter: intersects("inner".the_geom, "outer".the_geom)
-> Seq Scan on nga_reg b (cost=0.00..1.11 rows=11 width=37) (actual
time=0.004..147.586 rows=11 loops=1)
-> Index Scan using new_snaps_spatial on new_snaps a
(cost=0.00..6.01 rows=1 width=268) (actual time=68.748..9599.432
rows=3892 loops=11)
Index Cond: (a.the_geom && "outer".the_geom)
Filter: (a.the_geom && "outer".the_geom)
Total runtime: 2397998.833 ms
In that 2+ hours I had time to write a script that uses mapserver's
mapscript. It pulls each of the 13500 record's geometries, with DBI, out
of postgres, converts the WKT of the geometry to a mapserver shapObj and
calls mapscripts queryByShape() method. This script runs in 2 minutes!!!
How is it possible that postgis with spatial indexes is so bad?
I surely must be doing something wrong.
-e
More information about the postgis-users
mailing list