[postgis-users] Performance issue with ST_INTERSECTS

Shira Bezalel shira at sfei.org
Fri Nov 4 15:03:49 PDT 2011


Hi Paul, 

Thank you for your suggestion. I gave it a try and now the query is running in about 22 seconds. Amazing! If you have time, can you shed some light on why this makes such a difference? What is it about the spatial indexes of smaller regions that make them more effective? Thank you again, 

Shira 

----- Original Message -----

From: "Paul Ramsey" <pramsey at opengeo.org> 
To: "PostGIS Users Discussion" <postgis-users at postgis.refractions.net> 
Sent: Thursday, November 3, 2011 3:26:24 PM 
Subject: Re: [postgis-users] Performance issue with ST_INTERSECTS 

Just from your description, my guess is that your few, large, regions 
with many vertices are the problem. Since they are static, I'd 
recommend cutting them up into a much larger number of small regions, 
by for example intersecting them with a regular grid. Then you can get 
much more effect from your indexes. 

P. 

On Thu, Nov 3, 2011 at 2:30 PM, Shira Bezalel <shira at sfei.org> wrote: 
> Hello List, 
> 
> This is my first post so feel free to let me know if I'm missing any critical info or if this is better suited for the pgsql-performance list. 
> 
> Essentially, I'm troubleshooting a performance issue with a spatial intersection query. The query sums the lengths of line features that intersect polygon features. The line layer is very dense. The polygon layer only has 8 features, but each polygon does have a lot of points. 
> 
> It's taking almost 50 minutes to run and I'm wondering if the planner is choosing a less than optimal query plan. One reason I'm thinking this is based on a bad row estimate in the explain analyze output. Can anything be done to resolve this bad estimate? Or is it unavoidable? And more importantly: Is the query plan that the planner is choosing the fastest one that it could use despite the bad row estimate? 
> 
> -----Version Info----- 
> Production Server: 
> PostgreSQL 8.4.7 on x86_64-pc-linux-gnu, compiled by GCC gcc-4.4.real (Ubuntu 4.4.3-4ubuntu5) 4.4.3, 64-bit 
> POSTGIS="1.5.1" GEOS="3.2.2-CAPI-1.6.2" PROJ="Rel. 4.7.1, 23 September 2009 " LIBXML="2.7.6" USE_STATS 
> 
> We also tested the query against a test server with more recent postgres product versions (the same query plan and bad row estimate occurred): 
> PostgreSQL 9.1.1 on x86_64-pc-linux-gnu, compiled by gcc-4.6.real (Ubuntu/Linaro 4.6.1-9ubuntu3) 4.6.1, 64-bit 
> POSTGIS="1.5.3" GEOS="3.2.2-CAPI-1.6.2" PROJ="Rel. 4.7.1, 23 September 2009 " LIBXML="2.7.8" USE_STATS (procs from 1.5 r5385 need upgrade) 
> 
> -----SQL Statement----- 
> SELECT r.gid, SUM(ST_LENGTH(ST_INTERSECTION(r.the_geom,n.the_geom))) * 0.000621371192 AS milesum 
> FROM psaregions r join nhd100kstreams n ON ST_INTERSECTS(r.the_geom, n.the_geom) 
> WHERE n.fcode = 46003 
> GROUP BY r.gid 
> 
> -----Explain Analyze Output----- 
> 
> See here: http://explain.depesz.com/s/Esx 
> 
> HashAggregate (cost=509.88..512.02 rows=8 width=1195362) (actual time=2823674 .455..2823674 .458 rows=8 loops=1) 
> -> Nested Loop (cost=0.00..70.36 rows=87904 width=1195362) (actual time=23.893..675119.593 rows=90268 loops=1) 
> Join Filter: _st_intersects(r.the_geom, n.the_geom) 
> -> Seq Scan on psaregions r (cost=0.00..1.08 rows=8 width=1193798) (actual time=0.005..0.019 rows=8 loops=1) 
> -> Index Scan using nhd100kstreams_the_geom_gist on nhd100kstreams n (cost=0.00..8.40 rows=1 width=1564) (actual time=0.043..214.728 rows=41631 loops=8) 
> Index Cond: (r.the_geom && n.the_geom) 
> Filter: (n.fcode = 46003) 
> Total runtime: 2823674.517 ms 
> 
> ***Notice actual rows in the first index scan is 41631, but the estimate is 1. 
> 
> -----Additional Info----- 
> - I've run vacuum analyze on the tables in question. That has not helped. (Autovacuum does run regularly also.) 
> - In terms of history, this query has always been slow. 
> - Various configuration settings: 
> shared_buffers: 3GB (Machine RAM: 12GB) 
> work_mem: 2GB 
> maintenance_work_mem: 1GB 
> effective_cache_size: 2GB (this was originally set to the default; we increased it to 2GB and restarted the server, but it didn't change the query plan.) 
> - Hardware info: Dell PowerEdge R710 running Ubuntu 10.04.2 LTS. 
> 
> -----Table and Index Schema----- 
> 
> Table "public.psaregions" 
> Column | Type | Modifiers 
> ------------+-----------------------+---------------------------------------------------------- 
> gid | integer | not null default nextval('psaregions_gid_seq'::regclass) 
> psa_region | character varying(50) | 
> np_length | numeric | 
> xxx | numeric | 
> the_geom | geometry | 
> sm_geom | geometry | 
> miles46006 | numeric | 
> miles46003 | numeric | 
> llextent | character varying | 
> Indexes: 
> "psaregions_pkey" PRIMARY KEY, btree (gid) 
> "psaregions_region_idx" btree (psa_region) 
> "psaregions_sm_geom_gist" gist (sm_geom) 
> "psaregions_the_geom_gist" gist (the_geom) 
> 
> 
> Table "public.nhd100kstreams" 
> Column | Type | Modifiers 
> ------------+-----------------------+-------------------------------------------------------------- 
> gid | integer | not null default nextval('nhd100kstreams_gid_seq'::regclass) 
> objectid | integer | 
> comid | integer | 
> fdate | date | 
> resolution | integer | 
> gnis_id | character varying(10) | 
> gnis_name | character varying(65) | 
> lengthkm | numeric | 
> reachcode | character varying(14) | 
> flowdir | integer | 
> wbareacomi | integer | 
> ftype | integer | 
> fcode | integer | 
> shape_leng | numeric | 
> enabled | smallint | 
> geo_geom | geometry | 
> sm_geom | geometry | 
> the_geom | geometry | 
> Indexes: 
> "nhd100kstreams_pkey" PRIMARY KEY, btree (gid) 
> "nhd100kstreams_fcode_idx" btree (fcode) 
> "nhd100kstreams_geo_geom_gist" gist (geo_geom) 
> "nhd100kstreams_sm_geom_gist" gist (sm_geom) 
> "nhd100kstreams_the_geom_gist" gist (the_geom) 
> 
> 
> -----Table Metadata: ----- 
> psaregions: 8 records; very static, no regular updates/inserts 
> nhd100kstreams: 234954 records; 137,676 records with fcode = 46003; very static, no regular updates/inserts 
> 
> 
> Looking forward to hearing your feedback. 
> 
> Thank you kindly, 
> Shira Bezalel 
> 
> 
> 
> _______________________________________________ 
> postgis-users mailing list 
> postgis-users at postgis.refractions.net 
> http://postgis.refractions.net/mailman/listinfo/postgis-users 
> 
_______________________________________________ 
postgis-users mailing list 
postgis-users at postgis.refractions.net 
http://postgis.refractions.net/mailman/listinfo/postgis-users 

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.osgeo.org/pipermail/postgis-users/attachments/20111104/c0274095/attachment.html>


More information about the postgis-users mailing list