[postgis-users] Performance issue with ST_INTERSECTS

Martin Davis mtnclimb at telus.net
Fri Nov 4 11:03:21 PDT 2011


This should be making use of PreparedGeometry, right?  Meaning that the 
actual intersection computation should be fairly performant?  So is it 
just the fact that the index is not very selective that is causing the 
performance issue?

Just asking in case this is revealing some kind of issue with the 
PreparedGeometry implementation...

Martin

On 11/3/2011 3:26 PM, Paul Ramsey wrote:
> Just from your description, my guess is that your few, large, regions
> with many vertices are the problem. Since they are static, I'd
> recommend cutting them up into a much larger number of small regions,
> by for example intersecting them with a regular grid. Then you can get
> much more effect from your indexes.
>
> P.
>
> On Thu, Nov 3, 2011 at 2:30 PM, Shira Bezalel<shira at sfei.org>  wrote:
>> Hello List,
>>
>> This is my first post so feel free to let me know if I'm missing any critical info or if this is better suited for the pgsql-performance list.
>>
>> Essentially, I'm troubleshooting a performance issue with a spatial intersection query. The query sums the lengths of line features that intersect polygon features. The line layer is very dense. The polygon layer only has 8 features, but each polygon does have a lot of points.
>>
>> It's taking almost 50 minutes to run and I'm wondering if the planner is choosing a less than optimal query plan. One reason I'm thinking this is based on a bad row estimate in the explain analyze output. Can anything be done to resolve this bad estimate? Or is it unavoidable? And more importantly: Is the query plan that the planner is choosing the fastest one that it could use despite the bad row estimate?
>>
>> -----Version Info-----
>> Production Server:
>> PostgreSQL 8.4.7 on x86_64-pc-linux-gnu, compiled by GCC gcc-4.4.real (Ubuntu 4.4.3-4ubuntu5) 4.4.3, 64-bit
>> POSTGIS="1.5.1" GEOS="3.2.2-CAPI-1.6.2" PROJ="Rel. 4.7.1, 23 September 2009 " LIBXML="2.7.6" USE_STATS
>>
>> We also tested the query against a test server with more recent postgres product versions (the same query plan and bad row estimate occurred):
>> PostgreSQL 9.1.1 on x86_64-pc-linux-gnu, compiled by gcc-4.6.real (Ubuntu/Linaro 4.6.1-9ubuntu3) 4.6.1, 64-bit
>> POSTGIS="1.5.3" GEOS="3.2.2-CAPI-1.6.2" PROJ="Rel. 4.7.1, 23 September 2009 " LIBXML="2.7.8" USE_STATS (procs from 1.5 r5385 need upgrade)
>>
>> -----SQL Statement-----
>> SELECT r.gid, SUM(ST_LENGTH(ST_INTERSECTION(r.the_geom,n.the_geom))) * 0.000621371192 AS milesum
>> FROM psaregions r join nhd100kstreams n ON ST_INTERSECTS(r.the_geom, n.the_geom)
>> WHERE n.fcode = 46003
>> GROUP BY r.gid
>>
>> -----Explain Analyze Output-----
>>
>> See here: http://explain.depesz.com/s/Esx
>>
>> HashAggregate (cost=509.88..512.02 rows=8 width=1195362) (actual time=2823674 .455..2823674 .458 rows=8 loops=1)
>> ->  Nested Loop (cost=0.00..70.36 rows=87904 width=1195362) (actual time=23.893..675119.593 rows=90268 loops=1)
>> Join Filter: _st_intersects(r.the_geom, n.the_geom)
>> ->  Seq Scan on psaregions r (cost=0.00..1.08 rows=8 width=1193798) (actual time=0.005..0.019 rows=8 loops=1)
>> ->  Index Scan using nhd100kstreams_the_geom_gist on nhd100kstreams n (cost=0.00..8.40 rows=1 width=1564) (actual time=0.043..214.728 rows=41631 loops=8)
>> Index Cond: (r.the_geom&&  n.the_geom)
>> Filter: (n.fcode = 46003)
>> Total runtime: 2823674.517 ms
>>
>> ***Notice actual rows in the first index scan is 41631, but the estimate is 1.
>>
>> -----Additional Info-----
>> - I've run vacuum analyze on the tables in question. That has not helped. (Autovacuum does run regularly also.)
>> - In terms of history, this query has always been slow.
>> - Various configuration settings:
>> shared_buffers: 3GB (Machine RAM: 12GB)
>> work_mem: 2GB
>> maintenance_work_mem: 1GB
>> effective_cache_size: 2GB (this was originally set to the default; we increased it to 2GB and restarted the server, but it didn't change the query plan.)
>> - Hardware info: Dell PowerEdge R710 running Ubuntu 10.04.2 LTS.
>>
>> -----Table and Index Schema-----
>>
>> Table "public.psaregions"
>> Column | Type | Modifiers
>> ------------+-----------------------+----------------------------------------------------------
>> gid | integer | not null default nextval('psaregions_gid_seq'::regclass)
>> psa_region | character varying(50) |
>> np_length | numeric |
>> xxx | numeric |
>> the_geom | geometry |
>> sm_geom | geometry |
>> miles46006 | numeric |
>> miles46003 | numeric |
>> llextent | character varying |
>> Indexes:
>> "psaregions_pkey" PRIMARY KEY, btree (gid)
>> "psaregions_region_idx" btree (psa_region)
>> "psaregions_sm_geom_gist" gist (sm_geom)
>> "psaregions_the_geom_gist" gist (the_geom)
>>
>>
>> Table "public.nhd100kstreams"
>> Column | Type | Modifiers
>> ------------+-----------------------+--------------------------------------------------------------
>> gid | integer | not null default nextval('nhd100kstreams_gid_seq'::regclass)
>> objectid | integer |
>> comid | integer |
>> fdate | date |
>> resolution | integer |
>> gnis_id | character varying(10) |
>> gnis_name | character varying(65) |
>> lengthkm | numeric |
>> reachcode | character varying(14) |
>> flowdir | integer |
>> wbareacomi | integer |
>> ftype | integer |
>> fcode | integer |
>> shape_leng | numeric |
>> enabled | smallint |
>> geo_geom | geometry |
>> sm_geom | geometry |
>> the_geom | geometry |
>> Indexes:
>> "nhd100kstreams_pkey" PRIMARY KEY, btree (gid)
>> "nhd100kstreams_fcode_idx" btree (fcode)
>> "nhd100kstreams_geo_geom_gist" gist (geo_geom)
>> "nhd100kstreams_sm_geom_gist" gist (sm_geom)
>> "nhd100kstreams_the_geom_gist" gist (the_geom)
>>
>>
>> -----Table Metadata: -----
>> psaregions: 8 records; very static, no regular updates/inserts
>> nhd100kstreams: 234954 records; 137,676 records with fcode = 46003; very static, no regular updates/inserts
>>
>>
>> Looking forward to hearing your feedback.
>>
>> Thank you kindly,
>> Shira Bezalel
>>
>>
>>
>> _______________________________________________
>> postgis-users mailing list
>> postgis-users at postgis.refractions.net
>> http://postgis.refractions.net/mailman/listinfo/postgis-users
>>
> _______________________________________________
> postgis-users mailing list
> postgis-users at postgis.refractions.net
> http://postgis.refractions.net/mailman/listinfo/postgis-users
>
>
> -----
> No virus found in this message.
> Checked by AVG - www.avg.com
> Version: 2012.0.1834 / Virus Database: 2092/4593 - Release Date: 11/03/11
>
>



More information about the postgis-users mailing list