Dear All,<br><br>I'm hoping someone can help me with some performance issues I'm having related to the use of an INDEX on my geometry column. <br><br>I'm using PostgreSQL 9.1.4 on x86_64-unknown-linux-gnu, compiled by gcc-4.4.real (Ubuntu 4.4.3-4ubuntu5.1) 4.4.3, 64-bit and PostGIS 1.5<br>
<br>My query is joining the areas table ( ~1000 polygons ) in this query looking for data points contained within 35 of them to the data table ( ~55 million data points in the table ) with this example query only returning 767 of them. <br>
<br>Query and its explain result working well<br><br>select * FROM data, area WHERE st_intersects ( data.point, area.polygon) AND <br>(readingdatetime BETWEEN '2009-01-01' AND '2012-01-19') AND datasetid IN(1,3) AND<br>
"polysetID"
= 1 AND <a href="http://area.id">area.id</a>
IN(28,29,30,31,32,33,25,26,27,18,19,20,21,12,13,14,15,16,17,34,35,1,2,3,4,5,6,22,23,24,7,8,9,10,11)
<br><br>Nested Loop (cost=0.00..791.38 rows=20 width=2244) (actual time=2149.383..26059.562 rows=767 loops=1)<br> Join Filter: _st_intersects(data.point, area.polygon)<br>
-> Index Scan using "area_polysetID_index" on area
(cost=0.00..16.56 rows=1 width=1947) (actual time=0.017..0.193 rows=35
loops=1)<br> Index Cond: ("polysetID" = 1)<br> Filter: (id
= ANY
('{28,29,30,31,32,33,25,26,27,18,19,20,21,12,13,14,15,16,17,34,35,1,2,3,4,5,6,22,23,24,7,8,9,10,11}'::integer[]))<br>
-> Index Scan using data_point_index on data (cost=0.00..774.57
rows=1 width=297) (actual time=588.910..744.472 rows=33 loops=35)<br> Index Cond: (point && area.polygon)<br>
Filter: ((readingdatetime >= '2009-01-01 00:00:00'::timestamp
without time zone) AND (readingdatetime <= '2012-01-19
00:00:00'::timestamp without time zone) AND (datasetid = ANY
('{1,3}'::integer[])))<br>Total runtime: 26059.677 ms<br><br>as you can see takes about 26 seconds<br><br>Sometimes the query will take a lot longer and get a result like this. Normally at this point I have to VACCUM ANALYSE the table before it will behave properly again<br>
<br>select * FROM data, area WHERE st_intersects ( data.point, area.polygon) AND <br>(readingdatetime BETWEEN '2009-01-01' AND '2012-01-19') AND datasetid IN(1,3) AND<br> "polysetID"
= 1 AND <a href="http://area.id">area.id</a>
IN(28,29,30,31,32,33,25,26,27,18,19,20,21,12,13,14,15,16,17,34,35,1,2,3,4,5,6,22,23,24,7,8,9,10,11)
<br><br>Nested Loop (cost=0.00..1153.60 rows=35 width=2246) (actual time=86835.883..803363.979 rows=767 loops=1)<br> Join Filter: _st_intersects(data.point, area.polygon)<br>
-> Index Scan using "area_polysetID_index" on area
(cost=0.00..20.04 rows=1 width=1949) (actual time=0.021..16.287 rows=35
loops=1)<br> Index Cond: ("polysetID" = 1)<br> Filter: (id =
ANY
('{28,29,30,31,32,33,25,26,27,18,19,20,21,12,13,14,15,16,17,34,35,1,2,3,4,5,6,22,23,24,7,8,9,10,11}'::integer[]))<br>
-> Index Scan using data_point_index on data (cost=0.00..1133.30
rows=1 width=297) (actual time=17202.126..22952.706 rows=33 loops=35)<br> Index Cond: (point && area.polygon)<br>
Filter: ((readingdatetime >= '2009-01-01 00:00:00'::timestamp
without time zone) AND (readingdatetime <= '2012-01-19
00:00:00'::timestamp without time zone) AND (datasetid = ANY
('{1,3}'::integer[])))<br>Total runtime: 803364.120 ms<br><br>Very very slow as you can see taking nearly 14 minutes<br><br>Exact copy of the table with the point INDEX remove. <br><br>select * FROM data_new2, area WHERE st_intersects ( data_new2.point, area.polygon) AND <br>
(readingdatetime BETWEEN '2009-01-01' AND '2012-01-19') AND datasetid IN(1,3) AND<br> "polysetID" = 1 AND <a href="http://area.id">area.id</a> IN(28,29,30,31,32,33,25,26,27,18,19,20,21,12,13,14,15,16,17,34,35,1,2,3,4,5,6,22,23,24,7,8,9,10,11) <br>
<br>Nested Loop (cost=2576.91..284972.54 rows=34 width=2246) (actual time=126.977..181.832 rows=767 loops=1)<br> Join Filter: ((data_new2.point && area.polygon) AND _st_intersects(data_new2.point, area.polygon))<br>
-> Index Scan using "area_polysetID_index" on area (cost=0.00..20.04 rows=1 width=1949) (actual time=0.059..0.111 rows=35 loops=1)<br> Index Cond: ("polysetID" = 1)<br> Filter: (id = ANY ('{28,29,30,31,32,33,25,26,27,18,19,20,21,12,13,14,15,16,17,34,35,1,2,3,4,5,6,22,23,24,7,8,9,10,11}'::integer[]))<br>
-> Bitmap Heap Scan on data_new2 (cost=2576.91..261072.36 rows=90972 width=297) (actual time=2.531..4.057 rows=2247 loops=35)<br> Recheck Cond: ((datasetid = ANY ('{1,3}'::integer[])) AND (readingdatetime >= '2009-01-01 00:00:00'::timestamp without time zone) AND (readingdatetime <= '2012-01-19 00:00:00'::timestamp without time zone))<br>
-> Bitmap Index Scan on "data_new2_datasetid_readingDatetime_index" (cost=0.00..2554.16 rows=90972 width=0) (actual time=2.089..2.089 rows=2247 loops=35)<br> Index Cond: ((datasetid = ANY ('{1,3}'::integer[])) AND (readingdatetime >= '2009-01-01 00:00:00'::timestamp without time zone) AND (readingdatetime <= '2012-01-19 00:00:00'::timestamp without time zone))<br>
Total runtime: 181.930 ms<br><br>Really fast in this case under a second.<br><br>As you can see removing the INDEX seems to make it a lot faster but my question is why? Also why does it sometimes slow down till I do a VACCUM I've not changed the database in anyway its only been READ from. <br>
<br>Would be really grateful for any input if any more information is required about the data or database structure please let me know. <br><br>Thanks, <br><br>Mark Davidson<br>