[postgis-users] How could I improve this query?
Jorge Arévalo
jorge.arevalo at deimos-space.com
Wed Nov 24 10:27:21 PST 2010
On Tue, Nov 23, 2010 at 7:26 PM, Kevin Neufeld <kneufeld at refractions.net> wrote:
> Interesting thought. Note though, Benjamin, that Jorge specified "a.min",
> not "b.min" in the filter clause.
> -- Kevin
>
>
> On 11/23/2010 10:07 AM, Benjamin Juhn wrote:
>
> You're current query is likely exploding your result set when table_a
> records intersect with more than one record in table_b. Assuming there is
> no overlap of min & max values in table_b the following query will stop
> testing for intersection between a & c records after find one so it should
> run faster:
> UPDATE table_a a SET integer_field = b.integer_field
> FROM table_b b
> WHERE a.integer_field2 <= b.max AND a.integer_field2 >= b.min
> AND EXISTS (SELECT true FROM table_c c WHERE st_intersects(a.geom,
> c.geom) LIMIT 1);
> also I'm assuming you meant to reference b.min here:
>>
>> table_a.integer_field2 >=
>> table_a.min
>
> On Nov 23, 2010, at 9:48 AM, David William Bitner wrote:
>
> Jorge,
> What does running "explain analyze" give you for that query?
> db
>
Hello,
The explain analyze output for one example query with that structure is:
Nested Loop (cost=2.58..3167.07 rows=378524 width=496) (actual
time=0.789..1098006.873 rows=1165412 loops=1)
Join Filter: ((table_a.integer_field2 >= table_b.min) AND
(table_a.integer_field2 <= table_b.max) AND
_st_intersects(table_a.geom, table_c.geom))
-> Hash Join (cost=2.58..9.59 rows=104 width=10227) (actual
time=0.066..0.235 rows=104 loops=1)
Hash Cond: (table_b.string_field = table_c.string_field)
-> Seq Scan on table_b (cost=0.00..2.88 rows=88 width=63)
(actual time=0.009..0.033 rows=88 loops=1)
-> Hash (cost=2.26..2.26 rows=26 width=10266) (actual
time=0.031..0.031 rows=26 loops=1)
-> Seq Scan on table_c (cost=0.00..2.26 rows=26
width=10266) (actual time=0.006..0.014 rows=26 loops=1)
-> Index Scan using geom_idx on table_a (cost=0.00..28.76 rows=6
width=492) (actual time=0.061..61.337 rows=49146 loops=104)
Index Cond: (table_a.geom && table_c.geom)
Total runtime: 1147446.858 ms
So, the query took 19min. And maybe I'm misunderstanding the analyze
output but seems that I need an index on string fields, to avoid
sequential scan. Am I right? What's the best index type for string
fields?
Benjamin, I'm going to test your query. Many thanks. And yes, it was
b.min, instead of a.min. A typo.
--
Jorge Arévalo
Internet & Mobilty Division, DEIMOS
jorge.arevalo at deimos-space.com
http://mobility.grupodeimos.com/
http://gis4free.wordpress.com
More information about the postgis-users
mailing list