[postgis-users] [GENERAL] How to improve this query?

Jorge Arévalo jorge.arevalo at deimos-space.com
Tue Feb 8 03:05:54 PST 2011


Hi,

2011/2/8 Stefan Keller <sfkeller at gmail.com>:
> "  Hash Cond: (TABLE_C.STR_FIELD = TABLE_B.STR_FIELD)"
> tells me that these two longish strings are compared first.
> Did you already try to index TABLE_C.STR_FIELD and TABLE_B.STR_FIELD
> as hash instead btree?

I thought hash indexes were discouraged over btree. Anyway, I'll test it.

> Did you set autovacuum off during update?

Forgot it. I'll do it

> transaction level?

I don't understand this very well...

>
> Yours, S.
>

Many thanks!

-- 
Jorge Arévalo
Internet & Mobilty Division, DEIMOS
jorge.arevalo at deimos-space.com
http://es.linkedin.com/in/jorgearevalo80
http://mobility.grupodeimos.com/
http://www.twitter.com/jorgeas80
http://gis4free.wordpress.com
http://geohash.org/ezjqgrgzz0g



> 2011/2/7 Jorge Arévalo <jorge.arevalo at deimos-space.com>:
>> Hello,
>>
>> update TABLE_A set INT_FIELD = TABLE_B.INT_FIELD from TABLE_C, TABLE_B
>> where TABLE_B.STR_FIELD = TABLE_C.STR_FIELD and TABLE_A.INT_FIELD2 >=
>> TABLE_B.min and TABLE_A.INT_FIELD2 <= TABLE_B.max and
>> st_intersects(TABLE_A.wkb_geometry, TABLE_C.the_geom);
>>
>> Tables description:
>>
>> ++++++++++++ TABLE A
>>
>> CREATE TABLE TABLE_A
>> (
>>  ogc_fid serial NOT NULL,
>>  wkb_geometry geometry,
>>  INT_FIELD2 integer,
>>  INT_FIELD integer NOT NULL DEFAULT 0,
>>  CONSTRAINT TABLE_A_pk PRIMARY KEY (ogc_fid),
>>  CONSTRAINT enforce_dims_wkb_geometry CHECK (st_ndims(wkb_geometry) = 2),
>>  CONSTRAINT enforce_srid_wkb_geometry CHECK (st_srid(wkb_geometry) = 23030)
>> )
>> WITH (
>>  OIDS=FALSE
>> );
>>
>> CREATE INDEX TABLE_A_geom_idx
>>  ON TABLE_A
>>  USING gist
>>  (wkb_geometry);
>>
>> CREATE INDEX TABLE_A_INT_FIELD2
>>  ON TABLE_A
>>  USING btree
>>  (INT_FIELD2);
>>
>> +++++++++++++++++++
>>
>>
>> ++++++++++++ TABLE B
>>
>> CREATE TABLE TABLE_B
>> (
>>  STR_FIELD character(50) DEFAULT NULL::bpchar,
>>  min integer NOT NULL DEFAULT 0,
>>  max integer NOT NULL DEFAULT 0,
>>  INT_FIELD integer NOT NULL DEFAULT 0,
>>  oid integer NOT NULL DEFAULT 0,
>>  CONSTRAINT TABLE_B_pk PRIMARY KEY (oid)
>> )
>> WITH (
>>  OIDS=FALSE
>> );
>>
>> CREATE INDEX TABLE_B_idx
>>  ON TABLE_B
>>  USING btree
>>  (STR_FIELD);
>>
>> CREATE INDEX TABLE_B_max_idx
>>  ON TABLE_B
>>  USING btree
>>  (max);
>>
>> CREATE INDEX TABLE_B_min_idx
>>  ON TABLE_B
>>  USING btree
>>  (min);
>>
>> +++++++++++++++++++
>>
>>
>>
>> ++++++++++++ TABLE C
>>
>> CREATE TABLE TABLE_C
>> (
>>  the_geom geometry,
>>  STR_FIELD character(50)
>> )
>> WITH (
>>  OIDS=FALSE
>> );
>>
>> CREATE INDEX TABLE_C_index
>>  ON TABLE_C
>>  USING gist
>>  (the_geom);
>>
>> CREATE INDEX TABLE_C_string_idx
>>  ON TABLE_C
>>  USING btree
>>  (STR_FIELD);
>>
>> +++++++++++++++++++
>>
>>
>> Tables data:
>>
>> - TABLE_A: 896888 entries. The geometries are single polygons
>> (squares, actually), coordinates are floating point numbers
>> - TABLE_B: 88 entries.
>> - TABLE C: 69352 entries. Geometries are single polygons too, but much
>> bigger than the polygons from TABLE_A.
>>
>> As you can see in the query, I'm interested in the polygons of TABLE_A
>> that intersects the big polygons in TABLE_C.
>>
>> Query plan (explain <query> output):
>>
>> "Hash Join  (cost=3.98..986808.75 rows=209049 width=497)"
>> "  Hash Cond: (TABLE_C.STR_FIELD = TABLE_B.STR_FIELD)"
>> "  Join Filter: ((TABLE_A.INT_FIELD2 >= TABLE_B.min) AND
>> (TABLE_A.INT_FIELD2 <= TABLE_B.max))"
>> "  ->  Nested Loop  (cost=0.00..955055.47 rows=470360 width=543)"
>> "        Join Filter: _st_intersects(TABLE_A.wkb_geometry, TABLE_C.the_geom)"
>> "        ->  Index Scan using TABLE_C_string_idx on TABLE_C
>> (cost=0.00..8044.56 rows=69352 width=1517)"
>> "        ->  Index Scan using TABLE_A_geom_idx on TABLE_A
>> (cost=0.00..12.61 rows=4 width=493)"
>> "              Index Cond: (TABLE_A.wkb_geometry && TABLE_C.the_geom)"
>> "  ->  Hash  (cost=2.88..2.88 rows=88 width=63)"
>> "        ->  Seq Scan on TABLE_B  (cost=0.00..2.88 rows=88 width=63)"
>>
>>
>> With that information, how could I make the update faster? It takes
>> more than 24hours to finish.
>>
>>
>> Many thanks in advance
>>
>>
>> Best regards,
>>
>> --
>> Jorge Arévalo
>> Internet & Mobilty Division, DEIMOS
>> jorge.arevalo at deimos-space.com
>> http://es.linkedin.com/in/jorgearevalo80
>> http://mobility.grupodeimos.com/
>> http://www.twitter.com/jorgeas80
>> http://gis4free.wordpress.com
>> http://geohash.org/ezjqgrgzz0g
>>
>> --
>> Sent via pgsql-general mailing list (pgsql-general at postgresql.org)
>> To make changes to your subscription:
>> http://www.postgresql.org/mailpref/pgsql-general
>>
>



More information about the postgis-users mailing list