[postgis-users] [GENERAL] How to improve this query?

Mon Feb 7 15:19:24 PST 2011

"  Hash Cond: (TABLE_C.STR_FIELD = TABLE_B.STR_FIELD)"
tells me that these two longish strings are compared first.
Did you already try to index TABLE_C.STR_FIELD and TABLE_B.STR_FIELD
as hash instead btree?
Did you set autovacuum off during update? transaction level?

Yours, S.

2011/2/7 Jorge Arévalo <jorge.arevalo at deimos-space.com>:
> Hello,
>
> update TABLE_A set INT_FIELD = TABLE_B.INT_FIELD from TABLE_C, TABLE_B
> where TABLE_B.STR_FIELD = TABLE_C.STR_FIELD and TABLE_A.INT_FIELD2 >=
> TABLE_B.min and TABLE_A.INT_FIELD2 <= TABLE_B.max and
> st_intersects(TABLE_A.wkb_geometry, TABLE_C.the_geom);
>
> Tables description:
>
> ++++++++++++ TABLE A
>
> CREATE TABLE TABLE_A
> (
>  ogc_fid serial NOT NULL,
>  wkb_geometry geometry,
>  INT_FIELD2 integer,
>  INT_FIELD integer NOT NULL DEFAULT 0,
>  CONSTRAINT TABLE_A_pk PRIMARY KEY (ogc_fid),
>  CONSTRAINT enforce_dims_wkb_geometry CHECK (st_ndims(wkb_geometry) = 2),
>  CONSTRAINT enforce_srid_wkb_geometry CHECK (st_srid(wkb_geometry) = 23030)
> )
> WITH (
>  OIDS=FALSE
> );
>
> CREATE INDEX TABLE_A_geom_idx
>  ON TABLE_A
>  USING gist
>  (wkb_geometry);
>
> CREATE INDEX TABLE_A_INT_FIELD2
>  ON TABLE_A
>  USING btree
>  (INT_FIELD2);
>
> +++++++++++++++++++
>
>
> ++++++++++++ TABLE B
>
> CREATE TABLE TABLE_B
> (
>  STR_FIELD character(50) DEFAULT NULL::bpchar,
>  min integer NOT NULL DEFAULT 0,
>  max integer NOT NULL DEFAULT 0,
>  INT_FIELD integer NOT NULL DEFAULT 0,
>  oid integer NOT NULL DEFAULT 0,
>  CONSTRAINT TABLE_B_pk PRIMARY KEY (oid)
> )
> WITH (
>  OIDS=FALSE
> );
>
> CREATE INDEX TABLE_B_idx
>  ON TABLE_B
>  USING btree
>  (STR_FIELD);
>
> CREATE INDEX TABLE_B_max_idx
>  ON TABLE_B
>  USING btree
>  (max);
>
> CREATE INDEX TABLE_B_min_idx
>  ON TABLE_B
>  USING btree
>  (min);
>
> +++++++++++++++++++
>
>
>
> ++++++++++++ TABLE C
>
> CREATE TABLE TABLE_C
> (
>  the_geom geometry,
>  STR_FIELD character(50)
> )
> WITH (
>  OIDS=FALSE
> );
>
> CREATE INDEX TABLE_C_index
>  ON TABLE_C
>  USING gist
>  (the_geom);
>
> CREATE INDEX TABLE_C_string_idx
>  ON TABLE_C
>  USING btree
>  (STR_FIELD);
>
> +++++++++++++++++++
>
>
> Tables data:
>
> - TABLE_A: 896888 entries. The geometries are single polygons
> (squares, actually), coordinates are floating point numbers
> - TABLE_B: 88 entries.
> - TABLE C: 69352 entries. Geometries are single polygons too, but much
> bigger than the polygons from TABLE_A.
>
> As you can see in the query, I'm interested in the polygons of TABLE_A
> that intersects the big polygons in TABLE_C.
>
> Query plan (explain <query> output):
>
> "Hash Join  (cost=3.98..986808.75 rows=209049 width=497)"
> "  Hash Cond: (TABLE_C.STR_FIELD = TABLE_B.STR_FIELD)"
> "  Join Filter: ((TABLE_A.INT_FIELD2 >= TABLE_B.min) AND
> (TABLE_A.INT_FIELD2 <= TABLE_B.max))"
> "  ->  Nested Loop  (cost=0.00..955055.47 rows=470360 width=543)"
> "        Join Filter: _st_intersects(TABLE_A.wkb_geometry, TABLE_C.the_geom)"
> "        ->  Index Scan using TABLE_C_string_idx on TABLE_C
> (cost=0.00..8044.56 rows=69352 width=1517)"
> "        ->  Index Scan using TABLE_A_geom_idx on TABLE_A
> (cost=0.00..12.61 rows=4 width=493)"
> "              Index Cond: (TABLE_A.wkb_geometry && TABLE_C.the_geom)"
> "  ->  Hash  (cost=2.88..2.88 rows=88 width=63)"
> "        ->  Seq Scan on TABLE_B  (cost=0.00..2.88 rows=88 width=63)"
>
>
> With that information, how could I make the update faster? It takes
> more than 24hours to finish.
>
>
> Many thanks in advance
>
>
> Best regards,
>
> --
> Jorge Arévalo
> Internet & Mobilty Division, DEIMOS
> jorge.arevalo at deimos-space.com
> http://es.linkedin.com/in/jorgearevalo80
> http://mobility.grupodeimos.com/
> http://www.twitter.com/jorgeas80
> http://gis4free.wordpress.com
> http://geohash.org/ezjqgrgzz0g
>
> --
> Sent via pgsql-general mailing list (pgsql-general at postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-general
>