[postgis-users] Help: Performance of spatial query
Jeremy Palmer
jpalmer at paradise.net.nz
Mon Mar 15 01:03:51 PST 2004
Hi,
I have been trying to complete a query that tries to locate duplicate points
i.e. points that have been accidentally created twice or more – These points
may not exactly be in the same geometric position. My basic methodology is
to complete a 0.2m proximity check for each geometry point in the database.
The datum of the geometries is in NZGD2000 (basically the same parameters as
WGS84).
To do this I was using the buffer function with the overlaps operator to
find points within 0.000002 of a degree (~ 0.2m) of each other. I was also
trying to stop the point reporting on itself by checking that NOD1.id <
NOD2.id. However as explain query shows below my method produces a Cartesian
product between the 2 aliases of the node table.
Can anyone help me with constructing a more efficient query? I have provided
the query explain and table details if that helps.
SELECT NOD1.id as nod_id1,
NOD2.id as nod_id2,
NOD1.shape as shape1,
NOD2.shape as shape2
FROM node NOD1,
node NOD2
WHERE NOD2.shape && buffer(NOD1.shape, 0.000002)
AND NOD1.id < NOD2.id;
Explain Query was
Nested Loop (cost=590312.88..5062636139085.67 rows=3509048593179
width=288)
Join Filter: (("inner".shape && buffer("outer".shape, 2e-06::double
precision)) AND ("outer".id < "inner".id))
-> Seq Scan on crs_node nod1 (cost=0.00..379897.88 rows=10260188
width=144)
-> Materialize (cost=590312.88..903329.76 rows=10260188 width=144)
-> Seq Scan on crs_node nod2 (cost=0.00..379897.88 rows=10260188
width=144)
Table "public. node"
Column | Type | Modifiers
-----------------+----------------------+-----------
id | integer | not null
cos_id_official | integer | not null
type | character varying(4) | not null
status | character varying(4) | not null
order_group_off | integer | not null
sit_id | integer |
alt_id | integer |
wrk_id_created | integer |
se_row_id | integer |
audit_id | integer | not null
shape | geometry |
Indexes:
"pkey_node" primary key, btree (id)
"nod_aud_id" unique, btree (audit_id)
"fk_nod_alt" btree (alt_id)
"fk_nod_cos" btree (cos_id_official)
"fk_nod_sit" btree (sit_id)
"fk_nod_wrk" btree (wrk_id_created)
"nod_shape_index" gist (shape)
Check constraints:
"$1" CHECK (srid(shape) = 104108)
"$2" CHECK (geometrytype(shape) = 'POINT'::text OR shape IS NULL)
Thank you very much
Jeremy Palmer
More information about the postgis-users
mailing list