[postgis-users] Help: Performance of spatial query
Jeremy Palmer
jpalmer at paradise.net.nz
Mon Mar 15 01:03:51 PST 2004
Hi,
I have been trying to complete a query that tries to locate duplicate points
i.e. points that have been accidentally created twice or more These points
may not exactly be in the same geometric position. My basic methodology is
to complete a 0.2m proximity check for each geometry point in the database.
The datum of the geometries is in NZGD2000 (basically the same parameters as
WGS84).
To do this I was using the buffer function with the overlaps operator to
find points within 0.000002 of a degree (~ 0.2m) of each other. I was also
trying to stop the point reporting on itself by checking that NOD1.id <
NOD2.id. However as explain query shows below my method produces a Cartesian
product between the 2 aliases of the node table.
Can anyone help me with constructing a more efficient query? I have provided
the query explain and table details if that helps.
SELECT NOD1.id as nod_id1,
NOD2.id as nod_id2,
NOD1.shape as shape1,
NOD2.shape as shape2
FROM node NOD1,
node NOD2
WHERE NOD2.shape && buffer(NOD1.shape, 0.000002)
AND NOD1.id < NOD2.id;
Explain Query was
Nested Loop (cost=590312.88..5062636139085.67 rows=3509048593179
width=288)
Join Filter: (("inner".shape && buffer("outer".shape, 2e-06::double
precision)) AND ("outer".id < "inner".id))
-> Seq Scan on crs_node nod1 (cost=0.00..379897.88 rows=10260188
width=144)
-> Materialize (cost=590312.88..903329.76 rows=10260188 width=144)
-> Seq Scan on crs_node nod2 (cost=0.00..379897.88 rows=10260188
width=144)
Table "public. node"
Column | Type | Modifiers
-----------------+----------------------+-----------
id | integer | not null
cos_id_official | integer | not null
type | character varying(4) | not null
status | character varying(4) | not null
order_group_off | integer | not null
sit_id | integer |
alt_id | integer |
wrk_id_created | integer |
se_row_id | integer |
audit_id | integer | not null
shape | geometry |
Indexes:
"pkey_node" primary key, btree (id)
"nod_aud_id" unique, btree (audit_id)
"fk_nod_alt" btree (alt_id)
"fk_nod_cos" btree (cos_id_official)
"fk_nod_sit" btree (sit_id)
"fk_nod_wrk" btree (wrk_id_created)
"nod_shape_index" gist (shape)
Check constraints:
"$1" CHECK (srid(shape) = 104108)
"$2" CHECK (geometrytype(shape) = 'POINT'::text OR shape IS NULL)
Thank you very much
Jeremy Palmer
More information about the postgis-users
mailing list