[postgis-users] Delete duplicated records
ValiSystem
vali.system at free.fr
Tue Jun 19 02:00:43 PDT 2007
On 19 juin 07, at 03:58, El Cuy Volador wrote:
> Hello all.
>
> Hope somebody could help me. I have edited a huge amount of data
> into a
> table and suddenly I have realized that there is quite a lot of
> duplicated data (I mean duplicated entities: Linestrings) into de
> table.
> Does anybody could give a hint in how could I delete this duplicated
> geometries/records?.
>
> Thanks a lot.
>
>
> Alvaro
>
Alvaro,
I assume that you only have the geometry as duplicate value in rows,
so that each row as a different id. I had (as many people here i
suppose) this problem too, and i used this SQL query to find the
duplicate geometries :
SELECT a.id, b.id FROM buildings a, buildings b
WHERE a.geom ~= b.geom AND b.id != a.id AND a.id > b.id;
Since you use the spatial index, it does not take too long too run.
Then you can do the deletion :
DELETE FROM buildings WHERE id IN (SELECT b.id FROM buildings a,
buildings b
WHERE a.geom ~= b.geom AND a.id > b.id);
since you use a.id > b.id one occurrence of duplicate entries will
never appear in b.id, so that you just keep one row.
Run it on a test case, because i'm writing up from my memory, so
there may be a mistake :)
Finally, if you really have full duplicates, (all attributes are
equal) create a temporary table use distinct :
INSERT INTO temptable SELECT DISTINCT * FROM buildings;
Regards
More information about the postgis-users
mailing list