[postgis-users] Delete duplicated records
Gregory Williamson
Gregory.Williamson at digitalglobe.com
Mon Jun 18 20:26:36 PDT 2007
Alvaro asked:
> Hello all.
>
> Hope somebody could help me. I have edited a huge amount of data into a
> table and suddenly I have realized that there is quite a lot of
> duplicated data (I mean duplicated entities: Linestrings) into de table.
> Does anybody could give a hint in how could I delete this duplicated
> geometries/records?.
>
> Thanks a lot.
>
If you have OIDs in the table you can use them to distinguish otherwise identical records.
If you don't have them, you might be able to use a variant of the code below, which I used recently to remove duplicate rows after someone managed to do a partial re-load ... in these cases only a small number of rows (~1-5%) were duplicates.
In this case, gid is a int key. The db in question was in light use so I put the core operations inside a transaction ... if you're the only user you might still want to do so as a safety belt, but consistancy issues with other users wouldn't be a problem.
HTH,
Greg Williamson
Senior DBA
Globexplorer, a DigitalGlobe company
-- sql for 8.1.x follows
create table foo as select g_id,count(g_id) as count from data_table group by 1 order by 2 desc,1;
delete from foo where count < 2;
select * from foo where count > 2; -- nothing we hope!
create table foowrk as select * from data_table where g_id in (select g_id from foo) order by g_id;
alter table foowrk add serid serial;
update foowrk set serid = nextval('foowrk_serid_seq');
delete from foowrk where (serid % 2) = 0;
create table data_table_bu as select * from data_table order by g_id;
alter table foowrk drop serid;
begin;
delete from data_table where g_id in (select g_id from foowrk);
insert into data_table (<your column list here>) select <your column list here> from foowrk;
reindex table data_table;
commit;
vacuum verbose analyze data_table;
drop table foo;
drop table foowrk;
-- drop the data_table_bu when all is well.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.osgeo.org/pipermail/postgis-users/attachments/20070618/4379d24e/attachment.html>
More information about the postgis-users
mailing list