<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN">
<HTML><HEAD><TITLE>Message</TITLE>
<META http-equiv=Content-Type content="text/html; charset=us-ascii">
<META content="MSHTML 6.00.6000.16735" name=GENERATOR></HEAD>
<BODY>
<DIV><SPAN class=578214619-17112008><FONT face=Arial color=#0000ff size=2>The
girth of my experience is in MSSQL where you can do "DELETE table1 from mytable
table1 join mytable table2...." this syntax pukes in PG as it probably should.
So I took the shortcut. Thanks for the tip.</FONT></SPAN><SPAN
class=578214619-17112008></SPAN><SPAN class=578214619-17112008></SPAN><SPAN
class=578214619-17112008><FONT face=Arial color=#0000ff
size=2></FONT></SPAN></DIV>
<BLOCKQUOTE dir=ltr
style="PADDING-LEFT: 5px; MARGIN-LEFT: 5px; BORDER-LEFT: #0000ff 2px solid; MARGIN-RIGHT: 0px">
<DIV></DIV>
<DIV class=OutlookMessageHeader lang=en-us dir=ltr align=left><FONT
face=Tahoma size=2>-----Original Message-----<BR><B>From:</B>
postgis-users-bounces@postgis.refractions.net
[mailto:postgis-users-bounces@postgis.refractions.net] <B>On Behalf Of
</B>Obe, Regina<BR><B>Sent:</B> Monday, November 17, 2008 9:21
AM<BR><B>To:</B> PostGIS Users Discussion<BR><B>Subject:</B> RE:
[postgis-users] Newbie question - remove duplicate/identicalfeature in
postgis<BR><BR></FONT></DIV>
<DIV dir=ltr align=left><SPAN class=895400517-17112008><FONT face=Arial
color=#0000ff size=2>Stan,</FONT></SPAN></DIV>
<DIV dir=ltr align=left><SPAN class=895400517-17112008><FONT face=Arial
color=#0000ff size=2></FONT></SPAN> </DIV>
<DIV dir=ltr align=left><SPAN class=895400517-17112008><FONT face=Arial
color=#0000ff size=2>This looks good. I would add a couple of
suggestions</FONT></SPAN></DIV>
<DIV dir=ltr align=left><SPAN class=895400517-17112008><FONT face=Arial
color=#0000ff size=2>1) ST_Equals doesn't include an && check. I
know it seems like an oversight to me and I think Kevin even mentioned it and
possibly fixed it. Might be changed in 1.4 (and
maybe 1.3.4)</FONT></SPAN></DIV>
<DIV dir=ltr align=left><SPAN class=895400517-17112008><FONT face=Arial
color=#0000ff size=2>2) With the self-join you have you lose the penalty
of of the correlated subquery, but sometimes a correlated subquery is faster
and I think this might be one of those cases especially if your subquery
</FONT></SPAN></DIV>
<DIV dir=ltr align=left><SPAN class=895400517-17112008><FONT face=Arial
color=#0000ff size=2>will return few records per record.</FONT></SPAN></DIV>
<DIV dir=ltr align=left><SPAN class=895400517-17112008><FONT face=Arial
color=#0000ff size=2></FONT></SPAN> </DIV>
<DIV dir=ltr align=left><SPAN class=895400517-17112008><FONT face=Arial
color=#0000ff size=2>I think you can simply do this</FONT></SPAN></DIV>
<DIV dir=ltr align=left><SPAN class=895400517-17112008><FONT face=Arial
color=#0000ff size=2></FONT></SPAN> </DIV>
<DIV dir=ltr align=left><SPAN class=895400517-17112008><FONT face=Arial
color=#0000ff size=2></FONT></SPAN> </DIV>
<DIV dir=ltr align=left><SPAN class=895400517-17112008>
<DIV><SPAN class=765451616-17112008><FONT face=Arial color=#0000ff
size=2><SPAN class=895400517-17112008>DELETE FROM
mytable</SPAN></FONT></SPAN></DIV>
<DIV><FONT face=Arial><FONT color=#0000ff><FONT size=2><SPAN
class=765451616-17112008><SPAN
class=895400517-17112008>mytable.</SPAN>myidcolumn <SPAN
class=895400517-17112008><</SPAN></SPAN><SPAN
class=765451616-17112008> </SPAN></FONT></FONT></FONT></DIV>
<DIV><FONT face=Arial><FONT color=#0000ff><FONT size=2><SPAN
class=765451616-17112008><SPAN class=895400517-17112008>(</SPAN><SPAN
class=895400517-17112008>SELECT </SPAN> <SPAN
class=895400517-17112008>MAX(</SPAN><SPAN
class=895400517-17112008>m</SPAN>t1.myidcolumn <SPAN
class=895400517-17112008>)</SPAN></SPAN></FONT></FONT></FONT></DIV>
<DIV><SPAN class=765451616-17112008><FONT face=Arial color=#0000ff
size=2> <SPAN class=895400517-17112008>FROM</SPAN>
mytable mt1</FONT></SPAN><SPAN class=765451616-17112008><FONT face=Arial
color=#0000ff size=2> </FONT></SPAN></DIV>
<DIV><FONT face=Arial><FONT color=#0000ff><FONT size=2><SPAN
class=765451616-17112008><SPAN
class=895400517-17112008> </SPAN><SPAN
class=895400517-17112008>WHERE</SPAN> <SPAN
class=895400517-17112008>mt1.the_geom &&
mytable.the_geom </SPAN></SPAN></FONT></FONT></FONT></DIV>
<DIV><FONT face=Arial><FONT color=#0000ff><FONT size=2><SPAN
class=765451616-17112008><SPAN
class=895400517-17112008> AND
</SPAN><SPAN class=895400517-17112008>ST_Equals(mytable.the_geom,
mt1.the_geom)</SPAN></SPAN></FONT></FONT></FONT><SPAN
class=765451616-17112008><FONT face=Arial color=#0000ff
size=2>)</FONT></SPAN></DIV></SPAN></DIV>
<DIV dir=ltr align=left><SPAN class=895400517-17112008><FONT face=Arial
color=#0000ff size=2></FONT></SPAN> </DIV>
<DIV dir=ltr align=left><SPAN class=895400517-17112008><FONT face=Arial
color=#0000ff size=2></FONT></SPAN> </DIV>
<DIV dir=ltr align=left><SPAN class=895400517-17112008><FONT face=Arial
color=#0000ff size=2>Also ST_OrderingEquals might be faster than ST_Equals
since I think ST_Equals does more intensive testing and also does bizarre
things with invalid geometries. ST_OrderingEquals is more like a binary
check, though not an absolute. I guess it depends on how equal you are
looking for.</FONT></SPAN></DIV>
<DIV dir=ltr align=left><SPAN class=895400517-17112008><FONT face=Arial
color=#0000ff size=2></FONT></SPAN> </DIV>
<DIV dir=ltr align=left><SPAN class=895400517-17112008><FONT face=Arial
color=#0000ff size=2>Hope that helps,</FONT></SPAN></DIV>
<DIV dir=ltr align=left><SPAN class=895400517-17112008><FONT face=Arial
color=#0000ff size=2>Regina</FONT></SPAN></DIV><FONT face=Arial color=#0000ff
size=2></FONT><BR>
<DIV class=OutlookMessageHeader lang=en-us dir=ltr align=left>
<HR tabIndex=-1>
<FONT face=Tahoma size=2><B>From:</B>
postgis-users-bounces@postgis.refractions.net
[mailto:postgis-users-bounces@postgis.refractions.net] <B>On Behalf Of
</B>Sufficool, Stanley<BR><B>Sent:</B> Monday, November 17, 2008 11:29
AM<BR><B>To:</B> PostGIS Users Discussion<BR><B>Subject:</B> RE:
[postgis-users] Newbie question - remove duplicate /identicalfeature in
postgis<BR></FONT><BR></DIV>
<DIV></DIV>
<DIV><SPAN class=765451616-17112008><FONT face=Arial color=#0000ff
size=2>Sound like this was never definitively answered, so here's my 2
cents.</FONT></SPAN></DIV>
<DIV><SPAN class=765451616-17112008><FONT face=Arial color=#0000ff
size=2></FONT></SPAN> </DIV>
<DIV><SPAN class=765451616-17112008><FONT face=Arial color=#0000ff size=2>1)
Create a unique field on your table AFTER importing with
shp2pgsql.</FONT></SPAN></DIV>
<DIV><SPAN class=765451616-17112008><FONT face=Arial color=#0000ff
size=2></FONT></SPAN> </DIV>
<DIV><SPAN class=765451616-17112008><FONT face=Arial color=#0000ff
size=2>alter table mytable add myidcolumn serial</FONT></SPAN></DIV>
<DIV><SPAN class=765451616-17112008><FONT face=Arial color=#0000ff
size=2></FONT></SPAN><SPAN class=765451616-17112008><FONT face=Arial
color=#0000ff size=2></FONT></SPAN> </DIV>
<DIV><SPAN class=765451616-17112008><FONT face=Arial color=#0000ff
size=2><SPAN class=765451616-17112008><FONT face=Arial color=#0000ff size=2>2)
</FONT></SPAN>Then use the st_equals function to get rid of duplicates with
a lower serial.</FONT></SPAN></DIV>
<DIV><SPAN class=765451616-17112008><FONT face=Arial color=#0000ff
size=2></FONT></SPAN> </DIV>
<DIV><SPAN class=765451616-17112008><FONT face=Arial color=#0000ff
size=2>delete from mytable </FONT></SPAN></DIV>
<DIV><SPAN class=765451616-17112008><FONT face=Arial color=#0000ff
size=2>where myidcolumn IN (</FONT></SPAN></DIV>
<DIV><SPAN class=765451616-17112008><FONT face=Arial color=#0000ff
size=2> select t1.myidcolumn </FONT></SPAN></DIV>
<DIV><SPAN class=765451616-17112008><FONT face=Arial color=#0000ff
size=2> from mytable mt1, mytable mt2</FONT></SPAN></DIV>
<DIV><SPAN class=765451616-17112008><FONT face=Arial color=#0000ff
size=2> where st_equals(mt1.the_geom,
mt2.the_geom)</FONT></SPAN></DIV>
<DIV><SPAN class=765451616-17112008><FONT face=Arial color=#0000ff
size=2> and mt1.myidcolumn <
mt2.myidcolumn</FONT></SPAN></DIV>
<DIV><SPAN class=765451616-17112008><FONT face=Arial color=#0000ff
size=2>)</FONT></SPAN></DIV>
<DIV><SPAN class=765451616-17112008><FONT face=Arial color=#0000ff
size=2></FONT></SPAN> </DIV>
<DIV><SPAN class=765451616-17112008><FONT face=Arial color=#0000ff size=2>Make
sure you have an index on the geom column, or this may take a long
time.</FONT></SPAN></DIV>
<BLOCKQUOTE
style="PADDING-LEFT: 5px; MARGIN-LEFT: 5px; BORDER-LEFT: #0000ff 2px solid; MARGIN-RIGHT: 0px">
<DIV></DIV>
<DIV class=OutlookMessageHeader lang=en-us dir=ltr align=left><FONT
face=Tahoma size=2>-----Original Message-----<BR><B>From:</B>
postgis-users-bounces@postgis.refractions.net
[mailto:postgis-users-bounces@postgis.refractions.net] <B>On Behalf Of
</B>Intengu Technologies<BR><B>Sent:</B> Friday, November 14, 2008 8:00
AM<BR><B>To:</B> PostGIS Users Discussion<BR><B>Subject:</B> Re:
[postgis-users] Newbie question - remove duplicate / identicalfeature in
postgis<BR><BR></FONT></DIV>Dear Kevin & Jean<BR><BR>Thanks for the
assistance.<BR><BR>
<DIV class=gmail_quote>2008/11/13 Kevin Neufeld <SPAN dir=ltr><<A
href="mailto:kneufeld@refractions.net"
target=_blank>kneufeld@refractions.net</A>></SPAN><BR>
<BLOCKQUOTE class=gmail_quote
style="PADDING-LEFT: 1ex; MARGIN: 0pt 0pt 0pt 0.8ex; BORDER-LEFT: rgb(204,204,204) 1px solid">Unfortunately
Jean, this won't work. Your unique constraint on the geometry column
is using the geometry equals operator (=), which compares the bounding
boxes of the geometries, *not* the geometries
themselves.<BR><BR>I.E.<BR>postgis=# select 'LINESTRING(0 0, 1
1)'::geometry = 'LINESTRING(1 1, 0
0)'::geometry;<BR> ?column?<BR>----------<BR> t<BR>(1
row)<BR><BR><BR>postgis=# create temp table tmp (geom geometry);<BR>CREATE
TABLE<BR><BR>postgis=# alter table tmp add constraint unique_geom unique
(geom);<BR>NOTICE: ALTER TABLE / ADD UNIQUE will create implicit
index "unique_geom" for table "tmp"<BR>ALTER TABLE<BR><BR>postgis=# \d
tmp<BR> Table "pg_temp_2.tmp"<BR> Column | Type
| Modifiers<BR>--------+----------+-----------<BR> geom
| geometry |<BR>Indexes:<BR> "unique_geom" UNIQUE, btree
(geom)<BR><BR>postgis=# insert into tmp values ('LINESTRING(0 0, 1
1)'::geometry);<BR>INSERT 0 1<BR><BR>postgis=# insert into tmp values
('LINESTRING(1 1, 0 0)'::geometry);<BR>ERROR: duplicate key value
violates unique constraint "unique_geom"<BR><BR><BR>Sindile ,<BR>An
alternative way would be to load the data into PostGIS using the shape
dump as usual. If you have a distinct key (other than the gid which
is added by shp2pgsql) on your features, you could create a new table,
selecting from your postgis table using DISTINCT ON
(my_distinct_feature_key).<BR><A
href="http://www.postgresql.org/docs/8.3/static/sql-select.html#SQL-DISTINCT"
target=_blank>http://www.postgresql.org/docs/8.3/static/sql-select.html#SQL-DISTINCT</A><BR><BR>Or,
you could try:<BR>CREATE TABLE new_distinct_table AS<BR>SELECT att1, att2,
geom::geometry<BR>FROM (<BR> SELECT DISTINCT att1, att2, encode(geom,
'hex') as geom<BR> FROM my_polygon_table ) AS foo<BR><BR>Or, use
Jean's method, but instead of the unique constraint on the geom, use a
unique functional index using the geometry's hex string.<BR><BR>postgis=#
CREATE UNIQUE INDEX tmp_geom_unq ON tmp (encode(geom, 'hex'));<BR>CREATE
INDEX<BR><BR>postgis=# insert into tmp values ('LINESTRING(0 0, 1
1)'::geometry);<BR>INSERT 0 1<BR><BR>postgis=# insert into tmp values
('LINESTRING(1 1, 0 0)'::geometry);<BR>INSERT 0 1<BR><BR>postgis=# insert
into tmp values ('LINESTRING(1 1, 0 0)'::geometry);<BR>ERROR:
duplicate key value violates unique constraint
"tmp_geom_unq"<BR><BR>postgis=# \d tmp<BR> Table
"pg_temp_2.tmp"<BR> Column | Type |
Modifiers<BR>--------+----------+-----------<BR> geom |
geometry |<BR>Indexes:<BR> "tmp_geom_unq" UNIQUE, btree
(encode(geom::bytea, 'hex'::text))<BR><BR><BR>Remember though, that this
is *not* a spatial geometry index, just a unique index on the hex
string.<BR><BR>Hope that helps,<BR>Cheers,<BR><FONT
color=#888888>Kevin</FONT>
<DIV>
<DIV></DIV>
<DIV><BR><BR><BR><BR>Jean David TECHER wrote:<BR>
<BLOCKQUOTE class=gmail_quote
style="PADDING-LEFT: 1ex; MARGIN: 0pt 0pt 0pt 0.8ex; BORDER-LEFT: rgb(204,204,204) 1px solid">Example:<BR><BR>dbname
= routing_db<BR><BR>table =wr<BR><BR>1. Load only the table structure
not the datas<BR><BR>use -p option taken from shp2pgsql<BR><BR>shp2pgsql
-p wr.shp wr|psql routing_db<BR><BR>2. Add a constraint on the_geom
column<BR><BR>psql routing_db<BR><BR>ALTER TABLE wr ADD CONSTRAINT
check_unique_the_geom unique(the_geom);<BR><BR>3. Load your datas
without 'BEGIN' and 'END'<BR><BR>* Here use the -a option taken from
shp2pgsql<BR>* use -I for index creation<BR><BR>The idea is to use the
INSERT command in order to rollbacking if the<BR>current feature is
already in the table<BR><BR>N.B: don't use -D option!<BR><BR>shp2pgsql
-aI wr.shp wr|grep -v ^BEGIN|grep -v ^END|psql
routing_db<BR><BR><BR>Quoting Intengu Technologies <<A
href="mailto:sindile.bidla@gmail.com"
target=_blank>sindile.bidla@gmail.com</A>>:<BR><BR>
<BLOCKQUOTE class=gmail_quote
style="PADDING-LEFT: 1ex; MARGIN: 0pt 0pt 0pt 0.8ex; BORDER-LEFT: rgb(204,204,204) 1px solid">I
have a polygon shapefile that has duplicate features which I want to
load<BR>into Postgis. How can I ensure that each feature has
only one record.<BR><BR>-- <BR>Sindile
Bidla<BR><BR></BLOCKQUOTE><BR><BR><BR>===================<BR>Jean David
TECHER<BR><A href="http://www.davidgis.fr"
target=_blank>www.davidgis.fr</A><BR>06 60 46 85 05<BR>04 99 77 16
87<BR>===================<BR><BR>_______________________________________________<BR>postgis-users
mailing list<BR><A href="mailto:postgis-users@postgis.refractions.net"
target=_blank>postgis-users@postgis.refractions.net</A><BR><A
href="http://postgis.refractions.net/mailman/listinfo/postgis-users"
target=_blank>http://postgis.refractions.net/mailman/listinfo/postgis-users</A><BR></BLOCKQUOTE>_______________________________________________<BR>postgis-users
mailing list<BR><A href="mailto:postgis-users@postgis.refractions.net"
target=_blank>postgis-users@postgis.refractions.net</A><BR><A
href="http://postgis.refractions.net/mailman/listinfo/postgis-users"
target=_blank>http://postgis.refractions.net/mailman/listinfo/postgis-users</A><BR></DIV></DIV></BLOCKQUOTE></DIV><BR><BR
clear=all><BR>-- <BR>Sindile Bidla<BR></BLOCKQUOTE>
<P>
<HR SIZE=1>
<P></P>
<P><STRONG>The substance of this message, including any attachments, may be
confidential, legally privileged and/or exempt from disclosure pursuant to
Massachusetts law. It is intended solely for the addressee. If you received
this in error, please contact the sender and delete the material from any
computer. </STRONG></P>
<P>
<HR SIZE=1>
<P></P>
<P><STRONG><FONT color=#339900 size=2>Help make the earth a greener place. If
at all possible resist printing this email and join us in saving paper. </P>
<P></FONT></STRONG></P></BLOCKQUOTE></BODY></HTML>