<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN">
<HTML><HEAD><TITLE>Message</TITLE>
<META http-equiv=Content-Type content="text/html; charset=us-ascii">
<META content="MSHTML 6.00.6000.16735" name=GENERATOR></HEAD>
<BODY>
<DIV dir=ltr align=left><SPAN class=895400517-17112008><FONT face=Arial
color=#0000ff size=2>Stan,</FONT></SPAN></DIV>
<DIV dir=ltr align=left><SPAN class=895400517-17112008><FONT face=Arial
color=#0000ff size=2></FONT></SPAN> </DIV>
<DIV dir=ltr align=left><SPAN class=895400517-17112008><FONT face=Arial
color=#0000ff size=2>This looks good. I would add a couple of
suggestions</FONT></SPAN></DIV>
<DIV dir=ltr align=left><SPAN class=895400517-17112008><FONT face=Arial
color=#0000ff size=2>1) ST_Equals doesn't include an && check. I
know it seems like an oversight to me and I think Kevin even mentioned it and
possibly fixed it. Might be changed in 1.4 (and
maybe 1.3.4)</FONT></SPAN></DIV>
<DIV dir=ltr align=left><SPAN class=895400517-17112008><FONT face=Arial
color=#0000ff size=2>2) With the self-join you have you lose the penalty
of of the correlated subquery, but sometimes a correlated subquery is faster and
I think this might be one of those cases especially if your subquery
</FONT></SPAN></DIV>
<DIV dir=ltr align=left><SPAN class=895400517-17112008><FONT face=Arial
color=#0000ff size=2>will return few records per record.</FONT></SPAN></DIV>
<DIV dir=ltr align=left><SPAN class=895400517-17112008><FONT face=Arial
color=#0000ff size=2></FONT></SPAN> </DIV>
<DIV dir=ltr align=left><SPAN class=895400517-17112008><FONT face=Arial
color=#0000ff size=2>I think you can simply do this</FONT></SPAN></DIV>
<DIV dir=ltr align=left><SPAN class=895400517-17112008><FONT face=Arial
color=#0000ff size=2></FONT></SPAN> </DIV>
<DIV dir=ltr align=left><SPAN class=895400517-17112008><FONT face=Arial
color=#0000ff size=2></FONT></SPAN> </DIV>
<DIV dir=ltr align=left><SPAN class=895400517-17112008>
<DIV><SPAN class=765451616-17112008><FONT face=Arial color=#0000ff size=2><SPAN
class=895400517-17112008>DELETE FROM mytable</SPAN></FONT></SPAN></DIV>
<DIV><FONT face=Arial><FONT color=#0000ff><FONT size=2><SPAN
class=765451616-17112008><SPAN
class=895400517-17112008>mytable.</SPAN>myidcolumn <SPAN
class=895400517-17112008><</SPAN></SPAN><SPAN
class=765451616-17112008> </SPAN></FONT></FONT></FONT></DIV>
<DIV><FONT face=Arial><FONT color=#0000ff><FONT size=2><SPAN
class=765451616-17112008><SPAN class=895400517-17112008>(</SPAN><SPAN
class=895400517-17112008>SELECT </SPAN> <SPAN
class=895400517-17112008>MAX(</SPAN><SPAN
class=895400517-17112008>m</SPAN>t1.myidcolumn <SPAN
class=895400517-17112008>)</SPAN></SPAN></FONT></FONT></FONT></DIV>
<DIV><SPAN class=765451616-17112008><FONT face=Arial color=#0000ff
size=2> <SPAN class=895400517-17112008>FROM</SPAN>
mytable mt1</FONT></SPAN><SPAN class=765451616-17112008><FONT face=Arial
color=#0000ff size=2> </FONT></SPAN></DIV>
<DIV><FONT face=Arial><FONT color=#0000ff><FONT size=2><SPAN
class=765451616-17112008><SPAN
class=895400517-17112008> </SPAN><SPAN
class=895400517-17112008>WHERE</SPAN> <SPAN
class=895400517-17112008>mt1.the_geom &&
mytable.the_geom </SPAN></SPAN></FONT></FONT></FONT></DIV>
<DIV><FONT face=Arial><FONT color=#0000ff><FONT size=2><SPAN
class=765451616-17112008><SPAN
class=895400517-17112008> AND
</SPAN><SPAN class=895400517-17112008>ST_Equals(mytable.the_geom,
mt1.the_geom)</SPAN></SPAN></FONT></FONT></FONT><SPAN
class=765451616-17112008><FONT face=Arial color=#0000ff
size=2>)</FONT></SPAN></DIV></SPAN></DIV>
<DIV dir=ltr align=left><SPAN class=895400517-17112008><FONT face=Arial
color=#0000ff size=2></FONT></SPAN> </DIV>
<DIV dir=ltr align=left><SPAN class=895400517-17112008><FONT face=Arial
color=#0000ff size=2></FONT></SPAN> </DIV>
<DIV dir=ltr align=left><SPAN class=895400517-17112008><FONT face=Arial
color=#0000ff size=2>Also ST_OrderingEquals might be faster than ST_Equals since
I think ST_Equals does more intensive testing and also does bizarre things with
invalid geometries. ST_OrderingEquals is more like a binary check, though
not an absolute. I guess it depends on how equal you are looking
for.</FONT></SPAN></DIV>
<DIV dir=ltr align=left><SPAN class=895400517-17112008><FONT face=Arial
color=#0000ff size=2></FONT></SPAN> </DIV>
<DIV dir=ltr align=left><SPAN class=895400517-17112008><FONT face=Arial
color=#0000ff size=2>Hope that helps,</FONT></SPAN></DIV>
<DIV dir=ltr align=left><SPAN class=895400517-17112008><FONT face=Arial
color=#0000ff size=2>Regina</FONT></SPAN></DIV><FONT face=Arial color=#0000ff
size=2></FONT><BR>
<DIV class=OutlookMessageHeader lang=en-us dir=ltr align=left>
<HR tabIndex=-1>
<FONT face=Tahoma size=2><B>From:</B>
postgis-users-bounces@postgis.refractions.net
[mailto:postgis-users-bounces@postgis.refractions.net] <B>On Behalf Of
</B>Sufficool, Stanley<BR><B>Sent:</B> Monday, November 17, 2008 11:29
AM<BR><B>To:</B> PostGIS Users Discussion<BR><B>Subject:</B> RE: [postgis-users]
Newbie question - remove duplicate /identicalfeature in
postgis<BR></FONT><BR></DIV>
<DIV></DIV>
<DIV><SPAN class=765451616-17112008><FONT face=Arial color=#0000ff size=2>Sound
like this was never definitively answered, so here's my 2
cents.</FONT></SPAN></DIV>
<DIV><SPAN class=765451616-17112008><FONT face=Arial color=#0000ff
size=2></FONT></SPAN> </DIV>
<DIV><SPAN class=765451616-17112008><FONT face=Arial color=#0000ff size=2>1)
Create a unique field on your table AFTER importing with
shp2pgsql.</FONT></SPAN></DIV>
<DIV><SPAN class=765451616-17112008><FONT face=Arial color=#0000ff
size=2></FONT></SPAN> </DIV>
<DIV><SPAN class=765451616-17112008><FONT face=Arial color=#0000ff size=2>alter
table mytable add myidcolumn serial</FONT></SPAN></DIV>
<DIV><SPAN class=765451616-17112008><FONT face=Arial color=#0000ff
size=2></FONT></SPAN><SPAN class=765451616-17112008><FONT face=Arial
color=#0000ff size=2></FONT></SPAN> </DIV>
<DIV><SPAN class=765451616-17112008><FONT face=Arial color=#0000ff size=2><SPAN
class=765451616-17112008><FONT face=Arial color=#0000ff size=2>2)
</FONT></SPAN>Then use the st_equals function to get rid of duplicates with
a lower serial.</FONT></SPAN></DIV>
<DIV><SPAN class=765451616-17112008><FONT face=Arial color=#0000ff
size=2></FONT></SPAN> </DIV>
<DIV><SPAN class=765451616-17112008><FONT face=Arial color=#0000ff size=2>delete
from mytable </FONT></SPAN></DIV>
<DIV><SPAN class=765451616-17112008><FONT face=Arial color=#0000ff size=2>where
myidcolumn IN (</FONT></SPAN></DIV>
<DIV><SPAN class=765451616-17112008><FONT face=Arial color=#0000ff
size=2> select t1.myidcolumn </FONT></SPAN></DIV>
<DIV><SPAN class=765451616-17112008><FONT face=Arial color=#0000ff
size=2> from mytable mt1, mytable mt2</FONT></SPAN></DIV>
<DIV><SPAN class=765451616-17112008><FONT face=Arial color=#0000ff
size=2> where st_equals(mt1.the_geom,
mt2.the_geom)</FONT></SPAN></DIV>
<DIV><SPAN class=765451616-17112008><FONT face=Arial color=#0000ff
size=2> and mt1.myidcolumn <
mt2.myidcolumn</FONT></SPAN></DIV>
<DIV><SPAN class=765451616-17112008><FONT face=Arial color=#0000ff
size=2>)</FONT></SPAN></DIV>
<DIV><SPAN class=765451616-17112008><FONT face=Arial color=#0000ff
size=2></FONT></SPAN> </DIV>
<DIV><SPAN class=765451616-17112008><FONT face=Arial color=#0000ff size=2>Make
sure you have an index on the geom column, or this may take a long
time.</FONT></SPAN></DIV>
<BLOCKQUOTE
style="PADDING-LEFT: 5px; MARGIN-LEFT: 5px; BORDER-LEFT: #0000ff 2px solid; MARGIN-RIGHT: 0px">
<DIV></DIV>
<DIV class=OutlookMessageHeader lang=en-us dir=ltr align=left><FONT
face=Tahoma size=2>-----Original Message-----<BR><B>From:</B>
postgis-users-bounces@postgis.refractions.net
[mailto:postgis-users-bounces@postgis.refractions.net] <B>On Behalf Of
</B>Intengu Technologies<BR><B>Sent:</B> Friday, November 14, 2008 8:00
AM<BR><B>To:</B> PostGIS Users Discussion<BR><B>Subject:</B> Re:
[postgis-users] Newbie question - remove duplicate / identicalfeature in
postgis<BR><BR></FONT></DIV>Dear Kevin & Jean<BR><BR>Thanks for the
assistance.<BR><BR>
<DIV class=gmail_quote>2008/11/13 Kevin Neufeld <SPAN dir=ltr><<A
href="mailto:kneufeld@refractions.net"
target=_blank>kneufeld@refractions.net</A>></SPAN><BR>
<BLOCKQUOTE class=gmail_quote
style="PADDING-LEFT: 1ex; MARGIN: 0pt 0pt 0pt 0.8ex; BORDER-LEFT: rgb(204,204,204) 1px solid">Unfortunately
Jean, this won't work. Your unique constraint on the geometry column
is using the geometry equals operator (=), which compares the bounding boxes
of the geometries, *not* the geometries themselves.<BR><BR>I.E.<BR>postgis=#
select 'LINESTRING(0 0, 1 1)'::geometry = 'LINESTRING(1 1, 0
0)'::geometry;<BR> ?column?<BR>----------<BR> t<BR>(1
row)<BR><BR><BR>postgis=# create temp table tmp (geom geometry);<BR>CREATE
TABLE<BR><BR>postgis=# alter table tmp add constraint unique_geom unique
(geom);<BR>NOTICE: ALTER TABLE / ADD UNIQUE will create implicit index
"unique_geom" for table "tmp"<BR>ALTER TABLE<BR><BR>postgis=# \d
tmp<BR> Table "pg_temp_2.tmp"<BR> Column | Type
| Modifiers<BR>--------+----------+-----------<BR> geom |
geometry |<BR>Indexes:<BR> "unique_geom" UNIQUE, btree
(geom)<BR><BR>postgis=# insert into tmp values ('LINESTRING(0 0, 1
1)'::geometry);<BR>INSERT 0 1<BR><BR>postgis=# insert into tmp values
('LINESTRING(1 1, 0 0)'::geometry);<BR>ERROR: duplicate key value
violates unique constraint "unique_geom"<BR><BR><BR>Sindile ,<BR>An
alternative way would be to load the data into PostGIS using the shape dump
as usual. If you have a distinct key (other than the gid which is
added by shp2pgsql) on your features, you could create a new table,
selecting from your postgis table using DISTINCT ON
(my_distinct_feature_key).<BR><A
href="http://www.postgresql.org/docs/8.3/static/sql-select.html#SQL-DISTINCT"
target=_blank>http://www.postgresql.org/docs/8.3/static/sql-select.html#SQL-DISTINCT</A><BR><BR>Or,
you could try:<BR>CREATE TABLE new_distinct_table AS<BR>SELECT att1, att2,
geom::geometry<BR>FROM (<BR> SELECT DISTINCT att1, att2, encode(geom,
'hex') as geom<BR> FROM my_polygon_table ) AS foo<BR><BR>Or, use Jean's
method, but instead of the unique constraint on the geom, use a unique
functional index using the geometry's hex string.<BR><BR>postgis=# CREATE
UNIQUE INDEX tmp_geom_unq ON tmp (encode(geom, 'hex'));<BR>CREATE
INDEX<BR><BR>postgis=# insert into tmp values ('LINESTRING(0 0, 1
1)'::geometry);<BR>INSERT 0 1<BR><BR>postgis=# insert into tmp values
('LINESTRING(1 1, 0 0)'::geometry);<BR>INSERT 0 1<BR><BR>postgis=# insert
into tmp values ('LINESTRING(1 1, 0 0)'::geometry);<BR>ERROR:
duplicate key value violates unique constraint
"tmp_geom_unq"<BR><BR>postgis=# \d tmp<BR> Table
"pg_temp_2.tmp"<BR> Column | Type |
Modifiers<BR>--------+----------+-----------<BR> geom | geometry
|<BR>Indexes:<BR> "tmp_geom_unq" UNIQUE, btree
(encode(geom::bytea, 'hex'::text))<BR><BR><BR>Remember though, that this is
*not* a spatial geometry index, just a unique index on the hex
string.<BR><BR>Hope that helps,<BR>Cheers,<BR><FONT
color=#888888>Kevin</FONT>
<DIV>
<DIV></DIV>
<DIV><BR><BR><BR><BR>Jean David TECHER wrote:<BR>
<BLOCKQUOTE class=gmail_quote
style="PADDING-LEFT: 1ex; MARGIN: 0pt 0pt 0pt 0.8ex; BORDER-LEFT: rgb(204,204,204) 1px solid">Example:<BR><BR>dbname
= routing_db<BR><BR>table =wr<BR><BR>1. Load only the table structure not
the datas<BR><BR>use -p option taken from shp2pgsql<BR><BR>shp2pgsql -p
wr.shp wr|psql routing_db<BR><BR>2. Add a constraint on the_geom
column<BR><BR>psql routing_db<BR><BR>ALTER TABLE wr ADD CONSTRAINT
check_unique_the_geom unique(the_geom);<BR><BR>3. Load your datas without
'BEGIN' and 'END'<BR><BR>* Here use the -a option taken from
shp2pgsql<BR>* use -I for index creation<BR><BR>The idea is to use the
INSERT command in order to rollbacking if the<BR>current feature is
already in the table<BR><BR>N.B: don't use -D option!<BR><BR>shp2pgsql -aI
wr.shp wr|grep -v ^BEGIN|grep -v ^END|psql routing_db<BR><BR><BR>Quoting
Intengu Technologies <<A href="mailto:sindile.bidla@gmail.com"
target=_blank>sindile.bidla@gmail.com</A>>:<BR><BR>
<BLOCKQUOTE class=gmail_quote
style="PADDING-LEFT: 1ex; MARGIN: 0pt 0pt 0pt 0.8ex; BORDER-LEFT: rgb(204,204,204) 1px solid">I
have a polygon shapefile that has duplicate features which I want to
load<BR>into Postgis. How can I ensure that each feature has only
one record.<BR><BR>-- <BR>Sindile
Bidla<BR><BR></BLOCKQUOTE><BR><BR><BR>===================<BR>Jean David
TECHER<BR><A href="http://www.davidgis.fr"
target=_blank>www.davidgis.fr</A><BR>06 60 46 85 05<BR>04 99 77 16
87<BR>===================<BR><BR>_______________________________________________<BR>postgis-users
mailing list<BR><A href="mailto:postgis-users@postgis.refractions.net"
target=_blank>postgis-users@postgis.refractions.net</A><BR><A
href="http://postgis.refractions.net/mailman/listinfo/postgis-users"
target=_blank>http://postgis.refractions.net/mailman/listinfo/postgis-users</A><BR></BLOCKQUOTE>_______________________________________________<BR>postgis-users
mailing list<BR><A href="mailto:postgis-users@postgis.refractions.net"
target=_blank>postgis-users@postgis.refractions.net</A><BR><A
href="http://postgis.refractions.net/mailman/listinfo/postgis-users"
target=_blank>http://postgis.refractions.net/mailman/listinfo/postgis-users</A><BR></DIV></DIV></BLOCKQUOTE></DIV><BR><BR
clear=all><BR>-- <BR>Sindile Bidla<BR></BLOCKQUOTE></BODY></HTML>
<HTML><BODY><P><hr size=1></P>
<P><STRONG>
The substance of this message, including any attachments, may be confidential, legally privileged and/or exempt from disclosure pursuant to Massachusetts law. It is intended solely for the addressee. If you received this in error, please contact the sender and delete the material from any computer.
</STRONG></P></BODY></HTML>
<P><hr size=1></P>
<P><STRONG><font size="2" color="339900"> Help make the earth a greener place. If at all possible resist printing this email and join us in saving paper. </p> <p> </font></STRONG></P>