[postgis-users] Newbie question - remove duplicate / identicalfeature in postgis
Sufficool, Stanley
ssufficool at rov.sbcounty.gov
Mon Nov 17 08:28:45 PST 2008
Sound like this was never definitively answered, so here's my 2 cents.
1) Create a unique field on your table AFTER importing with shp2pgsql.
alter table mytable add myidcolumn serial
2) Then use the st_equals function to get rid of duplicates with a lower
serial.
delete from mytable
where myidcolumn IN (
select t1.myidcolumn
from mytable mt1, mytable mt2
where st_equals(mt1.the_geom, mt2.the_geom)
and mt1.myidcolumn < mt2.myidcolumn
)
Make sure you have an index on the geom column, or this may take a long
time.
-----Original Message-----
From: postgis-users-bounces at postgis.refractions.net
[mailto:postgis-users-bounces at postgis.refractions.net] On Behalf Of
Intengu Technologies
Sent: Friday, November 14, 2008 8:00 AM
To: PostGIS Users Discussion
Subject: Re: [postgis-users] Newbie question - remove duplicate
/ identicalfeature in postgis
Dear Kevin & Jean
Thanks for the assistance.
2008/11/13 Kevin Neufeld <kneufeld at refractions.net>
Unfortunately Jean, this won't work. Your unique
constraint on the geometry column is using the geometry equals operator
(=), which compares the bounding boxes of the geometries, *not* the
geometries themselves.
I.E.
postgis=# select 'LINESTRING(0 0, 1 1)'::geometry =
'LINESTRING(1 1, 0 0)'::geometry;
?column?
----------
t
(1 row)
postgis=# create temp table tmp (geom geometry);
CREATE TABLE
postgis=# alter table tmp add constraint unique_geom
unique (geom);
NOTICE: ALTER TABLE / ADD UNIQUE will create implicit
index "unique_geom" for table "tmp"
ALTER TABLE
postgis=# \d tmp
Table "pg_temp_2.tmp"
Column | Type | Modifiers
--------+----------+-----------
geom | geometry |
Indexes:
"unique_geom" UNIQUE, btree (geom)
postgis=# insert into tmp values ('LINESTRING(0 0, 1
1)'::geometry);
INSERT 0 1
postgis=# insert into tmp values ('LINESTRING(1 1, 0
0)'::geometry);
ERROR: duplicate key value violates unique constraint
"unique_geom"
Sindile ,
An alternative way would be to load the data into
PostGIS using the shape dump as usual. If you have a distinct key
(other than the gid which is added by shp2pgsql) on your features, you
could create a new table, selecting from your postgis table using
DISTINCT ON (my_distinct_feature_key).
http://www.postgresql.org/docs/8.3/static/sql-select.html#SQL-DISTINCT
Or, you could try:
CREATE TABLE new_distinct_table AS
SELECT att1, att2, geom::geometry
FROM (
SELECT DISTINCT att1, att2, encode(geom, 'hex') as geom
FROM my_polygon_table ) AS foo
Or, use Jean's method, but instead of the unique
constraint on the geom, use a unique functional index using the
geometry's hex string.
postgis=# CREATE UNIQUE INDEX tmp_geom_unq ON tmp
(encode(geom, 'hex'));
CREATE INDEX
postgis=# insert into tmp values ('LINESTRING(0 0, 1
1)'::geometry);
INSERT 0 1
postgis=# insert into tmp values ('LINESTRING(1 1, 0
0)'::geometry);
INSERT 0 1
postgis=# insert into tmp values ('LINESTRING(1 1, 0
0)'::geometry);
ERROR: duplicate key value violates unique constraint
"tmp_geom_unq"
postgis=# \d tmp
Table "pg_temp_2.tmp"
Column | Type | Modifiers
--------+----------+-----------
geom | geometry |
Indexes:
"tmp_geom_unq" UNIQUE, btree (encode(geom::bytea,
'hex'::text))
Remember though, that this is *not* a spatial geometry
index, just a unique index on the hex string.
Hope that helps,
Cheers,
Kevin
Jean David TECHER wrote:
Example:
dbname = routing_db
table =wr
1. Load only the table structure not the datas
use -p option taken from shp2pgsql
shp2pgsql -p wr.shp wr|psql routing_db
2. Add a constraint on the_geom column
psql routing_db
ALTER TABLE wr ADD CONSTRAINT
check_unique_the_geom unique(the_geom);
3. Load your datas without 'BEGIN' and 'END'
* Here use the -a option taken from shp2pgsql
* use -I for index creation
The idea is to use the INSERT command in order
to rollbacking if the
current feature is already in the table
N.B: don't use -D option!
shp2pgsql -aI wr.shp wr|grep -v ^BEGIN|grep -v
^END|psql routing_db
Quoting Intengu Technologies
<sindile.bidla at gmail.com>:
I have a polygon shapefile that has
duplicate features which I want to load
into Postgis. How can I ensure that
each feature has only one record.
--
Sindile Bidla
===================
Jean David TECHER
www.davidgis.fr
06 60 46 85 05
04 99 77 16 87
===================
_______________________________________________
postgis-users mailing list
postgis-users at postgis.refractions.net
http://postgis.refractions.net/mailman/listinfo/postgis-users
_______________________________________________
postgis-users mailing list
postgis-users at postgis.refractions.net
http://postgis.refractions.net/mailman/listinfo/postgis-users
--
Sindile Bidla
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.osgeo.org/pipermail/postgis-users/attachments/20081117/7415063b/attachment.html>
More information about the postgis-users
mailing list