[postgis-users] Newbie question - remove duplicate / identical feature in postgis
Kevin Neufeld
kneufeld at refractions.net
Thu Nov 13 09:57:47 PST 2008
Unfortunately Jean, this won't work. Your unique constraint on the geometry column is using the geometry equals
operator (=), which compares the bounding boxes of the geometries, *not* the geometries themselves.
I.E.
postgis=# select 'LINESTRING(0 0, 1 1)'::geometry = 'LINESTRING(1 1, 0 0)'::geometry;
?column?
----------
t
(1 row)
postgis=# create temp table tmp (geom geometry);
CREATE TABLE
postgis=# alter table tmp add constraint unique_geom unique (geom);
NOTICE: ALTER TABLE / ADD UNIQUE will create implicit index "unique_geom" for table "tmp"
ALTER TABLE
postgis=# \d tmp
Table "pg_temp_2.tmp"
Column | Type | Modifiers
--------+----------+-----------
geom | geometry |
Indexes:
"unique_geom" UNIQUE, btree (geom)
postgis=# insert into tmp values ('LINESTRING(0 0, 1 1)'::geometry);
INSERT 0 1
postgis=# insert into tmp values ('LINESTRING(1 1, 0 0)'::geometry);
ERROR: duplicate key value violates unique constraint "unique_geom"
Sindile ,
An alternative way would be to load the data into PostGIS using the shape dump as usual. If you have a distinct key
(other than the gid which is added by shp2pgsql) on your features, you could create a new table, selecting from your
postgis table using DISTINCT ON (my_distinct_feature_key).
http://www.postgresql.org/docs/8.3/static/sql-select.html#SQL-DISTINCT
Or, you could try:
CREATE TABLE new_distinct_table AS
SELECT att1, att2, geom::geometry
FROM (
SELECT DISTINCT att1, att2, encode(geom, 'hex') as geom
FROM my_polygon_table ) AS foo
Or, use Jean's method, but instead of the unique constraint on the geom, use a unique functional index using the
geometry's hex string.
postgis=# CREATE UNIQUE INDEX tmp_geom_unq ON tmp (encode(geom, 'hex'));
CREATE INDEX
postgis=# insert into tmp values ('LINESTRING(0 0, 1 1)'::geometry);
INSERT 0 1
postgis=# insert into tmp values ('LINESTRING(1 1, 0 0)'::geometry);
INSERT 0 1
postgis=# insert into tmp values ('LINESTRING(1 1, 0 0)'::geometry);
ERROR: duplicate key value violates unique constraint "tmp_geom_unq"
postgis=# \d tmp
Table "pg_temp_2.tmp"
Column | Type | Modifiers
--------+----------+-----------
geom | geometry |
Indexes:
"tmp_geom_unq" UNIQUE, btree (encode(geom::bytea, 'hex'::text))
Remember though, that this is *not* a spatial geometry index, just a unique index on the hex string.
Hope that helps,
Cheers,
Kevin
Jean David TECHER wrote:
> Example:
>
> dbname = routing_db
>
> table =wr
>
> 1. Load only the table structure not the datas
>
> use -p option taken from shp2pgsql
>
> shp2pgsql -p wr.shp wr|psql routing_db
>
> 2. Add a constraint on the_geom column
>
> psql routing_db
>
> ALTER TABLE wr ADD CONSTRAINT check_unique_the_geom unique(the_geom);
>
> 3. Load your datas without 'BEGIN' and 'END'
>
> * Here use the -a option taken from shp2pgsql
> * use -I for index creation
>
> The idea is to use the INSERT command in order to rollbacking if the
> current feature is already in the table
>
> N.B: don't use -D option!
>
> shp2pgsql -aI wr.shp wr|grep -v ^BEGIN|grep -v ^END|psql routing_db
>
>
> Quoting Intengu Technologies <sindile.bidla at gmail.com>:
>
>> I have a polygon shapefile that has duplicate features which I want to
>> load
>> into Postgis. How can I ensure that each feature has only one record.
>>
>> --
>> Sindile Bidla
>>
>
>
>
> ===================
> Jean David TECHER
> www.davidgis.fr
> 06 60 46 85 05
> 04 99 77 16 87
> ===================
>
> _______________________________________________
> postgis-users mailing list
> postgis-users at postgis.refractions.net
> http://postgis.refractions.net/mailman/listinfo/postgis-users
More information about the postgis-users
mailing list