[GRASS-user] Unable to get rid of duplicate polygons

Markus Metz markus.metz.giswork at googlemail.com
Wed Apr 4 03:34:16 EDT 2012


There seems to be some confusion here about what v.in.ogr does and does
not do, and about the relation of attributes to geometries.

v.in.ogr does not remove duplicate polygons. It does not even check
for duplicate polygons and, most importantly, it keeps (should keep)
all polygons present in the input layer(s) and represents them as
topological areas composed of boundaries. In case of overlapping
polygons, the overlapping parts are marked as such by assigning
multiple categories to them, one for each original polygon.
Additionally, the number of features for those areas is stored as a
category in layer X, X being a number reported by v.in.ogr. In order
to get rid of duplicate polygons, one of the categories needs to be
removed from the corresponding area. AFAICT, this needs to be done
manually with one of the vector digitizers.
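
To make the "one area, several categories" situation concrete, here is a
minimal sketch in plain Python. It is only a toy model and does not read a
GRASS vector map: the area ids, the category values and the helper name
duplicate_categories are made up for illustration, and keeping the lowest
category is an arbitrary choice, not something v.in.ogr or v.clean does.

```python
# Toy model (assumption): area id -> categories attached to it after import.
area_cats = {
    10: [1],
    11: [2, 3],  # two identical input polygons ended up as one area
    12: [4],
}


def duplicate_categories(area_cats):
    """For each area with several categories, return all but the lowest one."""
    extras = {}
    for area_id, cats in area_cats.items():
        if len(cats) > 1:
            # Arbitrary rule for this sketch: keep the lowest category.
            extras[area_id] = sorted(cats)[1:]
    return extras


extras = duplicate_categories(area_cats)
print(extras)
```

The categories listed per area are the ones that would still have to be
removed from that area by hand.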

Deleting a row in the attribute table does not delete the
corresponding geometry or geometries, or to be more precise, the
corresponding category value from geometries. Likewise, deleting a
geometry does not necessarily delete the corresponding entry (entries)
in the attribute table.
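
As a rough illustration of this point, the sketch below runs the DELETE
statement quoted further down against an in-memory SQLite table using
Python's sqlite3 module. The table name parcels, the column unit_id and the
set geometry_cats are hypothetical stand-ins; nothing here talks to GRASS,
and the set of geometry categories is simply assumed rather than read from
a vector map.

```python
import sqlite3


def dedupe_attribute_rows(conn, table, key_column):
    """Delete rows duplicated on key_column, keeping the lowest cat per group.

    table and key_column are interpolated directly, so they must be trusted names.
    """
    conn.execute(
        f"DELETE FROM {table} WHERE cat NOT IN "
        f"(SELECT MIN(cat) FROM {table} GROUP BY {key_column})"
    )
    conn.commit()
    return [row[0] for row in conn.execute(f"SELECT cat FROM {table} ORDER BY cat")]


# Hypothetical attribute table: cats 2 and 3 describe the same mapping unit.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE parcels (cat INTEGER PRIMARY KEY, unit_id TEXT)")
conn.executemany(
    "INSERT INTO parcels VALUES (?, ?)",
    [(1, "A"), (2, "B"), (3, "B"), (4, "C")],
)

# Assumed stand-in for the categories still attached to the geometries.
geometry_cats = {1, 2, 3, 4}

remaining_cats = dedupe_attribute_rows(conn, "parcels", "unit_id")
# Categories that lost their table row but are still present on a geometry.
orphaned_cats = sorted(geometry_cats - set(remaining_cats))
print(remaining_cats)
print(orphaned_cats)
```

The table ends up with one row per mapping unit, while the category whose
row was deleted is still attached to its area and has to be removed there
separately.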

Markus M


On Mon, Apr 2, 2012 at 10:11 PM, Paulo van Breugel
<p.vanbreugel at gmail.com> wrote:
> Keeping both attributes would make sense in many cases of overlapping
> polygons, which, I would guess, is more often about partly overlapping
> polygons due to sloppy digitizing. How is GRASS going to tell what feature
> is more important?
>
> But if rows in your attribute table are duplicates except for the cat,
> you might be able to remove them using a simple SQL statement (not sure
> this works if you use dbf as the database backend), something along the
> lines of the following (google if this doesn't work):
>
> DELETE FROM table WHERE cat NOT IN
> (SELECT MIN(cat) FROM table GROUP BY XXX);
>
> Here XXX would be a column with a unique value for each mapping unit (if
> there is no such column, you need to GROUP BY a set of columns that together
> uniquely define each mapping unit). You can run this with db.execute.
> Alternatively, you can use the SELECT statement in the advanced SQL query
> builder in the GRASS Attribute Table Manager to select the duplicates and
> delete them there.
>
> If you use dbf as the database backend and the above doesn't work, you can
> open the dbf file (which you can find in the 'GRASS DB / LOCATION/ MAPSET
> /dbf' folder) in LibreOffice, select all duplicates and delete them, e.g.,
> using a pivot table. Do not use Excel; in my experience it may mess up
> your dbf file.
>
> In all cases, this is just to remove rows that are identical except for one
> column... you'll have to test whether the results make sense in your case
> and you are not messing up your polygon layer.
>
>
>
>
>
> On 04/02/2012 09:09 PM, David J. Bakeman wrote:
>
> Markus Metz wrote:
>
> On Sun, Apr 1, 2012 at 11:45 PM, David J. Bakeman<dbakeman at comcast.net>
> wrote:
>
>
> David J. Bakeman wrote:
>
>
> Markus Neteler wrote:
>
>
> On Sun, Apr 1, 2012 at 9:45 PM, David J. Bakeman<dbakeman at comcast.net>
>   wrote:
>
>
> I didn't find an answer in the archives so.
>
> I have a shapefile of polygons and some of the polygons are duplicated.  I
> thought I could use v.clean tool=rmdupl to get rid of these polygons.  I use
> v.in.ogr to read it in and I get the following:
>
> WARNING: 8 areas represent more (overlapping) features, because polygons
>           overlap in input layer(s). Such areas are linked to more than 1
>           row in attribute table. The number of features for those areas
>           is stored as category in layer 2
>
> That is correct in that there are 8 duplicate polygons but the only
> different attribute is the cat which grass added?  What am I missing?  I
> then tried v.clean tool=bpol,rmdupl and nothing changes it still has the 8
> duplicates.  What am I doing wrong?
>
>
> I think that you need to add the break tool for v.clean.
>
>
> Correct that was actually what I was using:  v.clean tool=break,rmdupl
>
> Looking closer I see that when I run v.clean it doesn't even report the
> duplicates that v.in.ogr did but they are still there.  The only thing that
> differs in coordinates or attributes is the cat attribute that grass added.
>
>
> I am using grass 6.3.0 on fedora core 14 linux.
>
>
> Please note that you can upgrade to grass-6.4.0-4.fc14:
> http://koji.fedoraproject.org/koji/buildinfo?buildID=263115
>
>
> Thanks I'll see if I can upgrade.
>
>
> I upgraded to the 6.4.0 and the results are exactly the same.  The polygons
> really are identical in every respect except for they have different values
> in the cat column.  Is there some other grass tool for removing this kind of
> duplicate?
>
>
> After import with v.in.ogr, there are no duplicate geometries left in
> the vector. What you have now is some areas with two categories
> assigned to them. Removing the duplicates means in this case removing
> one of the two categories, for example with one of the vector
> digitizers.
>
>
> I'm relatively new to grass but that doesn't make sense.  I started with a
> shapefile with duplicate features.  That is polygons with the exact same
> attributes and geometry (they are identical).  What I thought grass could do
> for me was to read it in and delete one of the duplicates without user
> intervention.  After all it identifies the duplicates so why can't it delete
> one?
>
> Are you saying that the duplicate geometry was deleted but it kept both rows
> even though they were identical as well?  Is there a operation that would
> identify and delete rows that differ in only the cat attribute.
>
> Markus M
>
>
>
>
> _______________________________________________
> grass-user mailing list
> grass-user at lists.osgeo.org
> http://lists.osgeo.org/mailman/listinfo/grass-user
>
>
> You can use
>

