[GRASS-user] Correcting data errors

Markus Metz markus.metz.giswork at gmail.com
Tue Jan 13 11:41:07 PST 2015


On Sat, Jan 10, 2015 at 7:34 PM, Fábio Dias <fabio.dias at gmail.com> wrote:
> Hello again,
>
> This issue arose from my other thread ('v.generalize: does it take
> forever?'). Since I think it would be bad practice to mix issues in
> the same thread, I'm opening this one.
>
> I have a dataset that corresponds to land usage in the 'legal Amazon
> region', the region that is legally recognized as belonging to the
> Amazon forest, even if there is not much forest there at the moment.
> This dataset contains information for this region for three years so
> far (2008, 2010, 2012), and 2014 is underway.
>
> However, in the process of developing a web interface for this
> dataset, I needed to generalize that information to reduce the volume
> of data for the current viewport. AFAIK, the only proper way of doing
> that is topologically, so you don't end up with gaps and overlaps.
>
> PostGIS can do something like this in version 2+, but my problem is
> that the data is full of topological errors. I can't really call them
> 'data errors', since the toolchain used when the dataset was created
> didn't enforce these restrictions. I'm going further than was
> expected when the process started, but they are still topological
> errors that I need to deal with.
>
> As far as I can see, there are gaps between polygons,
> self-intersections, bridges, etc. I'm fairly certain that there is at
> least one occurrence of every type of error known to man. As a matter
> of fact, a lot of the polygons are not ST_Valid either.
>
> I've read the manual and experimented with v.clean, trying bpol,
> break, rmdupl, etc., but I still don't have the feeling that I know
> what I'm doing. I lack experience with this, so:
>
> TL;DR:
> What would be the recommended way of dealing with the 'errors' made
> when the data was created with zero topological restrictions (and
> saved as .shp)?

For import, try to find a snapping threshold for v.in.ogr that
produces an error-free output. Ideally the output would not only be
error-free, but the number of centroids would match the number of
input polygons (both are reported by v.in.ogr). The min_area option of
v.in.ogr could also help. The bulk of the cleaning should be done by
v.in.ogr. After that, removing small areas with v.clean tool=rmarea,
threshold in square meters, could help. For Terraclass (and PRODES),
which are mainly based on Landsat data, a threshold of 10 square
meters could remove artefacts and preserve valid areas (the Landsat
pixel size is about 900 square meters). The threshold needs to be
determined empirically.
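
For illustration, a minimal sketch of such a workflow (the map names
and threshold values below are placeholders and need to be adjusted to
your data; snap= is in map units of the target location, min_area and
the rmarea threshold are in square meters):

  # import with snapping; adjust snap until v.in.ogr reports no
  # remaining topological errors and the number of centroids matches
  # the number of input polygons
  v.in.ogr input=landuse.shp output=landuse snap=1e-07 min_area=1

  # remove remaining sliver areas smaller than 10 square meters
  v.clean input=landuse output=landuse_clean tool=rmarea threshold=10

The snap value above (1e-07, roughly 1 cm in a latitude-longitude
location) is only a starting point; as noted, it has to be determined
empirically for each dataset.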

I am not aware of a standard procedure that works for all data sources.

Markus M

>
> Thanks again,
>
> F
> -=--=-=-
> Fábio Augusto Salve Dias
> http://sites.google.com/site/fabiodias/
