ESRI Shape Output (v.out.shape) !

Thu May 18 10:16:57 EDT 2000

"Eric G. Miller" wrote:
> 
> On Wed, May 17, 2000 at 01:56:42PM -0700, Rich Shepard wrote:
> <snipped>
> >   I'm starting to design the MI<->GRASS filters from the MI side. I'll be
> > very happy to use your code to get the data into GRASS once I get it out of
> > MI, or vice-versa. No sense re-inventing the wheel, as you noted.
> >
> >   The MI file format is very well documented and is easy to parse into data
> > structures. The trick -- for me at least -- is understanding the GRASS data
> > structures so I can put the right values in the proper files.
> >
> >   The other hang up is my understanding of the "arc-node" format in
> > practical terms. A MI region (polygon) is a list of nodes with the first and
> > last nodes having identical values. I've no idea how to separate this one
> > line into meaningful segments (arcs) for input into GRASS. I'm certainly
> > open to ideas, thoughts or solutions, for I've never had to deal with this
> > before.
> 
> This is part of the drag of both the shapefile and mif formats.  Here's
> how ESRI itself deals with the problem in ARC/INFO -- it imports
> polygons (and lines?) as "regions".  Then the data has to be
> built/cleaned.  The problem with these "tuple" style formats is they
> don't "share" data between objects.  They're essentially vector
> graphics.  The arc-node idea is everything is a remapping of
> relationships of points.  A line segment is a remapping of a start node,
> a series of vertices and an end node. Polygons are mappings of line
> segments (arcs) -- minimum of 3.  So a particular node can be shared by
> multiple objects, and the same for a particular arc (or edge).  Note, a
> node has a particular definition as having at least 3 connecting line
> segments.  To get around that, there's a concept of a "pseudo node" to
> handle the cases of island/hole polygons and lines that don't terminate
> at a junction with another line (such as a cul-de-sac).  The problem is
> the shapefile or mif can have overlapping polygons, unsplit line
> segments and other such no-no's (I don't know if MapInfo has an idea of
> cleaning data, but ArcView sure doesn't).

Eric

If I may contribute my own 2 euro to these points, as I have actually
been working with them:-

That is very interesting. It is in essence how I dealt with the
shapefile import. It's fairly intuitive really, once you've dived
into the problem, and had a few bum steers. A vertex base is built
using a standard database structure ( = ~SDTS `network' ) which stores
all the vertices and links. Arcs are created by tracking from one
node to another, nodes defined as having other than 2 links. Finding
area points is fun (99.99% accurate now), but quite slippery. A
second pass picks out the simple islands.
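
To make the node/arc idea above concrete, here is a minimal sketch in
Python. It is not taken from the shapefile import code; the function names
(build_links, find_nodes, trace_arcs) and the sample rings are hypothetical,
and it assumes coordinates have already been snapped so that shared vertices
compare equal. It only covers link building, node detection and arc
tracking, not area points.

```python
from collections import defaultdict


def build_links(rings):
    """Map every vertex to the set of distinct vertices it is linked to."""
    links = defaultdict(set)
    for ring in rings:
        for a, b in zip(ring, ring[1:]):
            if a != b:  # skip zero-length segments
                links[a].add(b)
                links[b].add(a)
    return links


def find_nodes(links):
    """A node is any vertex with other than 2 links."""
    return {v for v, nbrs in links.items() if len(nbrs) != 2}


def trace_arcs(links, nodes):
    """Track arcs from node to node, then pick up rings with no node."""
    seen = set()  # undirected segments already assigned to an arc
    arcs = []

    def walk(start, nxt):
        arc = [start, nxt]
        seen.add(frozenset((start, nxt)))
        prev, cur = start, nxt
        # A non-node vertex has exactly two links: leave by the other one.
        while cur not in nodes and cur != start:
            (following,) = links[cur] - {prev}
            seen.add(frozenset((cur, following)))
            arc.append(following)
            prev, cur = cur, following
        return arc

    # First pass: arcs that begin at a node.
    for start in sorted(nodes):
        for nbr in sorted(links[start]):
            if frozenset((start, nbr)) not in seen:
                arcs.append(walk(start, nbr))
    # Second pass: simple islands, i.e. closed rings containing no node.
    for start in sorted(links):
        for nbr in sorted(links[start]):
            if frozenset((start, nbr)) not in seen:
                arcs.append(walk(start, nbr))
    return arcs


# Two unit squares sharing an edge, plus a separate triangular island.
rings = [
    [(0, 0), (1, 0), (1, 1), (0, 1), (0, 0)],
    [(1, 0), (2, 0), (2, 1), (1, 1), (1, 0)],
    [(5, 5), (6, 5), (5, 6), (5, 5)],
]
links = build_links(rings)
nodes = find_nodes(links)
arcs = trace_arcs(links, nodes)
```

In this sample the shared edge becomes one arc used by both squares,
instead of being stored twice as in the shapefile.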

> 
> While it's a bit of work, you could check for errors such as sliver
> polygons, unsnapped nodes, etc. in the conversion.  When such errors are
> within a suitable tolerance, you could correct them without warning.  If
> they're greater than the tolerance (such as large slivers), you could
> create a new polygon with a bogus category value and warn the user that
> the data contains slivers.  But this may be more work than you want to
> do (grass's v.support/v.digit can help the user with this).

This is done also - partially. The version in CVS has support for a user
defined snap distance. The latest upgrade, which is finished and now being
debugged but not yet in CVS, has a collinearity (sliver) tolerance at
nodes which is also user defined. A future revision will have a `reject'
file associated with each imported map that will indicate problem areas
(and can be used as an overlay in v.digit, allowing manual corrections).
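
As an illustration of what a user-defined snap distance does, here is a
minimal Python sketch. It is an assumption about the general technique, not
the code in CVS: the name snap_vertices is hypothetical, and the brute-force
search is only suitable for small inputs.

```python
import math


def snap_vertices(points, snap_distance):
    """Replace each point by the first earlier point within snap_distance.

    Points with no earlier neighbour within the tolerance are kept as they
    are and become candidates for later points to snap to.
    """
    representatives = []
    snapped = []
    for p in points:
        for r in representatives:
            if math.dist(p, r) <= snap_distance:
                snapped.append(r)  # close enough: reuse the earlier vertex
                break
        else:
            representatives.append(p)
            snapped.append(p)
    return snapped


points = [(0.0, 0.0), (0.004, 0.003), (1.0, 1.0)]
result = snap_vertices(points, snap_distance=0.01)
```

After snapping, vertices that were nearly coincident compare equal, which is
what lets links and nodes be shared between neighbouring polygons.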

So I think we have the basis for dealing with imports of `geometric'
format coverages. Also location of area/label points and attribution
(also postgres support for multiple attributes thanks to Alex
Shevlakov).

> 
> Maybe, the solution is to create a temporary file or two, map out all of
> the points in the input file, then make another pass (or two) to map out
> the edges and areas (if you think you can "get" enough memory, that's,
> obviously, much faster).

Yes. This process is a memory hog. You are creating large structures for
each vertex, never mind all the other topological entities. I find
3,000-5,000 polygons per 128 MB of RAM is all you get before you start
using swap. You can manage a proportional amount using swap, but
performance degrades rapidly with size. I think we can import files
of arbitrary size, but the source file would have to be divided into
bands that would be imported independently to temp files, then merged
with v.patch.
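
The banding idea could be sketched roughly as follows, in Python. This is
only an assumption about how the source polygons might be divided; the name
split_into_bands is hypothetical, and the sketch assigns each polygon to a
horizontal band by the mean y of its vertices. It does not deal with
neighbours that end up in different bands, nor with the import and v.patch
steps themselves.

```python
def split_into_bands(polygons, n_bands):
    """Group polygon rings into n_bands horizontal bands by mean vertex y."""
    mean_ys = [sum(y for _, y in ring) / len(ring) for ring in polygons]
    lowest, highest = min(mean_ys), max(mean_ys)
    # Guard against a zero band height when all polygons share one mean y.
    band_height = (highest - lowest) / n_bands or 1.0
    bands = [[] for _ in range(n_bands)]
    for ring, y in zip(polygons, mean_ys):
        index = min(int((y - lowest) / band_height), n_bands - 1)
        bands[index].append(ring)
    return bands


def square(y0):
    """Closed unit square whose lower edge sits at y0 (sample data only)."""
    return [(0, y0), (1, y0), (1, y0 + 1), (0, y0 + 1), (0, y0)]


bands = split_into_bands([square(0), square(3), square(20)], n_bands=2)
```

Each band would then be imported on its own, keeping the vertex structures
for only one band in memory at a time.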

> I'm not familiar enough with GRASS's format or
> interface or the MIF format to comment on the specifics of the
> translation.  I've, of course, been thinking of the 2d/2.5d world in
> this discussion ;)

Yes, you might call it a 2d projected surface with an option for height
as a scalar field.

David