[GRASS5] Re: Vector Manipulation

Fri Jun 7 05:47:15 EDT 2002

On Thursday 06 June 2002 05:28 pm, Christoph Simon wrote:
> When dealing with software which I don't know well, I generally prefer
> the stable branch to avoid bad surprises while learning. But each
> project has a different level of instability. Would you think it's
> high risk for me to go to grass51?
> 
> Does the programming manual PDF cover the new vector features? If
> not, where can I get those?

I don't know what counts as high risk for you. Grass51 is at the very beginning
and changes from day to day (BTW for g51 testers: yesterday I changed the
'topo' format, so topology must be rebuilt for old vectors with v.build),
and the new API is not yet documented. On the other hand, it already offers
a richer environment than grass50. Basic functions are either identical
or similar. Where you need more, in grass50 you have to access undocumented
internal structures, whereas in grass51 you can use (still undocumented)
functions, for example:
grass50: n1 = map->Line[line].N1; n2 = map->Line[line].N2;
grass51: Vect_get_line_nodes ( map, line, &n1, &n2);
You can also compare:
http://freegis.org/cgi-bin/viewcvs.cgi/grass/src/include/Vect.h?rev=1.4&content-type=text/vnd.viewcvs-markup
http://freegis.org/cgi-bin/viewcvs.cgi/grass51/include/Vect.h?rev=1.16&content-type=text/vnd.viewcvs-markup
What is higher risk?
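
To make the comparison concrete, here is a minimal self-contained sketch of
the two styles. The mock_* types and functions are hypothetical stand-ins and
are not the real GRASS structures; only the field names N1/N2 and the call
shape of Vect_get_line_nodes are taken from the lines above.

```c
#include <stddef.h>

/* Mock stand-ins for illustration only, NOT the real GRASS types. */
struct mock_line { int N1, N2; };                 /* start and end node */
struct mock_map  { struct mock_line *Line; int n_lines; };

/* grass50 style: the caller reaches into the internal array directly,
 * so any change of the layout breaks the caller. No bounds check. */
void nodes_direct(const struct mock_map *map, int line, int *n1, int *n2)
{
    *n1 = map->Line[line].N1;
    *n2 = map->Line[line].N2;
}

/* grass51 style: an accessor hides the layout. Modelled on the call
 * shape of Vect_get_line_nodes(); the return convention (0 on success,
 * -1 on a bad argument) is an assumption of this sketch. */
int mock_get_line_nodes(const struct mock_map *map, int line, int *n1, int *n2)
{
    if (map == NULL || line < 0 || line >= map->n_lines)
        return -1;
    if (n1 != NULL)
        *n1 = map->Line[line].N1;
    if (n2 != NULL)
        *n2 = map->Line[line].N2;
    return 0;
}
```

With the accessor, the internal layout can keep changing from day to day
without touching the calling code.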

New features are described in:
http://grass.itc.it/grass51/index.html
http://freegis.org/cgi-bin/viewcvs.cgi/~checkout~/grass51/doc/vector/vector.html
and testing data set is here:
http://mpa.itc.it/radim/g51/  (unpack, start grass51 and run ./tour)

> ... , if you tell me that I need it,
> I'll replace my grass5 by grass51 and start studying the programmers
> manual.

I would say that you need it; for this task it will probably be less painful
to use grass51.

> > I would simply start with the most common ones; loading user functions
> > can probably be postponed.
>
> Right, but if this is a functionality planned for a later step, the
> design should allow for it from the beginning. Even if the parser
> doesn't do so, it shouldn't require a total rewrite to make a (hash)
> table lookup for a function which might need dynamic loading. This
> could be done by installing all `factory' functions already in such a
> hash table and make the lookup for everything which by syntax needs to
> be an operator. This is not too hard for flex/bison. The locating and
> loading of these functions certainly can wait.

The list of operators should not be endless; there are just a few basic ones
and the others are created by combining these:
AND(intersect), OR(+), NOT(-), XOR, EQUAL,
WITHIN, COMPLETELY_WITHIN, CONTAINS, COMPLETELY_CONTAINS,
TERMINATES_IN, CONTAINS_END_OF, TOUCH, ... (few others)

But if you are able to create the environment mentioned above, why not.
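
A minimal sketch of such a lookup, assuming a plain static table searched
linearly instead of a real hash table (with a dozen operators the difference
hardly matters). The enum, the table and lookup_operator() are hypothetical
names for illustration; only the operator names come from the list above.

```c
#include <string.h>

/* Hypothetical operator codes; only a subset of the list is shown. */
enum overlay_op {
    OP_UNKNOWN = -1,
    OP_AND, OP_OR, OP_NOT, OP_XOR, OP_EQUAL,
    OP_WITHIN, OP_CONTAINS, OP_TOUCH
};

struct op_entry {
    const char *name;      /* token as the parser would see it */
    enum overlay_op op;
};

/* Built-in operators are installed here; entries for dynamically loaded
 * functions could later be appended to a similar table at runtime. */
static const struct op_entry op_table[] = {
    { "AND",      OP_AND },
    { "OR",       OP_OR },
    { "NOT",      OP_NOT },
    { "XOR",      OP_XOR },
    { "EQUAL",    OP_EQUAL },
    { "WITHIN",   OP_WITHIN },
    { "CONTAINS", OP_CONTAINS },
    { "TOUCH",    OP_TOUCH }
};

/* Returns the operator code for name, or OP_UNKNOWN if it is not
 * registered (the point where dynamic loading could be tried later). */
enum overlay_op lookup_operator(const char *name)
{
    size_t i;

    for (i = 0; i < sizeof op_table / sizeof op_table[0]; i++) {
        if (strcmp(op_table[i].name, name) == 0)
            return op_table[i].op;
    }
    return OP_UNKNOWN;
}
```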

> OK. Then we would have a parser, interpreting the commandline, much
> like r.mapcalc works (as I guess), and that will dispatch to a
> function for each operation, which will output an intermediate or
> final result. You will open the files according the parser's results,
> look up the functions and operators, and call that function with the
> needed data. Right?

Yes.

> > I'll try to start with infrastructure definition
> > (for now I'll consider just about two overlayed vector maps):
>
> This is certainly a good restriction for now. But if we create
> temporary maps for each atomic function, even if it's not always very
> efficient, more maps wouldn't mean more programming work.

Yes (note: support for temporary vector maps is not yet written).

> > I) Variants of results we want to get from overlay:
> >   a) list of IDs(internal numbers) of elements in one map, which fulfil
> >      operator condition (if no breaking required)
> >   b) list of elements (i.e. list of new line_pnts structures created by
> >      intersecting)
> >   c) list of IDs (for both maps) + sizes (length or area) - if just a
> >      report is required, for example which lines intersect which areas
> >      and how long these intersections are
>
> Sounds good. I guess the basic vector functions already exist to
> compute perimeter and area, right? They just need to be called here.

There are for example: Vect_get_area_area (), Vect_line_length (),
Vect_line_geodesic_length (); 

> > II) How to do that? I see 2 ways how to analyse vectors:
> >   a) 1) Overlay whole maps and create a new map (intersect all lines
> >         and areas) and save it as a standard grass vector map in which
> >         the origin of each new element is recorded (area 456 in MapA
> >         and line 123 in MapB)
> >      2) Go through all new elements and select those which fit the
> >         operator
> >   b) Go through all elements of the map and check whether they fulfil
> >      the required rule
>
> I would probably try to choose version b) as it sounds more
> efficient. In version a) I imagine that we get first a map which will
> essentially duplicate maps A and B, eliminating then those elements
> which need to be discarded. In case of version b) this shouldn't be
> the case, but I imagine, that in the end, this might be a tradeoff
> between memory requirements and processing power, as in case of
> version b) eventually more comparisons need to be done in certain
> cases.

Yes. The advantage of a) is that you overlay once (which may take
hours or even days) and can then run many (quick) queries. The disadvantage
is that often only a few elements need to be overlaid, and the user must take
care of the new map. In addition, a) is not well suited to interactive
queries (on the monitor only). So, let us 'try' b).
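
To show the shape of b), here is a minimal sketch that walks all elements of
A and tests each against the elements of B. A bounding-box overlap test
stands in for the real operator; that is an assumption of the sketch (a real
WITHIN or AND needs exact geometry), and struct bbox and both functions are
hypothetical names.

```c
/* Hypothetical bounding box of one element: west, south, east, north. */
struct bbox { double w, s, e, n; };

/* Nonzero if the two boxes overlap or touch. */
int bbox_overlap(const struct bbox *a, const struct bbox *b)
{
    return a->w <= b->e && b->w <= a->e &&
           a->s <= b->n && b->s <= a->n;
}

/* Variant b): go through all elements of A and keep the index of each
 * one that satisfies the (stand-in) rule against at least one element
 * of B. out must have room for na entries. Returns the number of
 * indices written. Cost is up to na * nb comparisons, which is the
 * processing-power side of the tradeoff. */
int select_overlapping(const struct bbox *a, int na,
                       const struct bbox *b, int nb, int *out)
{
    int i, j, n = 0;

    for (i = 0; i < na; i++) {
        for (j = 0; j < nb; j++) {
            if (bbox_overlap(&a[i], &b[j])) {
                out[n++] = i;
                break;          /* one match is enough for a list of IDs */
            }
        }
    }
    return n;
}
```

No intermediate map is written, so this also fits the interactive case.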

> > Function (probably only suitable for II-b):
> > Vect_analyse/overlay (
> >   struct Map_info *MapA,
> >   struct ilist *ListA,   // list of elements in MapA to operate on
> >   int typeA,             // type of elements in MapA to operate on
> >   struct Map_info *MapB,
> >   struct ilist *ListB,
> >   int typeB,
> >   int operator,          // AND, WITHIN, COMPLETELY_WITHIN, ....
> >   int result_type,       // Ia, Ib, Ic above
> >   void *List)            // list of results
>
> I'll have to dive into the programmers manual to get familiarized with
> the specific types. So guessing only: MapAB is probably the header
> info and ListAB the actual map data.

ListAB is just a list of feature IDs as used by
Vect_read_line (map,points,cats,ID);
I am not yet sure what this list should look like. I needed a list, so I wrote
one, but it should probably be done in a more 'systematic' way, G_list_*
(that is why I asked about GLib).

> typeAB would be either area,
> line, or point. Operator is specified by an int (or enum). 
Yes.

> But why is
> the resulting List a void pointer?

void because I did not know which type it should be. Because we want
3 types of results (Ia, Ib, Ic), the list of results will come in 3 types:
Ia) struct ilist *
Ib) featureList * {
      struct line_pnts *Points,
      struct line_cats *Cats,
      int n_features, alloc_features
}
Ic) reportList? * {
      int *idA, *idB,
      double *size,
      int n_items
} 
But featureList, reportList are hypothetical and do not exist at present.
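
One possible alternative to the void pointer, sketched under assumptions: a
tagged union that carries result_type together with the matching list. All
names below (feature_list, report_list, overlay_result, report_list_add) are
hypothetical; line_pnts, line_cats and ilist are only forward-declared here
and would come from the vector library.

```c
#include <stdlib.h>

/* Opaque in this sketch; defined by the vector library in real code. */
struct line_pnts;
struct line_cats;
struct ilist;

enum result_type { RESULT_IDS, RESULT_FEATURES, RESULT_REPORT }; /* Ia, Ib, Ic */

struct feature_list {               /* Ib, hypothetical */
    struct line_pnts *Points;
    struct line_cats *Cats;
    int n_features, alloc_features;
};

struct report_list {                /* Ic, hypothetical */
    int *idA, *idB;                 /* element IDs in MapA and MapB */
    double *size;                   /* length or area of each intersection */
    int n_items;
};

/* The tag says which member of the union is valid. */
struct overlay_result {
    enum result_type type;
    union {
        struct ilist *ids;
        struct feature_list *features;
        struct report_list *report;
    } u;
};

/* Appends one row to a report. Returns 0 on success, -1 if out of
 * memory (rows already stored stay valid). */
int report_list_add(struct report_list *r, int idA, int idB, double size)
{
    size_t n = (size_t)r->n_items + 1;
    int *a, *b;
    double *s;

    if ((a = realloc(r->idA, n * sizeof *a)) == NULL)
        return -1;
    r->idA = a;
    if ((b = realloc(r->idB, n * sizeof *b)) == NULL)
        return -1;
    r->idB = b;
    if ((s = realloc(r->size, n * sizeof *s)) == NULL)
        return -1;
    r->size = s;

    r->idA[r->n_items] = idA;
    r->idB[r->n_items] = idB;
    r->size[r->n_items] = size;
    r->n_items++;
    return 0;
}
```

The caller then switches on type instead of casting a void pointer.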

> Shouldn't it be also a pair of Map_info and ilist? 
No, because sometimes the result will not be written to a new map.
For example, an interactive module which lets the user select features
by a polygon drawn on the monitor wants to display the selection first,
before it writes a new map.

This also reminded me that input in the form:
   struct Map_info *MapA,
   struct ilist *ListA
is not always best either, because the input may be just a single polygon
created at runtime. We cannot use featureList for everything, because it
would not be efficient for whole maps => ?

> Also, the result_type can always be only one or is
> this a bitfield?

One at a time, no bits.

> I guess this function returns an integer to indicate
> success or failure (for instance to allocate the memory for the
> resulting map or for lacking disk space, having no write permissions,
> etc., to write the result).

I don't know yet what it should return; probably the number of items in the
result, and <0 on error.
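
A minimal sketch of that convention, assuming two made-up error codes; the
function collect_ids() and the OVERLAY_ERR_* names are hypothetical and only
show how a caller would tell a count from a failure.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical error codes, all negative. */
#define OVERLAY_ERR_ARG (-1)    /* bad argument */
#define OVERLAY_ERR_MEM (-2)    /* out of memory */

/* Copies n IDs into a freshly allocated result list.
 * Returns the number of items in the result (>= 0), or a negative
 * error code. On success with n > 0 the caller frees *out. */
int collect_ids(const int *ids, int n, int **out)
{
    int *copy;

    if (ids == NULL || out == NULL || n < 0)
        return OVERLAY_ERR_ARG;

    *out = NULL;
    if (n == 0)
        return 0;               /* empty result is not an error */

    copy = malloc((size_t)n * sizeof *copy);
    if (copy == NULL)
        return OVERLAY_ERR_MEM;

    memcpy(copy, ids, (size_t)n * sizeof *copy);
    *out = copy;
    return n;
}
```

A caller checks "if (n < 0)" for failure and otherwise uses n directly as the
number of results, so an empty result (0) stays distinct from an error.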

Radim