[GRASS-dev] [SoC] Status Report: Reimplementation of v.voronoi and v.delaunay in GRASS GIS

Sun Jun 29 10:56:26 EDT 2008

Hi all,

First of all, I would like to apologise for not sending the report on
Friday. This week was very busy for me, since I was moving away from England
and I needed to sort various matters before I left, which kept me immensely
occupied.

However, I had a fruitful conversation with Paul Kelly(my mentor). We were
discussing algorithms I work with in detail. I have found out that the best
way to fully understand something is to try to explain it to somebody else.

I feel that I have acquired enough theoretical knowledge so that I can
implement Guibas-Stolfi algorithm by the end of next week.

I know that many of you are really interested in the progress of SoC
projects and some of you are even willing to give a hand to SoC students.
For those who are interested, I attach a short summary of the conversation
with my mentor below, which might be helpful to anybody who wants to be put
on the right track about the progress of the project.

Wolf Bergenheim recommended to me to have a look at Delaunay triangulations
in TIN creation: an overview and a linear-time algorithm (
http://tinyurl.com/3s4s4h). Unfortunately, my college is not subscribed to
informaworld. If anybody manages to get an electronic version of this paper,
could you send it to me. If not, then don't worry about that. I have plenty
of other resources dealing with algorithms used for construction of DT&VD. I
have uploaded most of it
here<http://www.doc.ic.ac.uk/%7Emp107/delaunay-fun.zip>(7MB). You can
have a look at those papers if you wish to. One of the
disadvantage of some of those algorithms is that they are not so widely used
and only a few implementations can be found online. The algorithms are also
presented at a quite high theoretical level, which leaves a lot of
implementation decisions on programmer, which is blessing and curse at the
same time. They also require you to understand the theory used behind
(mainly quad-edge data structure), which is a nice challenge.

I am primarily focusing on two particular algorithms. Divide&Conquer
algorithm by Guibas-Stolfi and plane-sweep algorithm by Fortune. They both
run in *N log N* worst case time. D&C algorithm was initially presented
using quad-edge data structure, which has many nice properties. For
instance, every single edge stores also its dual edge, which means that a
dual graph of the stored graph can be easily obtained in linear time. In
practice if you have constructed DT you can get VD in linear time just by
flipping every edge to its dual, and vice versa for VD->DT(using the fact
that DT and VD are dual to each other). Another nice property is that
topological representation is entirely separated from geometrical one.
Therefore altering one keeps the other intact. Which I consider elegant from
programmer's point of view. However, those properties make this data
structure quite heavy in terms of memory usage, as Wolf have correctly
pointed out. He also suggested to use winged-edge instead, which is much
less memory demanding. Incidentally, one of the papers that is in the bundle
I have given you above, named *Improvements to Guibas-Stolfi and Fortune.pdf
* have proposed the same enhancement. This was driven mainly by observation
that traditional implementation of Guibas-Stolfi with quad-edge spends about
55% of the time of computation on various edge manipulating procedures. In
practice, winged-edge version of G-S uses less memory and is faster than its
quad-edge counterpart. Yet this costs us an option to easily switch between
graph and its dual. I plan to implement two versions of G-S algorithm. At
first I want to implement quad-edge version, since this one was described in
original paper and should be *straightforward*. Then I will make another
version with winged-edge. I plan to give user an option as an optional
parameter to choose if he/she wants to be able to obtain also dual graph, if
yes then quad-edge will be used, otherwise winged-edge. Default should be
winged-edge for efficiency reasons.

In short, G-S works like this:

   1. Sort sites along x-axis with ties resolved by y-coordinate (this makes
   all future splittings constant time operations)
   2. Recursively divide the sites into left(L) and right(R) vertical slips.
   If a current region contains 2 or 3 sites, construct DT directly, otherwise
   recurse further.
   3. Merge L-DT and R-DT (widely uses inCircle test)

The most complex of those steps is merge, which i am not going to explain,
since it has many substeps. It is nicely described (with pictures) in
*Guibas-Stolfi
divide-and-conquer.pdf* in the bundle above.

One major improvement to G-S is due by Dwyer, which improves expected time
to *N log log N* for large number of distributions including uniform
distribution in unit square. Dwyer advocates splitting a set of sites into
cells = squares with sides of length sqrt((log N)/N) (assuming the sites are
in unit square). Then you construct DT in each cell using traditional G-S
algorithm. In the next step you start merging DTs in cells in rows using
exactly the same method used for merging in G-S algorithm. You merge(+)
cells in this order: (0+1, 2+3, 4+5, ..., n+(n+1), ...) storing results in
cells (0, 2, 4, ...) then you merge (0+2, 4+6, ..., (2k)+(2k+2), ...)
storing results in (0, 4, 8, ...). You repeat this. At the end, a
triangulation of each row is stored in the cell at position 0 in that row.
Now you start merging rows in similar fashion. However, slightly modified
version of the merge proceduce has to be used this time, since you are
merging along bottom/top edges. Final DT can be find in the cell (0,0).

Nonetheless, modified version of the above algorithm(name it DW2) is
implemented in practice, but the above is used to describe how the algorithm
works and for analysis of its efficiency.
In DW2 you split the set of N sites into *sqrt(N/(log N))* slips by line
parallel to x-axis. Then you apply G-S to each slip. Remember that G-S
splits sites recursively into L and R halves by line parallel to y-axis -
this conspicuously resembles the above algorithm, but is somewhat easier to
implement. Next step is the same as in algorithm above, you merge rows.
Efficiency is rougly the same. Both run in expected time *N log log N*. If I
have enough time I would like to implement DW2.

There are many other enhancements and speed ups mentioned in *Improvements
to **Guibas-Stolfi and Fortune.pdf** *One notable is an improved inCircle
test which is largely used in the Merge procedure. Usual version of
computing determinant which after elimination of common subexpressions takes
45 arithmetic operations is improved to only 23 operations by using the test
based on y-coordinate of circumcircle centers. This can be further improved
by caching, since current top-most cross edge between L and R slips remains
the same for one iteration, some of the subexpressions remain constant and
precalculating them can decrease the number of operations of inCircle to 15.

I have yet to study Fortune's plane-sweep algorithm in more detail. There
are many improvements applicable to plane-sweep algorithm also.

Regards,
Martin Pavlovsky
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://lists.osgeo.org/pipermail/grass-dev/attachments/20080629/013073f1/attachment-0001.html