[GRASS5] GRASS 5.1 vectors ... trees .vs. tables

Radim Blazek blazek at itc.it
Mon Jun 23 03:58:43 EDT 2003


On Friday 20 June 2003 23:47, Greg Sepesi wrote:
> Six weeks ago I started using GRASS, with an interest in developing code
> to conflate maps (in particular, getting the label information from
> TIGER maps into USGS DLG maps).  I've been using GRASS 5.0, and have
> briefly experimented with the GRASS 5.1 dglib vector library.  A bit
> overwhelmed by dglib, I decided to augment a vector library I had
> previously written.  The resulting vector library, which I've called
> dgtable, presently implements the common variants of the priority first
> search (i.e., breadth first search, depth first search, minimum spanning
> tree, and shortest path algorithms) as described by Sedgewick.  I
> believe dglib is designed to do the same.
>
> Comparing the two vector libraries, the most obvious difference is
> size.  The files and line counts are listed below.
>
> dglib files                  lines
> ===========                  =====
> avl.c                       :  891
> avl.h                       :  116
> dgl.h                       :    8
> edgemgmt-template.c         :  404
> graph.c                     : 1703
> graph.h                     :  547
> graph_v1.c                  :  414
> graph_v1.h                  :  273
> graph_v2.c                  :  418
> graph_v2.h                  :  279
> heap.c                      :  179
> heap.h                      :   92
> helpers.c                   :  155
> helpers.h                   :   38
> misc-template.c             :  651
> nodemgmt-template.c         :  443
> span-template.c             :  423
> sp-template.c               :  567
> tavl.c                      :  982
> tavl.h                      :  119
> tree.c                      :  351
> tree.h                      :  172
> type.h                      :   72
> v1-defs.h                   :  183
> v2-defs.h                   :  186
>
> TOTAL LINES:                : 9666
> AVG CHARS / LINE:           : 20.4
>
>
> dgtable files                lines
> =============                =====
> dgtable.c                   : 1324
> dgtable.h                   :  132
>
> TOTAL LINES:                : 1456
> AVG CHARS / LINE:           : 19.4
>
>
> I believe code that manipulates tables is generally a little more
> concise than code that manipulates trees (AVL trees in the case of
> dglib).  So that might account for some of the difference.  By the way,
> from a performance perspective I'm still curious as to why dglib stores
> its graph data in trees.  Granted AVL trees are balanced and therefore
> have a better worst-case performance than some other trees, but the
> access of vertices is deeply nested in graph algorithms and it seems
> that the added complexity of accessing one of V vertices in a tree
> (i.e., O(logV) as opposed to O(1) for a table) would result in
> unnecessarily sluggish performance.  On the other hand, the graphics
> hardware I've worked on have generally been strapped for speed.  It is
> interesting to note that dglib has some large macros, which I assume are
> an attempt to gain some speed by executing functions in-line.
>
> Another reason for the difference in library size is due to dglib
> allowing different data structure modes.  The dglib data structure can
> be either FLAT or TREE.  I believe the intent for the dglib FLAT data
> structure is primarily for serializing the TREE structure, which is the
> structure required by most of dglib's graph algorithms.  However I don't
> understand why its TREE data structure cannot be filled/emptied
> directly.
>
> And finally, probably no small part of the difference in library size is
> due to functionality that I overlooked in dglib and didn't implement in
> dgtable.
>
> I apologize for joining the game late.  Is the GRASS 5.1 vector
> development too far along to consider using tables instead of trees?

I cannot say anything about implementation details of dglib (cc to Roberto Micarelli), 
but I thing that performance was a primary goal and it makes dglib so large and complicated.
Did you make any benchmarks to compare dglib and dgtable on real data
and real tasks (attention to cache bug in dglib)?
For me, neither size of the code nor performance is the problem of dglib,
but complexity of code (from my point of view) which makes it difficult
to maintain and extend for occasional contributors.

All network modules in 5.1 are using wrappers for dglib:
Vect_net_build_graph()
Vect_net_shortest_path()
Vect_net_get_line_cost()
Vect_net_get_node_cost()
in grass51/lib/vector/Vlib/net.c
Never can be too late to try another graph library, because changes 
are necessary only in this file.

Radim





More information about the grass-dev mailing list