[GRASS-dev] vector library changes

Markus Neteler neteler at osgeo.org
Mon Mar 23 07:59:25 EDT 2009


Hi Markus,

On Mon, Mar 23, 2009 at 11:37 AM, Markus Metz
<markus.metz.giswork at googlemail.com> wrote:
> Hi all,
>
> I have tried to make topology building in grass7 a bit faster, with limited
> success. Some functions are now a bit faster, but there are no drastic
> changes. Some other functions are now a bit slower, and there I would like
> to know if there are objections against these changes.
>
> The first change causing a little slow down is in diglib: memory of no
> longer used structures is freed. That was mentioned as a TODO in the source
> code, probably by Radim. All I can see now is that cleaning time, e.g.
> v.in.ogr of a polygon vector, increases from e.g. 15m30s to 15m35s, IOW
> speed loss is in this case about 0.5%, but memory consumption is not really
> lower, even when cleaning large vectors with many areas (> 50,000) where
> many structures should be freed. The new functions dig_free_node(),
> dig_free_line(), dig_free_area(), and dig_free_isle() in
> diglib/struct_alloc.c are called from within plus_line.c and plus_area.c.
> Maybe someone can have a look at the new functions in struct_alloc.c to
> check if I made a mistake? AFAICT, there is no obvious mistake, resulting
> vectors are identical in my cleaning tests, no warnings or errors.
>
> Another similar slow down (about 0.5%) is caused by G_percent which I added
> to all cleaning functions. The reasoning is that users may wonder if
> anything is happening at all when importing a large polygon vector with
> v.in.ogr or cleaning a large vector, and G_percent shows that there is
> something happening. I find that reassuring to know.

Yes, that's very useful (and 0.5% is an acceptable tradeoff).
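
One common way to keep the cost of progress output small is to write only when the displayed percentage actually changes. The following is a minimal sketch of that idea in Python, not the GRASS implementation; the class name `StepProgress` and its `step` parameter are made up for illustration.

```python
class StepProgress:
    """Hypothetical helper: report progress only in fixed percentage steps."""

    def __init__(self, total, step=10):
        self.total = total
        self.step = step
        self.last = -1  # last percentage bucket that was reported

    def update(self, done):
        """Return the percentage to print, or None if nothing new to show."""
        percent = 100 if self.total <= 0 else (100 * done) // self.total
        bucket = percent - percent % self.step
        if bucket == self.last:
            return None  # same bucket as before: skip the write
        self.last = bucket
        return bucket


if __name__ == "__main__":
    progress = StepProgress(total=78095, step=10)
    for i in range(78095 + 1):
        value = progress.update(i)
        if value is not None:
            print(f"{value}%")
```

With a step of 10, a loop over many features triggers only a handful of writes, so most calls reduce to an integer division and a comparison.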

> In short, grass7 is still about as fast as grass6 with regard to cleaning
> vectors (should be a bit faster with building topology, only noticeable with
> really large vectors), but gives a bit more feedback on the progress. The
> vector API as well as the vector format is unchanged.

Here are my tests:

GRASS 6.5.svn:
time v.in.ogr usr_urb.shp out=tmp
...
78095 input polygons
Total area: 6.267569e+09 (78095 areas)
Overlapping area: 0.000000e+00 (0 areas)
Area without category: 0.000000e+00 (0 areas)
4020.05user 76.72system 1:14:33elapsed 91%CPU (0avgtext+0avgdata 0maxresident)k
298216inputs+762008outputs (26major+213960minor)pagefaults 0swaps


GRASS 7.svn:
time v.in.ogr usr_urb.shp out=tmp
...
78095 input polygons
Total area: 6.267569e+09 (78095 areas)
Overlapping area: 0.000000e+00 (0 areas)
Area without category: 0.000000e+00 (0 areas)
4137.73user 59.59system 1:15:52elapsed 92%CPU (0avgtext+0avgdata 0maxresident)k
1080inputs+793648outputs (4major+219703minor)pagefaults 0swaps
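
To compare the two runs, the `time` figures above can be converted into relative differences. This is a small Python sketch; the helper names are invented for illustration, and only the numbers are taken from the output above.

```python
def elapsed_to_seconds(text):
    """Convert an elapsed time like '1:14:33' (h:m:s) or '14:33' (m:s) to seconds."""
    seconds = 0.0
    for part in text.split(":"):
        seconds = seconds * 60 + float(part)
    return seconds


def percent_change(old, new):
    """Relative change from old to new, in percent."""
    return 100.0 * (new - old) / old


if __name__ == "__main__":
    # Values copied from the two v.in.ogr runs above.
    user_6, user_7 = 4020.05, 4137.73
    wall_6 = elapsed_to_seconds("1:14:33")
    wall_7 = elapsed_to_seconds("1:15:52")
    print(f"user time:    {percent_change(user_6, user_7):+.2f}%")
    print(f"elapsed time: {percent_change(wall_6, wall_7):+.2f}%")
```

For this single pair of runs, that works out to roughly 2.9% more user time and 1.8% more wall-clock time for GRASS 7.svn. With only one run each, run-to-run noise is not accounted for.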

Best
Markus

PS: Could it print the percentage on the same line as the message to reduce
  vertical space (in a "screen" session I didn't manage to scroll back)?
  Hmm, perhaps it is G_percent() that introduces the newline...

now:
...
-----------------------------------------------------
Remove duplicates:
 100%
-----------------------------------------------------
Clean boundaries at nodes:
 100%
...

ideally:
...
-----------------------------------------------------
Remove duplicates: 100%
-----------------------------------------------------
Clean boundaries at nodes: 100%
...
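
The "ideally" layout can be sketched as follows: keep the label and the percentage on one line, rewrite that line with a carriage return, and emit a newline only once the step is finished. This is a Python stand-in for illustration only, not G_percent() itself; `format_progress` and `report_progress` are hypothetical names.

```python
import io
import sys


def format_progress(label, done, total):
    """Build one status line, e.g. 'Remove duplicates: 100%'."""
    percent = 100 if total <= 0 else (100 * done) // total
    # Fixed-width percentage so a shorter value fully overwrites a longer one.
    return f"{label}: {percent:3d}%"


def report_progress(label, done, total, stream=sys.stderr):
    """Rewrite the current line in place; end it only when the step completes."""
    stream.write("\r" + format_progress(label, done, total))
    if done >= total:
        stream.write("\n")
    stream.flush()


if __name__ == "__main__":
    for label in ("Remove duplicates", "Clean boundaries at nodes"):
        print("-" * 53, file=sys.stderr)
        for i in range(101):
            report_progress(label, i, 100)
```

Each cleaning step then takes two lines on the terminal (separator plus status) instead of three, which helps when scrollback is limited.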

