[GRASS-user] r.neighbors velocity
Hamish
hamish_b at yahoo.com
Sat Jun 29 04:26:14 PDT 2013
Markus Metz wrote:
> Some more results with Sören's test program on a Intel(R) Core(TM) i5
> CPU M450 @ 2.40GHz (2 real cores, 4 fake cores) with gcc 4.7.2 and
> clang 3.3
>
> gcc -O3
> v is 2.09131e+13
>
> real 2m0.393s
> user 1m57.610s
> sys 0m0.003s
>
> gcc -Ofast
> v is 2.09131e+13
>
> real 0m7.218s
> user 0m7.018s
> sys 0m0.017s
Nice. One thing we need to remember, though, is that the speedup is not
entirely free: one thing -Ofast turns on is -ffast-math,
"""
This option is not turned on by any -O option besides -Ofast since it can
result in incorrect output for programs that depend on an exact
implementation of IEEE or ISO rules/specifications for math functions. It
may, however, yield faster code for programs that do not require the
guarantees of these specifications.
"""
which may not be fit for our purposes.
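
To illustrate the kind of code that depends on exact IEEE behaviour, here is a minimal sketch of Kahan (compensated) summation in C. The function names naive_sum and kahan_sum are made up for this example and are not from GRASS or from Sören's test program. The correction term is algebraically zero, so a compiler that is allowed to reassociate floating-point operations (as -ffast-math permits) may simplify it away, which would leave kahan_sum no more accurate than naive_sum.

```c
#include <stddef.h>

/* Plain accumulation: rounding error builds up with every addition. */
float naive_sum(float x, size_t n)
{
    float sum = 0.0f;
    size_t i;

    for (i = 0; i < n; i++)
        sum += x;
    return sum;
}

/* Kahan (compensated) summation: c carries the low-order bits that
 * were lost when y was added to sum. */
float kahan_sum(float x, size_t n)
{
    float sum = 0.0f;
    float c = 0.0f;
    size_t i;

    for (i = 0; i < n; i++) {
        float y = x - c;
        float t = sum + y;

        /* Algebraically (t - sum) - y == 0. In strict IEEE arithmetic
         * it is the rounding error of the addition above. With
         * reassociation allowed, the compiler may fold it to zero. */
        c = (t - sum) - y;
        sum = t;
    }
    return sum;
}
```

Built without -ffast-math, kahan_sum(0.1f, 10000000) stays close to 1e6 while naive_sum drifts well away from it. Comparing the two results between a -O3 build and a -Ofast build is one quick way to check whether a given compiler version actually drops the compensation.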
With the ifort compiler there is '-fp-model precise', which allows only
those optimizations that are value-safe for floating-point results. Maybe
gcc has something similar.
Glad to see -floop-parallelize-all in gcc 4.7, it will help us identify
places to focus OpenMP work on.
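
As a rough sketch of the kind of loop that OpenMP work would target, here is a sum over an array written as an OpenMP reduction. The function name array_sum is hypothetical and not taken from the GRASS source. Note that a reduction also changes the order of the additions, so the last bits of a floating-point result can differ from the serial loop.

```c
#include <stddef.h>

/* Sum an array; with -fopenmp the iterations are split across threads
 * and the per-thread partial sums are combined at the end. */
double array_sum(const double *a, long n)
{
    double sum = 0.0;
    long i;

#pragma omp parallel for reduction(+:sum)
    for (i = 0; i < n; i++)
        sum += a[i];

    return sum;
}
```

Compile with gcc -fopenmp to enable the pragma; without that flag gcc ignores it and the loop simply runs serially.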
Hamish