[GRASS-dev] GRASS GIS nightly builds

Newcomb, Doug doug_newcomb at fws.gov
Tue Feb 26 18:48:43 PST 2013


Ok folks,
I am a bit confused now. After setting OMP_NUM_THREADS=1 and exporting it, I
get

 100%
v.surf.rst complete.

real 352m46.451s
user 341m14.196s
sys 2m16.477s

Over 100 minutes faster than the multi-threaded run.  So the multiple cores
seem to get in each other's way...

Recompiling without OpenMP.....


Thanks!

Doug



On Mon, Feb 25, 2013 at 12:14 AM, Hamish <hamish_b at yahoo.com> wrote:

> Hi,
>
> to test the efficiency (does 650% of the CPU go 6.5x as fast as
> running 100% on a single core?) you can use the OMP_* environment
> variables.  from the bash command line:
>
>
> # try running it serially:
> OMP_NUM_THREADS=1
> export OMP_NUM_THREADS
> time g.module ...
>
>
> # let OpenMP set number of concurrent threads to number of local CPU cores
> unset OMP_NUM_THREADS
> time g.module ...
>
>
> then compare the overall & system time to complete.
> see http://grasswiki.osgeo.org/wiki/OpenMP#Run_time
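
Once both "real" times are in hand, the comparison above boils down to two ratios: speedup (serial time divided by parallel time) and efficiency (speedup divided by the number of threads). A minimal sketch follows; the function names and the timings are made up for illustration and are not part of GRASS or OpenMP.

```bash
#!/usr/bin/env bash
# Speedup and parallel efficiency from two wall-clock ("real") timings.
# Function names and the sample numbers are illustrative assumptions.

# speedup SERIAL_SECONDS PARALLEL_SECONDS
speedup() {
    awk -v s="$1" -v p="$2" 'BEGIN { printf "%.2f\n", s / p }'
}

# efficiency SERIAL_SECONDS PARALLEL_SECONDS THREADS
efficiency() {
    awk -v s="$1" -v p="$2" -v n="$3" 'BEGIN { printf "%.2f\n", (s / p) / n }'
}

# Made-up example: 600 s with one thread, 200 s with four threads.
speedup 600 200        # prints 3.00
efficiency 600 200 4   # prints 0.75
```

An efficiency near 1 means the extra threads are paying for themselves. A speedup below 1, as in the v.surf.rst result at the top of this message where the single-threaded run was faster, means the threading is costing more than it gains.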
>
> if that is horribly inefficient, it will probably be more
> efficient to run multiple (different) single-threaded jobs at
> the same time. The bash "wait" command is quite nice for that:
> it waits for all backgrounded jobs to complete before going on.
>
> for r.in.{xyz|lidar|mb} this works quite well for generating
> multiple statistics at the same time, as the jobs will all want
> to read the same part of the input file at about the same
> time, so it will still be fresh in the disk cache keeping I/O
> levels low.  (see the r3.in.xyz scripts)
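
A minimal sketch of the background-jobs-plus-"wait" pattern described above. run_job is a hypothetical stand-in: in a GRASS session its body might be one r.in.xyz call per statistic (a different method= for each job), but that mapping is an assumption and the placeholder below only sleeps and writes a log file.

```bash
#!/usr/bin/env bash
# Start several independent jobs in the background, then block until
# all of them have finished. run_job is a hypothetical placeholder.

outdir=$(mktemp -d)

run_job() {
    # Placeholder work; a real job would call a GRASS module here.
    sleep 1
    echo "done: $1" > "$outdir/$1.log"
}

for method in min max mean; do
    run_job "$method" &    # "&" sends each job to the background
done

wait                       # returns once every background job has exited
ls "$outdir"
```

Because the three jobs run concurrently, the whole script takes about as long as the slowest single job rather than the sum of all three.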
>
>
> for v.surf.bspline my plan was to put each of the data subregions
> in their own thread; for v.surf.rst my plan was to put each of
> the quadtree squares into their own thread. Since each thread
> introduces a finite amount of time to create and destroy, the
> goal is to make fewer, longer running ones. Anything more than
> about an order of magnitude more than the number of cores you
> have is unneeded overhead.
>
> e.g., processing all satellite bands at the same time is a nice
> efficient win. If you process all 2000 rows of a raster map in
> 2000 just-an-instant-to-complete threads, the ratio of
> create/destroy overhead to thread lifetime really takes its toll.
> Even as thread creation/destruction overheads become more
> efficiently handled by the OSs and compilers, the situation will
> still be the same. The interesting case is OpenCL, where your
> video card can run 500 GPU units..
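
To make the "fewer, longer-running threads" point concrete, here is a rough sketch of splitting 2000 rows into one contiguous chunk per worker instead of one task per row. chunk_ranges is a made-up helper for illustration only, and it is not how v.surf.rst or any GRASS module actually divides its work.

```bash
#!/usr/bin/env bash
# Split a run of rows into a few contiguous chunks, one per worker,
# instead of one tiny task per row. chunk_ranges is illustrative only.

# chunk_ranges TOTAL_ROWS WORKERS
# Prints one "start end" pair per line (1-based, inclusive).
chunk_ranges() {
    local total=$1 workers=$2
    local base=$(( total / workers )) extra=$(( total % workers ))
    local start=1 size i
    for (( i = 0; i < workers; i++ )); do
        # The first "extra" chunks each take one leftover row.
        size=$(( base + (i < extra ? 1 : 0) ))
        (( size == 0 )) && continue    # more workers than rows
        echo "$start $(( start + size - 1 ))"
        start=$(( start + size ))
    done
}

chunk_ranges 2000 4
```

With 4 workers this yields four chunks of 500 rows each, so the create/destroy cost is paid 4 times rather than 2000 times.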
>
>
> Hamish
>



-- 
Doug Newcomb
USFWS
Raleigh, NC
919-856-4520 ext. 14 doug_newcomb at fws.gov
---------------------------------------------------------------------------------------------------------
The opinions I express are my own and are not representative of the
official policy of the U.S.Fish and Wildlife Service or Dept. of the
Interior.   Life is too short for undocumented, proprietary data formats.