[SAC] Re: [OSGeo] #574: OSGeo SVN server(s) unresponsive during certain hours of the day

OSGeo trac_osgeo at osgeo.org
Thu May 27 23:34:48 EDT 2010


#574: OSGeo SVN server(s) unresponsive during certain hours of the day
---------------------+------------------------------------------------------
  Reporter:  jng     |       Owner:  sac at lists.osgeo.org
      Type:  defect  |      Status:  new                
  Priority:  normal  |   Component:  Systems Admin      
Resolution:          |    Keywords:                     
---------------------+------------------------------------------------------
Comment (by hamish):

 crschmidt wrote:
 > >   2. We don't know why.
 Martin S wrote:
 > BTW, I disagree on 2.).

 so....... what is it?
 do we have proof or just long term suspicion?
 LDAP on osgeo1 going haywire?

 after 24 hours of load monitoring on xblade13 and 14 (see plot image
 attached to this ticket):


 xblade13 (1.5GHz Athlon, 1GB RAM)
  - pretty quiet except for 7:49 through 9:54 UTC, when rsync is running
    full bore (timing wrt the 8 o'clock "dead hour" is rather suspicious...)

 xblade14 (1.5GHz Athlon, 1GB RAM)
  - periodic large httpd bursts (flushing queue after a stall?),
  - a number of mapserver jobs chewing up cpu,
  - buildbots running (pid 3798 gdal build stuck & consuming a lot of RAM?)
  - 24 hr avg load ~ 70% CPU utilization; min. 250 MB RAM free


 full logs available on request or just collect some yourself -- improved
 load monitoring script:

 {{{
 #!/bin/sh

 # script to log cpu use etc.

 # log every 5 minutes
 interval=300

 outfile=~/"cpu_use.`hostname -s`.log"

 #echo "Will consume about $((50 * 3600/$interval * 24 / 1024)) kb/day"

 echo "#year/day hr:min TZ cpu_1min_avg cpu_5min_avg cpu_15min_avg cpu_hog hog_cpu% free_mem_mb" >> "$outfile"

 while [ 1 -eq 1 ] ; do
    unset TIMESTAMP CPU_USAGE CPU_HOG FREE_MEM
    TIMESTAMP=`date -u '+%Y/%j %k:%M UTC'`
    CPU_USAGE=`uptime | sed -e 's/^.*average://' -e 's/,//g' -e 's/^ //'`
    # name ($12) and CPU% ($9) of the busiest process in top's batch output
    CPU_HOG=`top -b -n 1 | sed -e '1,7d' | head -n 1 | awk '{print $12 " " $9}'`
    FREE_MEM=`free -m | grep 'buffers/cache' | awk '{print $4}'`
    sleep 1
    echo "$TIMESTAMP $CPU_USAGE $CPU_HOG $FREE_MEM" >> "$outfile"
    sleep `expr $interval - 1`
 done
 }}}
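
 Summary figures such as a mean load and the minimum free RAM can be pulled
 out of such a log with a few lines of awk. This is only a sketch: it assumes
 the nine-column record layout written by the script above, and
 summarize_log is a made-up helper name, not part of that script. Note that
 it averages the 1-minute load average, which is not the same thing as a CPU
 utilization percentage.

```shell
#!/bin/sh
# Sketch: summarise a log written by the monitoring script.
# Assumed record layout (one per line):
#   year/day hr:min UTC load1 load5 load15 hog_name hog_cpu% free_mem_mb
# summarize_log is a hypothetical helper name.
summarize_log() {
    awk '
        /^#/ || NF < 9 { next }      # skip the header and incomplete lines
        {
            sum += $4; n++           # $4 is the 1-minute load average
            # $NF is free memory in MB; track the smallest value seen
            if (min == "" || $NF + 0 < min) min = $NF + 0
        }
        END {
            if (n > 0)
                printf("samples=%d avg_load1=%.2f min_free_mb=%d\n",
                       n, sum / n, min)
        }
    ' "$1"
}

# Example call (path as used by the monitoring script):
#   summarize_log ~/"cpu_use.`hostname -s`.log"
```

 With the 5 minute interval a full day should give about 288 samples, so a
 much lower count hints that the logger itself stalled for a while.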

 (plot)
 {{{
 file=cpu_use.xblade13
 cat $file.log | sed -e 's/^#.*//' | cut -f2 -d/ | \
   tr ':' ' ' | tr -s ' ' | cut -f1-3,5 -d' ' | awk \
   '{ if(/./) {printf("%f %s\n", $1 + $2/24 + $3/(24*60), $4)} else {print} }' \
   > $file.prn
 # ...

 ( cat << EOF
 set terminal svg size 800 480
 set output "cpuload.svg"
 set grid
 set xlabel 'Time (day of year, UTC)'
 set ylabel 'CPU load (1 minute average)'
 set title 'xblade loads, May 26 2010'
 set label "httpd" at 147.332, 6.5
 set label "rsync" at 147.278, 3.9
 set arrow from 146.8,1 to 148,1 nohead lt -1 lw 0.75

 plot "cpu_use.xblade14.prn" title 'xblade14' with lines lt 8, \
      "cpu_use.xblade13.prn" title 'xblade13' with lines lt 3
 EOF
 ) | gnuplot

 inkscape --file=cpuload.svg --export-png=cpuload.png -b white
 }}}
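
 The x axis is in fractional days of the year, so placing a label or arrow
 at a given UTC time needs the same day + hr/24 + min/(24*60) conversion
 used in the awk step above. A small sketch (doy_frac is a made-up helper
 name, not something from the scripts above):

```shell
#!/bin/sh
# Sketch: convert a day-of-year and an HH:MM time (UTC) into the fractional
# day value used on the plot's x axis. doy_frac is a hypothetical helper.
doy_frac() {
    # split on the space and the colon: $1 = day, $2 = hour, $3 = minute
    echo "$1 $2" | awk -F'[ :]' \
        '{ printf("%.3f\n", $1 + $2/24 + $3/(24*60)) }'
}

# Example call:
#   doy_frac 146 12:00
```

 For example, noon on day 146 comes out as 146.500, and a 7:49 start falls
 at about .326 of a day.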


 regards,
 Hamish

-- 
Ticket URL: <https://trac.osgeo.org/osgeo/ticket/574#comment:3>
OSGeo <http://www.osgeo.org/>
OSGeo committee and general foundation issue tracker.

