[SAC] Re: [OSGeo] #574: OSGeo SVN server(s) unresponsive during
certain hours of the day
OSGeo
trac_osgeo at osgeo.org
Wed May 26 18:17:50 EDT 2010
#574: OSGeo SVN server(s) unresponsive during certain hours of the day
---------------------+------------------------------------------------------
Reporter: jng | Owner: sac at lists.osgeo.org
Type: defect | Status: new
Priority: normal | Component: Systems Admin
Resolution: | Keywords:
---------------------+------------------------------------------------------
Comment (by hamish):
Replying to [comment:1 crschmidt]:
> 2. We don't know why.
I don't like that situation, so I wrote this script and have started
running it on xblade13 and xblade14 as a test. I will post a plot of the
results after some time. If you think it is useful, feel free to run it on
the more strained servers.
log_cpu.sh:
{{{
#!/bin/sh
# script to log cpu use etc.
# log every 5 minutes
interval=300
outfile=~/"cpu_use.`hostname`.log"
#echo "Will consume about $((50 * 3600/300 * 24 / 1024)) kb/day"
echo "#year/day hr:min TZ cpu_1min_avg cpu_5min_avg cpu_15min_avg cpu_hog hog_cpu% free_mem_mb" >> "$outfile"
while [ 1 -eq 1 ] ; do
    unset TIMESTAMP CPU_USAGE CPU_HOG FREE_MEM
    TIMESTAMP=`date -u '+%Y/%j %k:%M UTC'`
    CPU_USAGE=`uptime | cut -f5 -d: | sed -e 's/,//g' -e 's/^ //'`
    CPU_HOG=`top -b -n 1 | sed -e '1,7d' | head -n 1 | awk '{print $12 " " $9}'`
    FREE_MEM=`free -m | grep 'buffers/cache' | awk '{print $4}'`
    sleep 1
    echo "$TIMESTAMP $CPU_USAGE $CPU_HOG $FREE_MEM" >> "$outfile"
    sleep `expr $interval - 1`
done
}}}
Example output from this morning on xblade13:
{{{
#year/day hr:min TZ cpu_1min_avg cpu_5min_avg cpu_15min_avg cpu_hog hog_cpu% free_mem_mb
2010/146 20:21 UTC 0.00 0.03 0.00 rhn-applet-gui 2.0 894
2010/146 20:26 UTC 0.01 0.02 0.00 rhn-applet-gui 2.0 896
2010/146 20:31 UTC 0.00 0.00 0.00 init 0.0 896
2010/146 20:36 UTC 0.25 0.10 0.04 top 2.0 896
2010/146 20:41 UTC 0.06 0.06 0.02 top 2.0 898
2010/146 20:46 UTC Xvnc 1.9 898
2010/146 20:51 UTC top 2.0 899
2010/146 20:56 UTC top 3.9 898
2010/146 21:01 UTC top 1.9 896
2010/146 21:06 UTC init 0.0 896
2010/146 21:11 UTC init 0.0 897
2010/146 21:16 UTC httpd 1.9 897
2010/146 21:21 UTC top 3.9 897
2010/146 21:26 UTC top 1.9 900
2010/146 21:31 UTC top 1.9 897
2010/146 21:36 UTC top 3.9 898
2010/146 21:41 UTC top 2.0 898
2010/146 21:46 UTC 0.01 0.03 0.00 top 1.9 898
2010/146 21:51 UTC 0.05 0.03 0.00 init 0.0 900
2010/146 21:56 UTC 0.20 0.11 0.02 init 0.0 901
2010/146 22:01 UTC 0.14 0.09 0.03 httpd 2.0 900
2010/146 22:06 UTC 0.00 0.04 0.01 top 2.0 901
2010/146 22:11 UTC 0.17 0.11 0.03 nscd 2.0 900
}}}
Hmm, that's weird: the load averages are missing from 20:46 to 21:41, which
looks like an `uptime` parsing bug. I may have to replace the `cut -f5 -d:`
step with `sed -e 's/.*average://'`.
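The gap in the log lasts exactly one hour, which would fit `uptime` switching from an "N days, h:mm" form to an "N days, M min" form: with one colon fewer on the line, field 5 of `cut -d:` comes back empty. Below is a minimal sketch of the `sed` approach, assuming Linux-style `uptime` output. The function name `parse_load` and the two sample lines are made up for illustration and were not captured from xblade13.

```shell
#!/bin/sh
# Extract the three load averages from one line of `uptime` output.
# Anchoring on the "load average:" label, instead of counting ':'
# separated fields, keeps the result stable when the uptime part of
# the line has no "h:mm" component.
parse_load() {
    printf '%s\n' "$1" | sed -e 's/.*load averages*: *//' -e 's/,//g'
}

# Illustrative sample lines (hypothetical values):
parse_load ' 20:41:02 up 41 days, 23:55,  2 users,  load average: 0.06, 0.06, 0.02'
parse_load ' 20:46:02 up 42 days, 0 min,  2 users,  load average: 0.00, 0.01, 0.00'
```

In log_cpu.sh this would amount to something like ``CPU_USAGE=`uptime | sed -e 's/.*load averages*: *//' -e 's/,//g'` ``, untested on the servers in question.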
Personally, I just trained myself not to be on the computer from 7-10pm
local time as a workaround for this issue :)
regards,
Hamish
--
Ticket URL: <https://trac.osgeo.org/osgeo/ticket/574#comment:2>
OSGeo <http://www.osgeo.org/>
OSGeo committee and general foundation issue tracker.