[SAC] Unable to SSH into buildbot.osgeo.org

Frank Warmerdam warmerdam at pobox.com
Wed Feb 24 11:42:28 EST 2010


Jeff McKenna wrote:
> This issue happens every day from around 2pm to 5am EST.  Someone in 
> #osgeo reported this same issue today at the same time (troubles logging 
> into xblade14, filing a trac ticket, committing to svn).  I think SAC 
> should take a look into this.

Jeff,

On osgeo1 (svn/trac) there are backup scripts that run in roughly the
2-5am EST period.  It has been my observation that these often lead
to IO saturation, and http requests then back up to the point that
the services become unavailable.

I don't know how to fix it short of migrating service(s) to a new home.
We are doing that.

So I think the key is to get svn/trac migrated to OSL as soon as practical
without putting undue amounts of time into babying osgeo1.

I'm less sure why xblade14 has bad periods.  It might be that several
buildbot runs are launched at the same time.  We also have a plan to migrate
buildbot slaves off xblade14, which would help address that particular issue
if it is the cause, but that migration is going quite slowly.  It would be
helpful if you could log in to the server ahead of the problem period
and watch top to see what is overloading the server.
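
If sitting in front of top for hours is not practical, something like the
following could log the same information to a file for later review.  This
is only a rough sketch and not anything installed on xblade14: the function
names, the one minute interval and the log path are all made up for
illustration, and the ps options assume the Linux procps version of ps.

```python
import os
import subprocess
import time


def top_processes(count=5):
    """Return the `count` most CPU-hungry processes as lines of ps output."""
    # Assumption: procps-style ps (Linux), which supports --sort.
    out = subprocess.run(
        ["ps", "-eo", "pid,pcpu,pmem,comm", "--sort=-pcpu"],
        capture_output=True, text=True, check=True,
    ).stdout.splitlines()
    return out[1:count + 1]  # skip the header row


def format_sample(stamp, load, procs):
    """Render one sample: a load-average line, then indented process lines."""
    header = "%s load: %.2f %.2f %.2f" % ((stamp,) + tuple(load))
    return "\n".join([header] + ["  " + p.strip() for p in procs])


def monitor(path, interval=60, samples=60):
    """Append `samples` snapshots to `path`, one every `interval` seconds."""
    for i in range(samples):
        stamp = time.strftime("%Y-%m-%d %H:%M:%S")
        # os.getloadavg() returns the 1, 5 and 15 minute load averages.
        text = format_sample(stamp, os.getloadavg(), top_processes())
        with open(path, "a") as log:
            log.write(text + "\n")
        if i < samples - 1:
            time.sleep(interval)
```

Calling monitor("/tmp/xblade14-load.log") shortly before the bad period
starts (the path is hypothetical) would leave a record of which processes
were busiest as the load climbed.  Note that this only captures CPU use
and load average, so an IO-bound problem would show up as high load
without an obvious CPU hog.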

Best regards,
-- 
---------------------------------------+--------------------------------------
I set the clouds in motion - turn up   | Frank Warmerdam, warmerdam at pobox.com
light and sound - activate the windows | http://pobox.com/~warmerdam
and watch the world go round - Rush    | Geospatial Programmer for Rent


