[SAC] Runaway cron job on Projects

Hamish hamish_b at yahoo.com
Thu Apr 24 22:27:51 PDT 2014


Alex wrote:

> So I think a cron job, possibly running under Martin L's user went awry
> on projects and filled 30% of the disk in hours, filling the disk
> completely.
> 
> Can someone more familiar with what these jobs are supposed to do,
> please kill the job, cleanup the excess files and figure out how to
> prevent future incidents.

It was a doc build cron job for grass6 addon modules that got stuck in a recursive loop, re-appending the text of the page to a temp file over and over until the disk filled. The job has now been killed, the massive .tmp.html file deleted, and Apache restarted.
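
As a stopgap against this kind of runaway append, the build could refuse to grow a temp file past some cap. The following is only a minimal sketch in Python, not the actual build script: the names append_capped and SizeLimitExceeded and the max_bytes parameter are hypothetical. It would not fix the loop itself, only stop it before the disk fills.

```python
import os


class SizeLimitExceeded(RuntimeError):
    """Raised when an append would push a file past its size cap."""


def append_capped(path, text, max_bytes):
    """Append text to path, refusing if the result would exceed max_bytes.

    Hypothetical helper: the real doc build may write its temp file
    in a different way.
    """
    data = text.encode("utf-8")
    # A file that does not exist yet counts as zero bytes.
    current = os.path.getsize(path) if os.path.exists(path) else 0
    if current + len(data) > max_bytes:
        raise SizeLimitExceeded(
            f"{path}: {current} + {len(data)} bytes exceeds cap of {max_bytes}"
        )
    with open(path, "ab") as fh:
        fh.write(data)
```

With a cap of a few MB per page, a stuck loop would fail with an exception (and cron would mail the error) rather than eat the disk.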

Hopefully the cause of the problem can be figured out before it happens again.
Right now the cron job is still enabled. If we (i.e. grass-dev) don't figure out the cause within 24 hours, I'd suggest disabling the cron job while we work out a fix and find an alternate place to run lower-priority jobs. That would also let the web server's cache remain in memory longer, lessening the disk I/O issues.


> FYI, while we may have finished replacing disks and batteries it seems
> that too frequent building of docs is still taking a toll on disk i/o.
> We should discuss this further in the 2014 Infrastructure proposal.

I look forward to a dedicated build server, and knowing the load it will have to cope with, I wish it luck. :)


regards,
Hamish


