[SAC] [Hosting] UNPLANNED: Ganeti hypervisor node reboot
Lance Albertson
lance at osuosl.org
Fri Aug 5 17:26:08 PDT 2022
We just had the same thing happen to another node (gprod1). All of the VMs
are back online now. The affected VMs are the following:
- apereo1.osuosl.org
- app2.osuosl.org
- area51-1.phpbb.com
- buildroot-sources.osuosl.org
- deluge.osuosl.org
- jenkins-radish.osuosl.org
- ldap1.ntf.osuosl.org
- mandrivausers2.osuosl.org
- scripts.phpbb.com
- snowdrift-app.osuosl.org
- snowdrift-smtp.osuosl.org
On Sat, Jul 30, 2022 at 6:23 PM Lance Albertson <lance at osuosl.org> wrote:
> All,
>
> One of our production Ganeti nodes (gprod3) decided to reboot on its own
> for some reason. All of the VMs should be back online but you might double
> check that your services are running properly. There was a thundering herd
> problem after the node booted where most of the VMs decided to do a file
> system check and caused high I/O. A few VMs had problems with
> services such as polkit timing out, causing other issues on the VMs. A
> reboot of those VMs seems to have fixed the issue.
>
> The affected VMs are the following:
>
> - chiral.oftc.net
> - civicrm.osm.osuosl.org
> - lf-bugs.osuosl.org
> - lf-lists.osuosl.org
> - mageiavm.osuosl.org
> - ntpsec-service3.osuosl.org
> - osu1php.osuosl.org
> - web3.osuosl.org
> - www1.phpbb.com
>
> If you have any other issues related to this, please send an email to
> support.
>
> Thanks-
>
> --
> Lance Albertson
> Director
> Oregon State University | Open Source Lab
>
--
Lance Albertson
Director
Oregon State University | Open Source Lab
_______________________________________________
Hosting mailing list
Hosting at osuosl.org
https://lists.osuosl.org/mailman/listinfo/hosting