[SAC] OSGeo Ganeti Cluster

Lance Albertson lance at osuosl.org
Thu Dec 14 12:45:55 PST 2017


Hi All,

I'm not sure if you got my original email back in July but I'm finally
ready to start scheduling this. I'd like to amend my plan below to the
following:

Summary:

   1. *Upgrade Ganeti from 2.6.2 to 2.15.2*
   2. Install CentOS 7 as the OS for all of the nodes
   3. Switch to managing said nodes with Chef instead of Cfengine

Here are the actual steps I plan to do:

   1. Upgrade Ganeti from 2.6.2 to 2.15.2 on the current cluster
   2. Migrate high priority instances from plain to drbd using
   --no-wait-for-sync [1]
   3. Fail over instances on osgeo3 to osgeo4
   4. Take osgeo3 down and reinstall its OS with CentOS 7, retaining its
   LVM data for VMs
   5. Readd osgeo3 back into the cluster using its previous configuration
   and start all the VMs back up
   6. Repeat steps 3 through 5 with osgeo4

I'd like to go ahead with #1 and then schedule a time to do #2 after
that's completed.
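
To get a feel for which instances step 2 would touch, here is a minimal
Python sketch. It assumes the instance listing from my July message (quoted
below) is still accurate, and it only filters that listing: which of the
results actually count as high priority is a separate decision. The function
name needs_conversion is made up for illustration. Plain instances have no
secondary node, so they are the ones that cannot simply be failed over in
step 3.

```python
# Instance listing copied from the quoted July message:
# (name, primary node, status, disk template)
INSTANCES = [
    ("adhoc.osgeo.osuosl.org", "osgeo4.osuosl.bak", "running", "plain"),
    ("base.osgeo.osuosl.org", "osgeo3.osuosl.bak", "ADMIN_down", "plain"),
    ("download.osgeo.osuosl.org", "osgeo3.osuosl.bak", "running", "plain"),
    ("mail.osgeo.osuosl.org", "osgeo4.osuosl.bak", "running", "plain"),
    ("projects.osgeo.osuosl.org", "osgeo4.osuosl.bak", "running", "plain"),
    ("qgis.osgeo.osuosl.org", "osgeo4.osuosl.bak", "running", "plain"),
    ("secure.osgeo.osuosl.org", "osgeo3.osuosl.bak", "running", "drbd"),
    ("tracsvn2.osgeo.osuosl.org", "osgeo3.osuosl.bak", "ADMIN_down", "plain"),
    ("tracsvn.osgeo.osuosl.org", "osgeo3.osuosl.bak", "running", "plain"),
    ("web.osgeo.osuosl.org", "osgeo3.osuosl.bak", "running", "plain"),
    ("webextra.osgeo.osuosl.org", "osgeo3.osuosl.bak", "running", "plain"),
    ("wiki.osgeo.osuosl.org", "osgeo3.osuosl.bak", "running", "plain"),
]


def needs_conversion(instances, node):
    """Return running instances whose primary is `node` and that are still plain."""
    return [
        name
        for name, primary, status, template in instances
        if primary == node and status == "running" and template == "plain"
    ]


if __name__ == "__main__":
    for node in ("osgeo3.osuosl.bak", "osgeo4.osuosl.bak"):
        print(node)
        for name in needs_conversion(INSTANCES, node):
            print("  " + name)
```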

Let me know!

[1] *The -t (--disk-template) option will change the disk template of the
instance. Currently only conversions between the plain and drbd disk
templates are supported, and the instance must be stopped before attempting
the conversion. When changing from the plain to the drbd disk template, a
new secondary node must be specified via the -n option. The option
--no-wait-for-sync can be used when converting to the drbd template in
order to make the instance available for startup before DRBD has finished
resyncing.*
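
As a rough illustration of what [1] means per instance, the Python sketch
below builds the three commands involved: stop the instance, convert it with
-t drbd and -n, then start it again. It only prints the command lines and
runs nothing. The helper name conversion_commands is hypothetical, and the
instance and secondary node in the example are simply taken from the listing
in my July message as placeholders, not a statement of which instances will
be converted.

```python
def conversion_commands(instance, new_secondary, wait_for_sync=False):
    """Build (but do not run) the commands to convert one plain instance to drbd."""
    modify = ["gnt-instance", "modify", "-t", "drbd", "-n", new_secondary]
    if not wait_for_sync:
        # Lets the instance be started before DRBD has finished resyncing.
        modify.append("--no-wait-for-sync")
    modify.append(instance)
    return [
        # The instance must be stopped before the template can be changed.
        ["gnt-instance", "shutdown", instance],
        modify,
        ["gnt-instance", "startup", instance],
    ]


if __name__ == "__main__":
    # Placeholder instance and node names, for illustration only.
    for cmd in conversion_commands("web.osgeo.osuosl.org", "osgeo4.osuosl.bak"):
        print(" ".join(cmd))
```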

On Thu, Jul 27, 2017 at 1:46 PM, Lance Albertson <lance at osuosl.org> wrote:

> OSGeo Admins,
>
> I'd like to make several changes to your Ganeti cluster to bring it up
> to a better supported platform and a newer version of Ganeti.
> Unfortunately this is going to cause some downtime for each node, but I'm
> pretty sure I can do it without losing data or causing downtime for certain VMs.
> Both of your nodes are currently running Gentoo which we haven't been
> maintaining other than for very important security issues that come up.
> Also, the version of Ganeti is currently 2.6.2 and the latest stable
> version is 2.15.2 which includes several improvements.
>
> In summary, the items I'd like to do are:
>
>    1. Install CentOS 7 as the OS for all of the nodes
>    2. Switch to managing said nodes with Chef instead of Cfengine
>    3. Upgrade Ganeti from 2.6.2 to 2.15.2 (or whatever is stable at the
>    point we get to this)
>
> This is going to need to be a multi-stage process unfortunately, but I'm
> hoping I only have to do one downtime per node. I've tested this process
> in a Vagrant environment and it seems to work.
>
> Here are the actual steps I plan to do:
>
>    1. Take osgeo3 down and reinstall its OS with CentOS 7 and retain
>    its LVM data for VMs
>    2. Install Ganeti 2.6.2 on osgeo3 using Chef so that the version stays
>    the same throughout the whole cluster
>    3. Readd osgeo3 back into the cluster using its previous configuration
>    and start all the VMs back up
>    4. Repeat the process of steps 1 through 3 with osgeo4
>    5. Upgrade Ganeti to 2.11.8 on all the nodes (I've found this to be
>    safer than jumping from 2.6.2 directly to 2.15 as they made some major
>    changes to the backend in those versions)
>    6. Finally upgrade Ganeti to 2.15.2 or whatever is latest stable at
>    the time.
>
> So my questions to you are:
>
>    1. Should any of the instances below be migrated to another node
>    during its primary node's downtime? If so and they're currently set to
>    plain, we can convert them to DRBD (which takes a short downtime,
>    depending on how large the disk is) and move them over.
>    2. When could we start doing this? I was hoping to start within the
>    next month or so but it can certainly be adjusted.
>    3. How should we communicate in real-time if we need to? Via #osuosl
>    on IRC? Other means?
>
> Instance                   Primary_node       Status      Memory  DiskUsage  Disk_template
> adhoc.osgeo.osuosl.org     osgeo4.osuosl.bak  running     4096    65536      plain
> base.osgeo.osuosl.org      osgeo3.osuosl.bak  ADMIN_down  -       4096       plain
> download.osgeo.osuosl.org  osgeo3.osuosl.bak  running     8192    158720     plain
> mail.osgeo.osuosl.org      osgeo4.osuosl.bak  running     4096    75776      plain
> projects.osgeo.osuosl.org  osgeo4.osuosl.bak  running     16384   208896     plain
> qgis.osgeo.osuosl.org      osgeo4.osuosl.bak  running     6144    167936     plain
> secure.osgeo.osuosl.org    osgeo3.osuosl.bak  running     4096    14464      drbd
> tracsvn2.osgeo.osuosl.org  osgeo3.osuosl.bak  ADMIN_down  -       86016      plain
> tracsvn.osgeo.osuosl.org   osgeo3.osuosl.bak  running     8192    106496     plain
> web.osgeo.osuosl.org       osgeo3.osuosl.bak  running     4096    36864      plain
> webextra.osgeo.osuosl.org  osgeo3.osuosl.bak  running     4096    126976     plain
> wiki.osgeo.osuosl.org      osgeo3.osuosl.bak  running     4096    20480      plain
>
> Thanks-
>
> --
> Lance Albertson
> Director
> Oregon State University | Open Source Lab
>



-- 
Lance Albertson
Director
Oregon State University | Open Source Lab