[SAC] DRBD on the new servers or not?

Sat Mar 27 17:11:09 EDT 2010

Alex Mandel wrote:
> DRBD is software that basically behaves like a RAID 1 (mirror) on a
> virtual machine. The advantages I've been told about are failover
> redundancy and the ability to apply updates without downtime (you apply
> them to the mirror and then swap). The disadvantages are that it takes
> exactly double the disk space and potentially leads to slower disk I/O,
> because each write has to wait until it finishes in 2 places (possibly
> on 2 machines).
> Here's the info on the software: http://www.drbd.org/
> 
> OSL has offered this feature on a per-VM basis, meaning we can choose to
> have it on or not for each particular VM.
> 
> For Backup the decision was easy: no. But for the other VMs the
> decision is a little trickier, as the HD sizes are small and 100% uptime
> might be good for some services.
> 
> I'll let Martin, who seems to have used it before, comment some more.
> 
> Unless I'm told otherwise here's the current plan:
> http://wiki.osgeo.org/wiki/Infrastructure_Transition_Plan_2010#Final_Plan
> 
> We need to finish installing the Base image and make a decision about
> DRBD for Wiki and Secure before Monday so OSL can go ahead and create
> those VMs for us. In the future we will have access and instructions to
> build VMs ourselves; the first batch is being done by OSL so they can
> test the instructions.
> 
> Thanks,
> Alex

After reading a little more about it, I would say that a normal
(synchronous) DRBD setup is not in our best interest, specifically
because we did not buy identical servers for an HA (high availability)
cluster. Looking at the docs, though, we could use asynchronous
mirroring instead:

"The other option is asynchronous mirroring. That means that the entity
that issued the write requests is informed about completion as soon as
the data is written to the local disk.

Asynchronous mirroring is necessary to build mirrors over long
distances, i.e., the interconnecting network's round trip time is higher
than the write latency you can tolerate for your application. (Note: The
amount of data the peer node may fall behind is limited by
bandwidth-delay product and the TCP send buffer.)"

I think this would remove the issue of DRBD causing delays from I/O
waiting on the mirror. If that sounds good to people, I will ask OSL
specifically for this type of DRBD. Also note that it looks like the
mirror lives on the other server, so for VMs on osgeo4 we have to think
extra hard to make sure their DRBD traffic won't add significant load on
osgeo3. We might consider placing anything we feel needs DRBD on osgeo3
to start with.
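
To get a feel for the note in the quoted docs about how far the peer
can fall behind, here is a rough sketch in Python. It only works out the
two quantities the docs name (the bandwidth-delay product of the link
and the TCP send buffer size). The link speed, round-trip time, and
buffer size are made-up placeholders, not measurements from osgeo3 or
osgeo4, and the function name is my own. As far as I know, DRBD calls
this asynchronous mode "protocol A".

```python
def bandwidth_delay_product_bytes(link_bits_per_s: int, rtt_us: int) -> int:
    """Bytes that can be in flight on the replication link at one time.

    link_bits_per_s: link speed in bits per second
    rtt_us: round-trip time between the two servers in microseconds
    """
    # bits/s * microseconds / 1e6 gives bits; dividing by 8 gives bytes.
    return link_bits_per_s * rtt_us // 8_000_000


# Hypothetical values, chosen only for illustration.
LINK_BITS_PER_S = 1_000_000_000      # assumed 1 Gbit/s link
RTT_US = 500                         # assumed 0.5 ms round trip
TCP_SEND_BUFFER_BYTES = 128 * 1024   # assumed 128 KiB send buffer

bdp = bandwidth_delay_product_bytes(LINK_BITS_PER_S, RTT_US)
print(f"bandwidth-delay product: {bdp} bytes")
print(f"TCP send buffer:         {TCP_SEND_BUFFER_BYTES} bytes")
```

With numbers in this range, the amount of data that has been
acknowledged locally but not yet written on the peer stays small, which
is the trade-off we would be accepting in exchange for faster writes.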

Thanks,
Alex