[SAC] OSGeo7 Server Config Quote

Alex M tech_dev at wildintellect.com
Tue Feb 13 08:57:14 PST 2018


Thanks for the feedback; some comments inline. - Alex

Quick note: all mentions of RAID below refer to software RAID, not
hardware RAID.

On 02/13/2018 12:13 AM, Chris Giorgi wrote:
> Hi Alex,
> 
> Overall, this looks like a solid machine, but I do have a few suggestions
> considering the details of the hardware configuration.
> 
> -First, a RAID5 array for the spinning-rust pool may leave the pool unduly
> susceptible to complete failure during recovery from a single drive
> failure and replacement: the extreme load on all discs while recreating
> the data on the replaced disk tends to trigger a subsequent failure. Also,
> no hot spare is available, leaving the pool running in degraded mode until
> someone can physically swap the drive. A RAID6 (or ZFS RAIDZ2)
> configuration, having two drives' worth of recovery data, greatly reduces
> such risk.
> --Suggest that all 4 hot-swap bays be provisioned with the HGST models
> listed in the quote (512b emulated sector size) if the 4k native sector
> size drives are not available -- this can be worked around at the FS
> level (ashift=12) with minimal performance impact.
> 
We actually don't need that much space; 2 TB drives would have sufficed,
but I picked the smallest size they offered, which was 8 TB. What about
just going with a mirror of 2x HGST drives? Note that we do have a backup
server. Normally I would also use 4 drives, but these machines just don't
have that many bays.
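
If we did go the ZFS route Chris describes, a two-disk mirror created
with ashift=12 would cover the 512e sector-size concern. A minimal
sketch, with placeholder device paths rather than the real bays in the
quote:

  #!/usr/bin/env python3
  # Sketch only: create a 2-disk ZFS mirror with ashift=12 so the
  # 512e HGST drives are treated as 4K-sector devices.
  # Device paths are placeholders, not the actual drives in the quote.
  import subprocess

  disks = [
      "/dev/disk/by-id/ata-HGST_example_A",  # placeholder
      "/dev/disk/by-id/ata-HGST_example_B",  # placeholder
  ]

  subprocess.run(
      ["zpool", "create", "-o", "ashift=12", "archive", "mirror", *disks],
      check=True,  # fail loudly if zpool returns non-zero
  )

(That said, the plan quoted further down is mdadm + LVM rather than ZFS,
so take this purely as an illustration.)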

> -Second, RAID5 will seriously reduce the performance of the SSDs, and,
> especially on writes, increases latency, which somewhat defeats the purpose
> of utilizing SSDs. A simple mirror array would be much better performing
> and have the same level of redundancy, while a stripe could be much faster
> when used as a cache for hot data from the HDDs rather than the primary
> storage. For heavy write loads, such as databases, MLC SSDs really aren't
> suitable because of the wear rate, and they usually lack
> power-loss protection. A smaller-capacity but higher-IOPS NVMe SSD on
> the PCIe bus would be much more effective for those workloads.
> --Suggest identifying workloads needing high-speed storage and determining
> read vs. write requirements before final selection of SSDs. Use two SATA
> SSDs in a mirror or stripe configuration for bulk storage or cache.
> Consider PCIe-connected NVMe if there are a large number of writes and
> transactions.
> 

Speed isn't a huge issue; SSDs in any form seem to perform fast enough,
and we tend to use SSD storage for everything. Having slow spinning disks
at all is new, and was only suggested so we can start holding larger
archives in a publicly accessible way.

We could go to 4 drives and do a 2x2 mirror (see the sketch below). What
we want is the ability to keep going when a drive drops and to get a new
drive in within a couple of days.
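
Since the current plan for this box is software RAID + LVM rather than
ZFS, a 2x2 mirror of the SSDs would just be mdadm RAID10 feeding LVM. A
rough sketch, again with placeholder device names:

  #!/usr/bin/env python3
  # Sketch only: assemble 4 SSDs as software RAID10 (striped mirrors),
  # i.e. the "mirror 2x2" layout, then hand the md device to LVM.
  # Device names are placeholders, not the actual SSDs in the quote.
  import subprocess

  ssds = ["/dev/sda", "/dev/sdb", "/dev/sdc", "/dev/sdd"]

  subprocess.run(
      ["mdadm", "--create", "/dev/md0",
       "--level=10", "--raid-devices=4", *ssds],
      check=True,
  )
  subprocess.run(["pvcreate", "/dev/md0"], check=True)
  subprocess.run(["vgcreate", "vg_ssd", "/dev/md0"], check=True)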

Note that OSGeo6 has 6x SSDs, configured, I believe, as 2 RAID5 arrays.

Can you verify the type of SSDs? There are other options - also note
that these are not the consumer models.

> -Third, the memory really is the biggest bottleneck and resource limit, so
> I would favor increasing that as much as possible over the size of the SSD
> pool. Unused memory is used to cache filesystem contents in RAM, which is
> orders of magnitude faster than an SSD, yet it is still there for your
> workloads when needed.
> --Suggest 128GB RAM, making trade-offs against SSD capacity if budget
> requires.
> 

If you look at OSGeo6, I'm not sure we're really utilizing all the RAM
we bought:
http://webextra.osgeo.osuosl.org/munin/osgeo.org/osgeo6.osgeo.org/index.html

Though really, in this case it's about $900 to add the additional RAM to
reach 128 GB. I'm on the fence about this, since I'd prefer to buy
cheaper machines more often than to load up expensive ones.
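
For what it's worth, the page-cache usage Chris mentions is easy to
eyeball outside of Munin as well. A quick sketch that just reads
/proc/meminfo on the host:

  #!/usr/bin/env python3
  # Sketch only: report how much RAM is free vs. sitting in the page
  # cache, i.e. the "unused memory caches filesystem contents" point.
  fields = {}
  with open("/proc/meminfo") as f:
      for line in f:
          key, value = line.split(":", 1)
          fields[key] = int(value.split()[0])  # values are in kB

  for key in ("MemTotal", "MemFree", "Buffers", "Cached"):
      print(f"{key:9s} {fields[key] / 1024 / 1024:6.1f} GiB")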

> Some general comments on filesystems, software stack, and virtualization,
> in reverse order.
> 
> For most of the needs I have seen discussed, full virtualization is far
> more heavy-handed than necessary -- a container solution such as LXC/LXD
> would be much more appropriate and allow for much better granularity with
> lower overhead. A few VMs may be useful for particular projects that need
> to run their own kernel, low-level services, or suspend and move to another
> host for some reason, but those are the exception, not the rule. Many tools
> for managing VMs can also manage containers, and provisioning many
> containers off the same base template is both very easy and consumes very
> little additional disk space when used on a CoW filesystem (Copy-on-write)
> that supports cloning; additionally, backups are both instantaneous and
> only take up as much space as the changed files. My personal preference is
> to use ZFS for a filesystem because it supports all levels of the storage
> stack from disk to filesystem to snapshots and remote backup in a single
> tool and thus can detect and correct data corruption anywhere in the stack
> before it can be persisted. LVM2 and associated tools provide mostly
> similar functionality, but I find them much less intuitive and more
> difficult to administer - that may well be just a matter of
> personal taste and experience.
> 

We've already decided to go with VMs so we can migrate existing
services. In our case, administering a VM can be delegated easily. We do
plan to try out containers on OSGeo6 (the existing box), but for now we
really just need to move the existing VMs from OSGeo3 so we can retire
that hardware. These include Downloads, Wiki, Trac/SVN, and Webextra
(FOSS4G).
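
Delegation is also straightforward to script against libvirt. As a rough
illustration (assuming the libvirt-python bindings are installed on the
host), this is the kind of per-VM check a designated admin could run:

  #!/usr/bin/env python3
  # Sketch only: list the guests known to the local libvirt daemon and
  # whether they are running -- a simple delegated-admin health check.
  import libvirt

  conn = libvirt.open("qemu:///system")  # assumes access to the system URI
  try:
      for dom in conn.listAllDomains():
          state = "running" if dom.isActive() else "shut off"
          print(f"{dom.name():20s} {state}")
  finally:
      conn.close()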

> I hope this helps with the purchasing and provisioning decisions.
> 
> Take care,
>    ~~~Chris Giorgi~~~
> 
> 
> 
> 
> On Mon, Feb 12, 2018 at 1:27 PM, Regina Obe <lr at pcorp.us> wrote:
> 
>> Alex,
>>
>> This looks good to me, +1. Really excited to have a new box in place.
>>
>> I'm also thinking that with the new box, we could start off-loading
>> osgeo3 and osgeo4 and allow Lance to upgrade the ganeti on them.
>> Since we won't have anything mission critical left after we migrate the
>> mission-critical stuff to osgeo7, if the hardware on osgeo4 fails during
>> the upgrade, I assume it wouldn't be a big deal.
>> As I recall, was it only osgeo4 that had a hardware issue?
>>
>> Thanks,
>> Regina
>>
>> -----Original Message-----
>> From: Sac [mailto:sac-bounces at lists.osgeo.org] On Behalf Of Alex M
>> Sent: Monday, February 12, 2018 3:54 PM
>> To: sac >> System Administration Committee Discussion/OSGeo <
>> sac at lists.osgeo.org>
>> Subject: [SAC] OSGeo7 Server Config Quote
>>
>> Here's the latest quote for us to discuss server configuration for OSGeo7.
>>
>> https://drive.google.com/open?id=1X-z66jXXBUZuPqh6EP0d43g2NUCL7xcL
>>
>> The plan, based on discussions, is to manage KVM virtual machines on LVM
>> volumes with libvirt. If at some point we feel we need to go to something
>> more advanced because we are managing multiple physical machines, we
>> could convert to Ganeti or OpenStack (I'm less sure about how to convert
>> to OpenStack).
>>
>> The idea was up to 4 virtual machines, each with someone designated to
>> make sure it stays updated, along with use of unattended upgrades for
>> security patches.
>>
>> As quoted, I've done RAID 5 SSD and RAID 5 traditional, 3 drives each.
>> That will give us both fast storage and large storage (think downloads
>> and FOSS4G archives).
>>
>> I did redundant power to maximize uptime.
>>
>> RAM is only 64 GB, which is up to 16 GB for each of the virtual machines.
>>
>> Please discuss and ask questions so we can possibly vote this week at the
>> meeting.
>>
>> Thanks,
>> Alex
>> _______________________________________________
>> Sac mailing list
>> Sac at lists.osgeo.org
>> https://lists.osgeo.org/mailman/listinfo/sac
>>