[SAC] New Hardware, can we purchase now

Alex M tech_dev at wildintellect.com
Fri Apr 13 08:39:54 PDT 2018

Chris and Harrison,

Can you confirm that this quote is acceptable and we should move on to



On 04/02/2018 10:59 AM, Alex M wrote:
> To clarify I was pondering 2 devices not 3. The answer may be you want
> the 3 we've already selected so the read cache is separate and larger.
> Please let me know if there are any other issues with the config before
> we proceed.
> Thanks,
> Alex
> On 03/30/2018 02:54 PM, Chris Giorgi wrote:
>> I'm not sure how we would go about fitting a third Optane device --
>> the quote had HHHL PCIe cards listed, not the required U.2 devices
>> which go in place of the micron sata ssds.
>> The PCIe -> U.2 interface card provides 4 PCIe 3.0 lanes to each U.2
>> interface, which then connects by cable to the drives themselves.
>> The M.2 card slot on the board should be on its own set of lanes, as
>> none of the remaining PCIe slots on the board are occupied due to
>> space constraints.
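
A rough way to put numbers on the lane question above, as a sketch only. It assumes the standard PCIe 3.0 figures (8 GT/s per lane, 128b/130b encoding) and ignores packet and protocol overhead, so real throughput will be lower. The function names are illustrative, not from any tool.

```python
# PCIe 3.0 signals at 8 GT/s per lane and uses 128b/130b line encoding.
PCIE3_GT_PER_S = 8.0
ENCODING_EFFICIENCY = 128 / 130


def link_gbytes_per_s(lanes: int) -> float:
    """Theoretical one-direction bandwidth of a PCIe 3.0 link in GB/s.

    Ignores TLP/DLLP protocol overhead, so this is an upper bound.
    """
    return lanes * PCIE3_GT_PER_S * ENCODING_EFFICIENCY / 8


def per_device_share(lanes: int, devices: int) -> float:
    """Even split of one link's bandwidth across devices sharing it."""
    return link_gbytes_per_s(lanes) / devices


if __name__ == "__main__":
    # Each U.2 device on its own x4 link (the layout described above).
    print(f"dedicated x4: {link_gbytes_per_s(4):.2f} GB/s per device")
    # Worst case raised in the thread: all 3 devices behind a single x4 link.
    print(f"3 devices on one x4: {per_device_share(4, 3):.2f} GB/s per device")
```

With dedicated x4 links each device keeps the full link to itself; sharing one x4 link three ways would cut the per-device ceiling to a third.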
>> The reason for using the more expensive (and faster) Optanes for the
>> write cache is that a write-cache failure can lead to data corruption,
>> and they have an order of magnitude more write endurance than a
>> standard SSD.
>> The read cache can use a larger, cheaper (but still fast) SSD because
>> it sees much lower write-amplification than the write cache and a
>> failure won't cause corruption.
>>    ~~~Chris~~~
>> On Fri, Mar 30, 2018 at 11:53 AM,  <harrison.grundy at astrodoggroup.com> wrote:
>>> Can someone confirm that the 4x PCIe slots aren't shared with the M.2 slot on the board and that 2 independent 4x slots are available?
>>> If all 3 (SSD, Optanes) are on a single 4x bus, it kinda defeats the purpose.
>>> Harrison
>>> Sent via the BlackBerry Hub for Android
>>>   Original Message
>>> From: tech_dev at wildintellect.com
>>> Sent: March 31, 2018 02:21
>>> To: sac at lists.osgeo.org
>>> Reply-to: tech at wildintellect.com; sac at lists.osgeo.org
>>> Cc: chrisgiorgi at gmail.com
>>> Subject: Re: [SAC] New Hardware, can we purchase now
>>> Here's the latest quote with the modifications Chris suggested.
>>> One question, any reason we can't just use the Optanes for both read &
>>> write caches?
>>> Otherwise unless there are other suggestions or clarifications, I will
>>> send out another thread for an official vote to approve. Note the price
>>> is +$1,000 more than originally budgeted.
>>> Thanks,
>>> Alex
>>> On 03/14/2018 09:47 PM, Chris Giorgi wrote:
>>>> Further investigation into the chassis shows this is the base sm is using:
>>>> https://www.supermicro.com/products/system/1U/6019/SYS-6019P-MT.cfm
>>>> It has a full-height PCIe 3.0 x8 port, as well as an M.2 PCIe 3.0 x4
>>>> slot on the motherboard.
>>>> In light of this, I am changing my recommendation to the following,
>>>> please follow-up with sm for pricing:
>>>> 2ea. Intel Optane 900p 280GB PCIe 3.0 x4 with U.2 interfaces,
>>>> replacing SATA SSDs
>>>> ..connected to either a SuperMicro AOC-SLG3-2E4R or AOC-SLG3-2E4R
>>>> (Depending on compatibility)
>>>> Then, a single M.2 SSD such as a 512GB Samsung 960 PRO in the motherboard slot.
>>>> With this configuration, the Optanes supply a very fast mirrored write
>>>> cache (ZFS ZIL SLOG), while the M.2 card provides read caching (ZFS
>>>> L2ARC), and no further cache configuration needed.
>>>> Let me know if that sounds more palatable.
>>>>    ~~~Chris~~~
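
For readers unfamiliar with the ZFS terms above, the layout could be sketched roughly as follows. This only builds and prints the commands; it does not run them. `zpool add <pool> log mirror ...` and `zpool add <pool> cache ...` are the standard forms for attaching a mirrored SLOG and an L2ARC device, but the pool name and device paths here are placeholders, not values from this thread.

```python
from typing import List


def slog_mirror_cmd(pool: str, dev_a: str, dev_b: str) -> List[str]:
    """Command to attach two devices as a mirrored ZIL SLOG (write cache)."""
    return ["zpool", "add", pool, "log", "mirror", dev_a, dev_b]


def l2arc_cmd(pool: str, dev: str) -> List[str]:
    """Command to attach a single device as L2ARC (read cache)."""
    return ["zpool", "add", pool, "cache", dev]


if __name__ == "__main__":
    # Placeholder names: substitute the real pool and /dev/disk/by-id paths.
    pool = "tank"
    optane_a = "/dev/disk/by-id/OPTANE_A"
    optane_b = "/dev/disk/by-id/OPTANE_B"
    m2_ssd = "/dev/disk/by-id/M2_SSD"

    for cmd in (slog_mirror_cmd(pool, optane_a, optane_b), l2arc_cmd(pool, m2_ssd)):
        print(" ".join(cmd))
```

The log devices are mirrored because losing them can lose in-flight sync writes; the cache device is not, since losing it only costs performance.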
>>>> On Wed, Mar 14, 2018 at 10:36 AM, Chris Giorgi <chrisgiorgi at gmail.com> wrote:
>>>>> Alex,
>>>>> Simply put, write caching requires redundant devices; read caching does not.
>>>>> The write cache can be relatively small -- it only needs to handle
>>>>> writes which have not yet been committed to disks. This allows sync
>>>>> writes to finish as soon as the data hits the SSD, with the write to
>>>>> disk being done async. Failure of the write cache device(s) may result
>>>>> in data loss and corruption, so they MUST be redundant for
>>>>> reliability.
>>>>> The read cache should be large enough to handle all hot data and much of
>>>>> the warm data. It provides a second-level cache to the in-memory block cache,
>>>>> so that cache-misses to evicted blocks can be serviced very quickly
>>>>> without waiting for drives to seek. Failure of the read cache device
>>>>> degrades performance, but has no impact on data integrity.
>>>>>   ~~~Chris~~~
>>>>> On Wed, Mar 14, 2018 at 9:05 AM, Alex M <tech_dev at wildintellect.com> wrote:
>>>>>> My overall response, I'm a little hesitant to implement so many new
>>>>>> technologies at the same time with only 1 person who knows them (Chris G).
>>>>>> My opinion
>>>>>> +1 on some use of ZFS, if we have a good guide
>>>>>> -1 on use of Funtoo. We've preferred Debian or Ubuntu for many years and
>>>>>> have more people comfortable with them.
>>>>>> +1 on trying LXD
>>>>>> +1 on Optane
>>>>>> ?0 on the SSD caching
>>>>>> 1. What tool are we using to configure write-caching on the SSDs? I'd
>>>>>> rather not be making an overly complicated database configuration.
>>>>>> 2. That seems a reasonable answer to me, though do we still need the
>>>>>> SSDs if we use the Optane for caching? It sounds to me like Optane or
>>>>>> SSD would suffice.
>>>>>> 3. Disks -  Yes if we plan to archive OSGeo Live that would benefit from
>>>>>> larger disks. I'm a -1 on storing data for the geodata committee, unless
>>>>>> they can find large data that is not publicly hosted elsewhere. At which
>>>>>> point I would recommend we find partners to host the data like GeoForAll
>>>>>> members or companies like Amazon/Google etc... Keep in mind we also need
>>>>>> to plan for backup space. Note, I don't see the total usable disk size
>>>>>> of backup in the wiki, can someone figure that out and add it? We need
>>>>>> to update https://wiki.osgeo.org/wiki/SAC:Backups
>>>>>> New question, which disk are we installing the OS on, and therefore the
>>>>>> ZFS packages?
>>>>>> Thanks,
>>>>>> Alex
>>>>>> On 03/13/2018 12:57 PM, Chris Giorgi wrote:
>>>>>>>  Hi Alex,
>>>>>>> Answers inline below:
>>>>>>> Take care,
>>>>>>>    ~~~Chris~~~
>>>>>>> On Mon, Mar 12, 2018 at 10:41 AM, Alex M <tech_dev at wildintellect.com> wrote:
>>>>>>>> On 03/02/2018 12:25 PM, Regina Obe wrote:
>>>>>>>>> I'm in IRC meeting with Chris and he recalls the only outstanding thing
>>>>>>>>> before hardware purchase was the disk size
>>>>>>>>> [15:17] <TemptorSent> From my reply to the mailing list a while back, the
>>>>>>>>> pricing for larger drives: (+$212 for 4x10he or +$540 for 4x12he)
>>>>>>>>>  [15:19] <TemptorSent> That gives us practical double-redundant storage of
>>>>>>>>> 12-16TB and 16-20TB respectively, depending how we use it.
>>>>>>>>> If that is all, can we just get the bigger disk and move forward with the
>>>>>>>>> hardware purchase.  Unless of course the purchase has already been made.
>>>>>>>>> Thanks,
>>>>>>>>> Regina
>>>>>>>> Apologies, I dropped the ball on many things while traveling for work...
>>>>>>>> My take on this, I was unclear on if we really understood how we would
>>>>>>>> utilize the hardware for the needs, since there are a few new
>>>>>>>> technologies in discussion we haven't used before. Was also in favor of
>>>>>>>> small savings as we're over the line item, and that money could be used
>>>>>>>> for things like people hours or 3rd party hosting, spare parts, etc...
>>>>>>>> So a few questions:
>>>>>>>> 1. If we get the optane card, do we really need the SSDs? What would we
>>>>>>>> put on the SSDs that would benefit from it, considering the Optane card?
>>>>>>> The Optane is intended for caching frequently read data on very fast storage.
>>>>>>> As a single unmirrored device, it is not recommended for write-caching of
>>>>>>> important data, but will serve quite well for temporary scratch space.
>>>>>>> Mirrored SSDs are required for write caching to prevent failure of a single
>>>>>>> device causing data loss. The size of the write cache is very small by
>>>>>>> comparison to the read cache, but the write-to-read ratio is much higher,
>>>>>>> necessitating the larger total DWPD*size rating. The SSDs can also provide
>>>>>>> the fast tablespace for databases as needed, which also have high write-
>>>>>>> amplification. The total allocated space should probably be 40-60% of the
>>>>>>> device size to ensure long-term endurance. The data stored on the SSDs
>>>>>>> can be automatically backed up to the spinning rust on a regular basis for
>>>>>>> improved redundancy.
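
The "DWPD*size" rating and the 40-60% allocation guideline above can be worked through as below. This is a sketch with made-up device numbers (480 GB, 3 DWPD, 5-year warranty); they are not the specs of the SSDs in the quote.

```python
from typing import Tuple


def rated_writes_tb(capacity_gb: float, dwpd: float, warranty_years: float) -> float:
    """Total rated writes in TB: capacity x drive-writes-per-day x days."""
    return capacity_gb * dwpd * 365 * warranty_years / 1000


def allocation_range_gb(
    capacity_gb: float, low: float = 0.40, high: float = 0.60
) -> Tuple[float, float]:
    """Space to allocate if only 40-60% of the device is put to use."""
    return capacity_gb * low, capacity_gb * high


if __name__ == "__main__":
    # Hypothetical device, for illustration only.
    capacity_gb, dwpd, years = 480, 3, 5
    print(f"rated writes: {rated_writes_tb(capacity_gb, dwpd, years):.0f} TB")
    lo_gb, hi_gb = allocation_range_gb(capacity_gb)
    print(f"allocate between {lo_gb:.0f} and {hi_gb:.0f} GB")
```

Comparing the rated-writes figure across candidate devices is one way to see why a small but write-heavy cache may still call for a high-endurance part.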
>>>>>>>> 2. What caching tool will we use with the Optane? Something like
>>>>>>>> fscache/CacheFS that just does everything accessed, or something
>>>>>>>> configured per site like varnish/memcache etc?
>>>>>>> We can do both if desirable, allocating large cache for the fs (L2ARC in ZFS),
>>>>>>> as well as providing an explicit cache where desirable. This configuration can
>>>>>>> be modified at any time, as the system's operation is not dependent on the
>>>>>>> caching device being active.
>>>>>>>> 3. Our storage growth is modest, not that I don't consider the quoted 8
>>>>>>>> or 10 TB to be reliable, but the 2 and 4 TB models have a lot more
>>>>>>>> reliability data, and take significantly less time to rebuild in a Raid
>>>>>>>> configuration. So how much storage do we really need for Downloads and
>>>>>>>> Foss4g archives?
>>>>>>> OSGeo-Live alone has a growth rate and retention policy that indicates a need
>>>>>>> on the order of 100GB-1TB over the next 5 years from my quick calculations,
>>>>>>> not including any additional large datasets. Supporting the geodata project
>>>>>>> would likely consume every bit of storage we throw at it and still be thirsty
>>>>>>> for more in short order, so I would propose serving only the warm data on the
>>>>>>> new server and re-purposing one of the older machines for bulk cold storage
>>>>>>> and backups once services have been migrated successfully.
>>>>>>> Remember, the usable capacity will approximately equal the total capacity of a
>>>>>>> single drive in a doubly redundant configuration with 4 drives at proper
>>>>>>> filesystem fill ratios. We'll gain some due to compression, but also want to
>>>>>>> provision for snapshots and backup of the SSD-based storage, so 1x single drive
>>>>>>> size is a good SWAG. Resilver times for ZFS are based on actual stored data,
>>>>>>> not disk size, and can be done online with minimal degradation of service, so
>>>>>>> that's a moot point I believe.
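
The capacity arithmetic above can be sketched as follows. Two things are assumptions rather than statements from the thread: that "doubly redundant" means a parity-style layout where two drives' worth of space goes to redundancy, and the 0.6 to 0.8 fill ratios, picked because they reproduce the 12-16TB figure quoted from IRC for the 4x10 option.

```python
from typing import Tuple


def usable_tb(drives: int, drive_tb: float, redundancy: int, fill_ratio: float) -> float:
    """Usable space: data drives x drive size x target fill ratio."""
    if redundancy >= drives:
        raise ValueError("redundancy must be smaller than the number of drives")
    return (drives - redundancy) * drive_tb * fill_ratio


def usable_range_tb(
    drives: int,
    drive_tb: float,
    redundancy: int = 2,
    fill_low: float = 0.6,   # assumed lower fill ratio
    fill_high: float = 0.8,  # assumed upper fill ratio
) -> Tuple[float, float]:
    """Low and high usable-space estimates for a range of fill ratios."""
    return (
        usable_tb(drives, drive_tb, redundancy, fill_low),
        usable_tb(drives, drive_tb, redundancy, fill_high),
    )


if __name__ == "__main__":
    low, high = usable_range_tb(drives=4, drive_tb=10)
    print(f"4 x 10 TB, double redundancy: about {low:.0f} to {high:.0f} TB usable")
```

The "1x single drive size" estimate in the message is more conservative than this range because it also sets space aside for snapshots and backups of the SSD-based storage.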
>>>>>>>> 4. Do we know what we plan to put on the SSD drives vs the Spinning Disks?
>>>>>>> See (1).
>>>>>>>> I think with the answers to these we'll be able to vote this week and order.
>>>>>>>> Thanks,
>>>>>>>> Alex
>>>>>>>> _______________________________________________
>>>>>>>> Sac mailing list
>>>>>>>> Sac at lists.osgeo.org
>>>>>>>> https://lists.osgeo.org/mailman/listinfo/sac