[SAC] New Hardware, can we purchase now

Alex M tech_dev at wildintellect.com
Wed Mar 14 09:05:28 PDT 2018


My overall response: I'm a little hesitant to implement so many new
technologies at the same time with only one person who knows them (Chris G).

My opinion
+1 on some use of ZFS, if we have a good guide
-1 on use of Funtoo; we've preferred Debian or Ubuntu for many years and
have more people comfortable with them.
+1 on trying LXD
+1 on Optane
?0 on the SSD caching

1. What tool are we using to configure write-caching on the SSDs? I'd
rather not end up with an overly complicated database configuration.

2. That seems a reasonable answer to me, though do we still need the
SSDs if we use the Optane for caching? It sounds to me like Optane or
SSD would suffice.

3. Disks - Yes, if we plan to archive OSGeo Live, that would benefit from
larger disks. I'm -1 on storing data for the geodata committee unless they
can find large data that is not publicly hosted elsewhere, at which point I
would recommend we find partners to host the data, such as GeoForAll members
or companies like Amazon/Google. Keep in mind we also need to plan for
backup space. Note: I don't see the total usable disk size of backup in the
wiki; can someone figure that out and add it? We need to update
https://wiki.osgeo.org/wiki/SAC:Backups

New question: which disk are we installing the OS on, and therefore the
ZFS packages?

Thanks,
Alex

On 03/13/2018 12:57 PM, Chris Giorgi wrote:
>  Hi Alex,
> Answers inline below:
> Take care,
>    ~~~Chris~~~
> 
> On Mon, Mar 12, 2018 at 10:41 AM, Alex M <tech_dev at wildintellect.com> wrote:
>> On 03/02/2018 12:25 PM, Regina Obe wrote:
>>> I'm in IRC meeting with Chris and he recalls the only outstanding thing
>>> before hardware purchase was the disk size
>>>
>>> [15:17] <TemptorSent> From my reply to the mailing list a while back, the
>>> pricing for larger drives: (+$212 for 4x10he or +$540 for 4x12he)
>>>  [15:19] <TemptorSent> That gives us practical double-redundant storage of
>>> 12-16TB and 16-20TB respectively, depending how we use it.
>>>
>>>
>>> If that is all, can we just get the bigger disk and move forward with the
>>> hardware purchase.  Unless of course the purchase has already been made.
>>>
>>>
>>> Thanks,
>>> Regina
>>>
>>
>> Apologies, I dropped the ball on many things while traveling for work...
>>
>> My take on this: I was unclear on whether we really understood how we
>> would utilize the hardware for our needs, since there are a few new
>> technologies in discussion that we haven't used before. I was also in
>> favor of small savings, as we're over the line item, and that money could
>> be used for things like people hours, 3rd-party hosting, spare parts, etc.
>>
>> So a few questions:
>> 1. If we get the optane card, do we really need the SSDs? What would we
>> put on the SSDs that would benefit from it, considering the Optane card?
> 
> The Optane is intended for caching frequently read data on very fast storage.
> As a single unmirrored device, it is not recommended for write-caching of
> important data, but will serve quite well for temporary scratch space.
> 
> Mirrored SSDs are required for write caching to prevent failure of a single
> device causing data loss. The size of the write cache is very small by
> comparison to the read cache, but the write-to-read ratio is much higher,
> necessitating the larger total DWPD*size rating. The SSDs can also provide
> the fast tablespace for databases as needed, which also have high write-
> amplification. The total allocated space should probably be 40-60% of the
> device size to ensure long-term endurance. The data stored on the SSDs
> can be automatically backed up to the spinning rust on a regular basis for
> improved redundancy.
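For concreteness, the split described above maps onto ZFS roughly as below.
The pool and device names are hypothetical; this is a sketch of the proposed
layout, not a tested configuration:

```shell
# Sketch of the proposed layout (pool/device names hypothetical):
# - Optane card as L2ARC (fast read cache, safe to lose)
# - mirrored SSDs as the SLOG (synchronous write log, must be redundant)
zpool add tank cache nvme0n1            # Optane: read cache (L2ARC)
zpool add tank log mirror sda1 sdb1     # mirrored SSDs: write log (SLOG)
zpool status tank                       # confirm the cache and log vdevs
```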
> 
>> 2. What caching tool will we use with the Optane? Something like
>> fscache/CacheFS that just does everything accessed, or something
>> configured per site like varnish/memcache etc?
> 
> We can do both if desired, allocating a large cache for the fs (L2ARC in
> ZFS) as well as providing an explicit cache where desirable. This
> configuration can be modified at any time, as the system's operation is
> not dependent on the caching device being active.
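Since an L2ARC device only holds copies of data already in the pool, it can
be attached and detached from a live pool at will; a hypothetical example
(device name assumed):

```shell
# Cache (L2ARC) devices can be added or removed online with no data loss,
# because they only hold copies of pool data (device name hypothetical).
zpool add tank cache nvme0n1    # start caching reads on the Optane
zpool remove tank nvme0n1       # detach it later; the pool keeps running
```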
> 
>> 3. Our storage growth is modest. Not that I consider the quoted 8 or
>> 10 TB drives unreliable, but the 2 and 4 TB models have a lot more
>> reliability data and take significantly less time to rebuild in a RAID
>> configuration. So how much storage do we really need for Downloads and
>> the FOSS4G archives?
> 
> OSGeo-Live alone has a growth rate and retention policy that indicates
> needs on the order of 100GB-1TB over the next 5 years from my quick
> calculations, not including any additional large datasets. Supporting the
> geodata project would likely consume every bit of storage we throw at it
> and still be thirsty for more in short order, so I would propose serving
> only the warm data on the new server and re-purposing one of the older
> machines for bulk cold storage and backups once services have been
> migrated successfully.
> 
> Remember, the usable capacity will approximately equal the total capacity
> of a single drive in a doubly redundant configuration with 4 drives at
> proper filesystem fill ratios. We'll gain some due to compression, but
> also want to provision for snapshots and backup of the SSD-based storage,
> so 1x single drive size is a good SWAG. Resilver times for ZFS are based
> on actual stored data, not disk size, and can be done online with minimal
> degradation of service, so that's a moot point, I believe.
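As a back-of-envelope check on those numbers, assuming a 4x10TB RAIDZ2
(double-parity) layout and an ~80% fill-ratio rule of thumb, both of which
are assumptions rather than decisions:

```shell
# Rough usable-capacity estimate for 4 drives with double redundancy.
# RAIDZ2 and the 80% fill ratio are assumed, not decided.
DRIVES=4; DRIVE_TB=10; PARITY=2
RAW=$((DRIVES * DRIVE_TB))                 # 40 TB raw
USABLE=$(((DRIVES - PARITY) * DRIVE_TB))   # 20 TB after double parity
PRACTICAL=$((USABLE * 80 / 100))           # 16 TB at ~80% fill
echo "raw=${RAW}TB usable=${USABLE}TB practical=${PRACTICAL}TB"
```

Provisioning for snapshots and SSD backups out of that practical figure is
what brings the SWAG down toward a single drive's capacity.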
> 
>> 4. Do we know what we plan to put on the SSD drives vs the Spinning Disks?
> 
> See (1).
> 
>> I think with the answers to these we'll be able to vote this week and order.
>>
>> Thanks,
>> Alex
>> _______________________________________________
>> Sac mailing list
>> Sac at lists.osgeo.org
>> https://lists.osgeo.org/mailman/listinfo/sac


