Mirror drives, or whole servers instead?


Twice_Shy

Occasional Visitor
Reading this article has had me thinking - http://jrs-s.net/2015/02/06/zfs-you-should-use-mirror-vdevs-not-raidz/ - which basically argues that RAID1 (or its equivalent, mirror vdevs) is better than RAID5, and I fully understand the argument. It's about the tradeoff of storage efficiency vs. failure tolerance.

What I found myself specifically wondering though is does it make more sense to mirror drives (or vdevs) or to simply mirror entire servers instead with some kind of failover in place?

If I'm going to have 16 drives, let's say, should I put 8 drives in each of two separate towers and sync the towers, or should I put them all in one box with a single RAID controller doing the managing? What are the pros and cons each way?

Some are obvious to me - two servers will draw a bit more power, but offer a bit more protection in many incidents. An exploding PSU can wipe out every drive in a system; that shouldn't happen with a good PSU, but the point is that two boxes contain failure better. Also, more drives in one system tend to demand more expensive equipment - cheaper 'commodity' clusters could have a financial advantage, since parity calculations and such demand more of the CPU side at larger sizes (paying less for disks matters less if I need 128 gigs of RAM instead of 2 gigs on a simple mirrored cluster), though maybe with diminishing returns.
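The "contains failure better" point can be put into rough numbers. A minimal back-of-envelope sketch, where the yearly chance of a destructive chassis-level event (e.g. a PSU taking out every attached drive) is a made-up illustrative figure, not measured data:

```python
# Back-of-envelope: yearly chance a chassis-level failure destroys all data.
# P_CHASSIS_EVENT is an assumed, illustrative rate - not a real statistic.

P_CHASSIS_EVENT = 0.02  # assumed yearly chance of a destructive chassis event

# One box holding all 16 drives: a single chassis event loses everything.
single_box_loss = P_CHASSIS_EVENT

# Two mirrored towers: both must suffer an event in the same year
# (treating the events as independent) before every copy is gone.
two_towers_loss = P_CHASSIS_EVENT ** 2

print(f"single box : {single_box_loss:.4f}")
print(f"two towers : {two_towers_loss:.4f}")
```

Under the independence assumption, splitting into two synced towers turns a linear risk into a squared (much smaller) one; the real gap depends on how correlated the two boxes' environments are (shared power, shared room, shared firmware).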

From either the software support side or the general engineering side, are there strong but less obvious reasons to go one way or the other?

Assume backups ARE being done properly and that RAID 1/5 is not treated as a replacement for backups - just fault tolerance to ensure one dead disk doesn't stop the day's work or cause immediate loss of the day's data.
 

sfx2000

Part of the Furniture
I always back up filesystems, not devices... then whether it is RAID XX or some kind of file system... the data is more important than how it's smeared across things...

Then you get into stateful vs. stateless things - what is easy to rebuild vs. configs and user data...

You seem to be a bit scattered - have you sat down and defined the requirements for your project?
 

Twice_Shy

Occasional Visitor
sfx2000 said:
I always back up filesystems, not devices... then whether it is RAID XX or some kind of file system... the data is more important than how it's smeared across things...

Then you get into stateful vs. stateless things - what is easy to rebuild vs. configs and user data...

You seem to be a bit scattered - have you sat down and defined the requirements for your project?

Scattered research makes my mind jump around, I guess. :) I have a series of questions but don't want to spam the board with them all at once. I get ideas in my head, then try to refine them or look for alternatives - are there even better ideas out there than what I think makes sense? So over the months I'll keep reading, and hammer out my decisions on the major things first whenever I think I have enough information to choose. Those decisions can still change if new information comes my way, right up until I commit to buying stuff. I try to list in every post "what I think I will do and why", hoping people will shoot holes in my idea if there are flaws.

Short versions of what I'm thinking so far:

#1 My biggest requirement is a production workflow where most things ultimately end up on LTO Ultrium tape.

It is acceptable, if a bit tedious, to save the data in place from a NAS, then load a whole different workload from different tapes to play with for a while. It's more important that at least some work can continue (mostly the initial acquisition of lots of 8K video) while people are in film school with access to tools and we are 'punching above our weight' trying to do this at all. We need the raw footage in the next few years or the chance is wasted - there's no way to get it later. We may then spend a few years processing and working on it all. By then hard drive prices will hopefully have dropped a bit more, but at least we will have the lossless digital footage to work on in Stage 2 of the project. It's done when it's done, even if that's 12 years.
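The tape budget for this workflow is simple arithmetic. A quick sketch, where the cartridge capacity is LTO-7's native (uncompressed) ~6 TB and the total footage size is a placeholder assumption:

```python
# Rough tape-budget arithmetic for an LTO Ultrium archive workflow.
import math

LTO7_NATIVE_TB = 6.0   # LTO-7 native capacity per cartridge, ~6 TB
footage_tb = 96.0      # assumed total raw footage to archive (placeholder)

tapes_needed = math.ceil(footage_tb / LTO7_NATIVE_TB)
print(f"{footage_tb} TB -> {tapes_needed} cartridges (native capacity)")
```

Raw video compresses poorly, so budgeting on native rather than "compressed" capacity is the safer estimate.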

#2 Everything has to scale up intelligently. The same system should work at 32TB online as at 96TB or 288TB online, even if that's literal hardware clones going from 1 box to 3 boxes to 9 boxes. It doesn't have to be all in one box. It would be nice if performance can scale the same way with SSDs and bigger RAID stripes, from saturating 1-gigabit networking to later saturating 4-gigabit or even 10-gigabit.
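On the "saturating the network" part of #2, a rough sketch of how many spinning drives' worth of sequential throughput it takes to fill a link. The ~150 MB/s per-drive figure is an assumed typical 7200 rpm HDD rate, and link speeds are line rate, ignoring protocol overhead:

```python
# How many drives of sequential throughput it takes to saturate a link.
# DRIVE_MBPS is an assumed typical HDD rate; real stripes also lose some
# throughput to parity/mirroring and protocol overhead.
import math

def drives_to_saturate(link_gbit, drive_mbps=150.0):
    """Drives needed to fill a link of link_gbit Gbit/s at drive_mbps each."""
    link_mbps = link_gbit * 1000 / 8   # Gbit/s -> MB/s (line rate)
    return math.ceil(link_mbps / drive_mbps)

for gbit in (1, 4, 10):
    print(f"{gbit:>2} GbE -> ~{drives_to_saturate(gbit)} drive(s)")
```

The takeaway: 1 GbE (~125 MB/s) is saturated by a single modern HDD, so the stripe size that matters for performance only starts to bite at 4-10 GbE.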

#3 My Step One is pretty well defined - I build a NAS before anything else, and tinker and experiment. I have the time and can afford to do this, along with some experiments with network-boot workstations and similar after the NAS itself works reliably. The exact plan for the NAS is sorta free floating until I learn more about the future scalability and capabilities of each software/option.

#4 There will always be future steps that I try to solve well before I'm at them, so that my current plan doesn't hit a bridge it can't cross when I could have picked a different route at the beginning.


Everything is still on the table as a future option; I'd like to map more of the future before I commit to even step one, so I don't reinvent the wheel (i.e. later switching from FreeNAS to Openfiler to Nexenta). I'll probably braindump my whole thought process every time I post, still hoping that holes in my thinking get corrected before I walk too far in the wrong direction whenever I make a bad call. :)
 

jec6613

Occasional Visitor
You know, Server 2016 has exactly the answer you seek: Storage Spaces Direct. You basically buy a couple of servers (three minimum, don't try two) and stuff a bunch of drives in them. You can expand the servers with DAS later to add more drives, but Storage Spaces manages it all intelligently to survive an entire chassis failure, and it even includes flash caches if you need them. You can also patch the servers in sequence to maintain service availability. Storage can easily be added on the fly, even unevenly: if you have three servers with, say, 10, 5 and 7 disks each, it will ensure that data is mirrored at the block level across two physical servers, maintained as long as no one server ever has more than 50% of all available disks. Basically, it's Drobo for enterprises.
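The "no server may hold more than 50% of all disks" condition above is easy to see: with two block-level copies on two different servers, a server owning more than half the disks would have nowhere to place every second copy. A minimal sketch of that constraint check (an illustration of the rule, not Microsoft's actual placement code):

```python
# Illustration of the two-copy mirror placement constraint described above.
def placement_ok(disks_per_server):
    """True if no single server holds more than half of all disks,
    so each block's two copies can always land on two different servers."""
    total = sum(disks_per_server)
    return all(d <= total / 2 for d in disks_per_server)

print(placement_ok([10, 5, 7]))   # the uneven example above - fine
print(placement_ok([20, 5, 7]))   # one server dominates - mirroring breaks
```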

VMware offers its similar vSAN, which is somewhat less suited to your uses.

However, in most cases, using a good system with redundant PSUs and good server hardware is more than sufficient.
 

PolarBear

Regular Contributor
jec6613 said:
You know, Server 2016 has exactly the answer you seek: Storage Spaces Direct.

There was an (admittedly rather sensationalist) article this week mentioning that S2D has been removed from the latest build of Windows Server 2016.
https://www.theregister.co.uk/2017/10/18/vanishing_storage_spaces_direct/

From reading the comments on the article, my conclusion was that there are data integrity problems with it that Microsoft wants to iron out, but that it will not be abandoned. (Just my opinion.)

However, I don't have experience with S2D and of course, I could also be completely wrong.
 

jec6613

Occasional Visitor
PolarBear said:
There was an (admittedly rather sensationalist) article this week mentioning that S2D has been removed from the latest build of Windows Server 2016.
https://www.theregister.co.uk/2017/10/18/vanishing_storage_spaces_direct/

From reading the comments on the article, my conclusion was that there are data integrity problems with it that Microsoft wants to iron out, but that it will not be abandoned. (Just my opinion.)

However, I don't have experience with S2D and of course, I could also be completely wrong.
That's more because Windows Server was bifurcated: one build that follows the CBB Windows 10 cadence, and one that follows the LTSB. If you're deploying servers today, 99% of the time the right build to use will be the older Server 2016.

Windows Server 1709 is for a very niche set of use cases where staying on the latest OS build matters - things like MultiPoint and App-V. Of the servers I'm responsible for, exactly zero are candidates for Windows Server 1709 ... but it does open some interesting future possibilities.

And I do run S2D, and it's just as reliable (once you get enough servers) as any other storage topology.
 
