How To Build a Cheap Petabyte Server: Lessons Learned


markymark12

New Around Here
Congrats on your work on this forum; it's definitely a reference for me now.

I am thinking about offering backup services to the small companies I already take care of (as an external IT admin). I want to host a backup server at a solid hosting company, which I have already found. Since I will pay per "U" of rack space, I need to go with the densest backup server solution possible.
I found the Backblaze Pod very appealing. Here are my questions; any help, comments, or advice will be welcome. Thanks in advance.

1. What about R1Soft backup software for doing a "complete system backup" that allows bare metal restore? Have you heard of it, have you used it, and did you like it?

2. How much "usable" space will be left after formatting + RAID setup when using 45 x 1.5TB drives in RAID 6? What is the calculation method?

3. Am I right to consider RAID 6 the "safest/easiest" way to handle disk failures? I'm not sure I understood the setup they proposed (three 15 x 1.5TB RAID 6 volumes?). Does it mean two drives per volume can fail before losing anything?

4. Bare metal restore: let's say a customer's server fails totally, or gets stolen. They buy a brand new, totally different server (different CPU, hard disks, motherboard, RAID controller). How much will those different components affect the bare metal restore? Will the bare metal restore work as long as the new OS partition is the same size as or bigger than the original one? (Not sure I'm being clear enough with that question.)
 
I found the Backblaze Pod very appealing. Here are my questions; any help, comments, or advice will be welcome. Thanks in advance.
Keep in mind that the Backblaze pod is not a NAS, but a high-density storage module that is part of a web-based backup service. There is a lot of software that doesn't reside on the pod but is instrumental in how it works.

1. What about R1Soft backup software for doing a "complete system backup" that allows bare metal restore? Have you heard of it, have you used it, and did you like it?
See comment above.

2. How much "usable" space will be left after formatting + RAID setup when using 45 x 1.5TB drives in RAID 6? What is the calculation method?
As noted in the article, formatted (usable) space is about 87% of the 67TB of raw hard drive capacity, or about 58TB per pod.
Use any RAID 6 calculator you can find by Googling.
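For a rough sanity check (assuming the 3 x 15-drive RAID 6 layout described in the article, where each array gives up two drives' worth of space to parity), the arithmetic works out like this:

Code:
echo "scale=1; 3 * (15 - 2) * 1.5" | bc        # 58.5 TB usable before filesystem overhead
echo "scale=1; 3 * (15 - 2) * 100 / 45" | bc   # 86.6 -> roughly the 87% figure quoted above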

3. Am I right to consider RAID 6 the "safest/easiest" way to handle disk failures? I'm not sure I understood the setup they proposed (three 15 x 1.5TB RAID 6 volumes?). Does it mean two drives per volume can fail before losing anything?
A RAID 6 array will withstand two drive failures without data loss.

4. Bare metal restore: let's say a customer's server fails totally, or gets stolen. They buy a brand new, totally different server (different CPU, hard disks, motherboard, RAID controller). How much will those different components affect the bare metal restore? Will the bare metal restore work as long as the new OS partition is the same size as or bigger than the original one? (Not sure I'm being clear enough with that question.)
That's really outside the scope of the article. The answer depends on the bare metal backup program you use.
 
I meant: how do you build a custom 16TB+ storage server on a well-known OS?

Alright, I didn't ask the right questions:

I need a high-capacity storage server to do backups of my clients over their internet connections. Hosting isn't the problem; the design and budget for the server is. The best deal I found is a QNAP TS-809U-RP 16TB for $4K. Is it reliable enough to offer such an online backup service? I hesitate between such a bargain (with the flexibility limitations of its proprietary OS software) and a true server on which I would install CentOS or Debian plus professional-grade backup software like R1Soft, with a smaller tool like SyncBackPro for important-data-only backups.

If the second solution is the better one, manufacturers like Dell, HP and others sell at prices that are too high, so I was thinking of a custom build. The one problem that remains is the server case: I can easily get a server backbone and hook an Areca ARC RAID controller into it (about $2K total), but the case must be able to hold 16 hard drives (or more if available). Where can I find such cases? Has anyone ever built a custom 16TB+ storage server on an open-source or Windows Server OS, and with what design?
 
Easy!
1 - Get a cheap Dell ($300.00) with GbE - bare minimum
2 - Get a five-bay hardware RAID eBOX-R5 (on sale with a $100.00 mail-in rebate)
If you need more than 5 drives, use two of these
3 - Use FreeNAS

Now you have 8TB RAID 5 or 16TB RAID 50 (using 2.0TB HDDs). This NAS can transfer close to 85MB/s. Note: the eBOX-R5 can transfer over 200MB/s if you use it as DAS.
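Just to sanity-check those capacity numbers (RAID 5 loses one drive per box to parity, and RAID 50 simply stripes two such RAID 5 sets):

Code:
echo "(5 - 1) * 2" | bc        # 8 TB per 5-bay eBOX-R5 with 2 TB drives
echo "2 * (5 - 1) * 2" | bc    # 16 TB across two boxes combined as RAID 50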
 
While it's true that the Backblaze pod isn't an uber-NAS, it would be quite easy to turn one into one. The chassis is ideal in many respects. The system could be configured exactly as described, or it could be altered with other off-the-shelf single- or dual-processor motherboards and/or true RAID controllers. It could easily become the world's largest MS Windows Home Server with little or no effort; HP's MediaSmart EX495 is a 90-ounce weakling by comparison. But of course, you'd need two: one to serve as a backup to the other.
 
My 180TB Backblaze build project.

Hello,

After reading the article on the Backblaze Storage pod, I couldn't help but want to build a small NAS device to store parts of the internet that seem to be disappearing.

I'm starting with two chassis, each filled with 45 2TB Hitachi drives. Unformatted capacity is 180TB (not bad for 8U of rack space). I'll post pictures of the formatted capacity and of the build as I go.

To my surprise, the chassis were shipped unassembled. Fortunately, they came with screws :)

Here are some pictures to hold you over until the next post (probably Monday) ;)

The chassis
http://img22.imageshack.us/img22/8713/shippingboxes.jpg
http://img22.imageshack.us/img22/7042/rails.jpg
http://img35.imageshack.us/img35/7428/mbchassisbackplane.jpg
http://img196.imageshack.us/img196/4451/internalfanhldr.jpg
http://img200.imageshack.us/img200/8854/fangrilleback.jpg
http://img36.imageshack.us/img36/4517/bottombo.jpg
http://img200.imageshack.us/img200/7875/toppart1hdgrid.jpg
http://img24.imageshack.us/img24/8827/bottomno2.jpg

The backplanes
http://img36.imageshack.us/i/backplanes.jpg/
http://img36.imageshack.us/i/bpfront.jpg/
http://img22.imageshack.us/i/bpbacko.jpg/

StorageMaster
 
Could you list out how much some of the components cost?

00Roush


The prices from the article are more or less correct.

I paid $1900 including shipping for the two chassis from Protocase.
The 20 backplanes direct from CFI were $45 each + $40 for EMS worldwide shipping.

I haven't decided on a motherboard or controller yet (the controller selection is limited because only certain Silicon Image chipsets support the port multipliers correctly).

The Hitachi HD32000 2TB drives were $209 each from Newegg. They are currently sitting in customs. I chose the Hitachis primarily because their drives are ROCK solid - no compatibility issues like with WD or Seagate. They are the only HD manufacturer that doesn't RUSH anything. There are faster drives out there, but for across-the-board compatibility Hitachi is the only way to go for me. I use the 1TB, 1.5TB and 2TB models in everything from Dell PowerEdge servers to hanging 24 off of Adaptec, 3Ware and Areca controllers in RAID 6, with ZERO issues.

I'm currently testing JFS and XFS on several >30TB software RAID 6 arrays. The rebuild time is slow, and I haven't covered all scenarios yet, but I've yet to create a situation from which I couldn't rescue the data.

I would not recommend this configuration for mission-critical data storage, or if you need more than the ~300MB/s of performance I'm estimating it will do, unless you have some in-house software engineers to make significant changes to the RAID software that is currently available.

Having said that, I think that if it is managed correctly (monitoring drive temps with hddtemp and S.M.A.R.T. data with smartmontools) and you don't put all 45 drives in a single software RAID 6 array, it can be run in a way that minimizes the risk of data loss.
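For example (just a sketch of what I mean, with placeholder device names that will differ on your box), one 15-drive software RAID 6 set per backplane group plus basic monitoring would look roughly like this:

Code:
# one 15-drive software RAID 6 set instead of a single 45-drive array
mdadm --create /dev/md0 --level=6 --raid-devices=15 /dev/sd[b-p]
mkfs.xfs /dev/md0

# keep an eye on drive health and temperatures
smartctl -H /dev/sdb       # SMART health summary for one drive
hddtemp /dev/sd[b-p]       # spot-check temperatures across the set
mdadm --detail /dev/md0    # array state, rebuild progress, failed members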

I plan on testing Pod-1 mirrored to Pod-2, using linear RAID on each box. The same drive number would have to fail on each pod to make that drive's data not easily recoverable. I have some other ideas as well. :)
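One simple way to approximate the Pod-1 to Pod-2 mirror (the hostname and path here are made up for the example) would be a periodic sync at the filesystem level:

Code:
# nightly mirror of Pod-1's data onto Pod-2
rsync -aH --delete /data/ pod2:/data/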

StorageMaster

00Roush, any relation to Russ Roush who used to live in Florida?
 
curious also...

Here's my $0.02:

After seeing that whole Backblaze thing, it reminded me of the Sun Thumper (which is considerably more expensive but has more or less the same layout), and I was quite interested in designing something myself - although I was thinking of having the drives lying down sideways and sliding in from the front on rails, which would make replacing just one a bit easier.

What I didn't like about the Backblaze, though, was that they were using SATA port-multiplier backplanes for the drives; I've never had good experiences with those. But what I thought (which is kind of what the OP was thinking) is "this would be a great thing for backing up clients" (or even doing a DR-style scenario).

What I had in mind was along these lines (a Linux-based solution using LVM, iSCSI, Samba, etc.):

1) Present iSCSI LUN(s) across the slow link to the client.
2) Asynchronously replicate their disks onto the iSCSI LUN(s). There is software out there that can do this quite well - one I've used was designed to replicate a machine onto detachable storage. The reason for async is that the server won't slow down trying to write to slow storage; it just "catches up" when it can.
3) Snapshot the iSCSI LUN every now and then so you can do some point-in-time things (i.e. get files from older versions of the filesystem).
4) If the server dies (permanently), bring it up "near" the storage pod inside a VM (i.e. on a cheap 1RU server running something like Xen or KVM on Linux). You definitely need to deal with hardware differences, but there is software out there capable of that, specifically designed for physical-to-virtual conversion.
5) Migration back is an interesting one. One piece of software I do have a lot of experience with is Acronis Universal Restore for Windows (I've yet to see it fail to handle a hardware change), and this would be an "OK, we're going to migrate back, pick a time that suits you because it'll require downtime" situation.

You get a few advantages out of LVM and iSCSI running on Linux on the storage side, because you can do things like resize the iSCSI LUN on the fly, snapshot it, present it back to the client via Samba, and so on. For example, if you had a snapshot of their replicated LUN from the day before, you could restore a file someone deleted by accident. With a bit of work you could automate a lot of this via a website the clients can log in to themselves.
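As a rough sketch of the LVM side (the volume group name and sizes here are just examples, not recommendations):

Code:
lvcreate -L 500G -n client1_lun podvg                        # backing volume for the client's iSCSI LUN
lvcreate -s -L 50G -n client1_snap /dev/podvg/client1_lun    # point-in-time snapshot of that LUN
mount -o ro /dev/podvg/client1_snap /mnt/client1_snap        # browse yesterday's files (assuming the LUN holds a plain filesystem)
lvextend -L +100G /dev/podvg/client1_lun                     # grow the LUN on the fly when the client needs more space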

That was all a bit of a brain dump, but hopefully it gives some ideas of things that could be done along those lines.
 
Update to my 180TB build

While I am waiting on the nylon spacers and other parts to arrive, I wanted to make sure I would not run into any fitment issues with the chassis from Protocase, in addition to checking for small metal debris which could cause havoc. I'm happy to report the chassis were FLAWLESS. For those who want a step-by-step guide to assembling the case, here is how it goes.

I downloaded the SolidWorks eDrawings program and turned off the various layers. This is what the case looks like from SolidWorks' point of view.

[SolidWorks renders of the chassis]

These are actual photos of the build process.

[Assembly photos]

Here are some misc photos of the completed chassis.
They come with all the needed nuts and bolts :)

Completed units.


http://img390.imageshack.us/img390/9247/complete.th.jpg


With front cover on.


With the front cover on, from the back.


Some assorted nuts and bolts/screws

This is how the back plate mounts to the bottom of the chassis


All the electric parts and fans have arrived, so I'm going to start wiring the case and building the power cable assemblies for the backplanes and fans. :) Time to break out the soldering iron and heat-shrink tubing.

Just realized the pictures are small.

Here is my profile link on imageshack to view the larger photos:

http://profile.imageshack.us/user/storagemaster/

StorageMaster
 
RAID 6 vs. ZFS RAID-Z2

Even though it is yet one more OS to master, I am totally sold on ZFS RAID-Z2 over software RAID when using desktop disks.
I deeply regret that OpenSolaris does not support port multipliers, because otherwise it would be the OS of choice for storing data securely on a Petabox.

One thing I do not like at all in the Petabox design is the two independent power supplies. If one goes down, it could take down half the disks in all the arrays, badly corrupting data.

I believe a power setup with three redundant PSUs (hot-plugging is not mandatory) would be much safer.
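For reference, this is roughly what a RAID-Z2 group looks like on the ZFS side (the pool name and device names are placeholders, and you would build several such groups rather than one giant vdev):

Code:
zpool create backuppool raidz2 c0t0d0 c0t1d0 c0t2d0 c0t3d0 c0t4d0 c0t5d0
zpool status backuppool         # shows the raidz2 vdev and per-disk state
zfs create backuppool/clients   # per-client filesystems get their own quotas and snapshots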
 
RAID for the OS

All the focus is on the storage (as it should be).
However, since this device is meant to run continuously, why not consider RAID for the OS?
It has redundant power supplies... but no redundant drives for the OS?
People who build this type of device will likely have small old drives lying around.
It seems obvious to me: since I have small unused drives, install the OS on two old drives using RAID 1 and you have a ready backup.
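As a sketch (device names are just examples, and on Debian the config file lives at /etc/mdadm/mdadm.conf instead), a software RAID 1 mirror for the OS across two small spare drives would look something like this:

Code:
mdadm --create /dev/md0 --level=1 --raid-devices=2 /dev/sdx1 /dev/sdy1
mkfs.ext3 /dev/md0                           # then install or copy the OS onto the mirror
mdadm --detail --scan >> /etc/mdadm.conf     # so the array assembles at boot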
 
