
My experiences with OpenSolaris and ZFS for a home NAS

BrianC

Hi,

I've been reading this forum regularly and thought I'd share my experiences with my current NAS server, which runs OpenSolaris and ZFS with 4.5TB of storage across 14 HDDs.

The current hardware:
1x CPU, Intel E6300 1.8GHz dual core
1x Motherboard, Gigabyte GA-G33-DS3R
4x 2GB RAM (8GB total, non-ECC)
4x Dual SATA PCI-E cards, Silicon Image Sil-3132
2x 40GB HDD (mirrored boot)
12x 500GB (storage)
1x old DVD drive.
1x 400W ATX PSU (Expensive one)
1x Power surge protector

Hardware: by default OpenSolaris has a terrible driver for the onboard LAN. It's limited to 10Mb/s transfer speed even after connecting at 1000Mb, and it puts the CPU at ~80% when active. I had to install the GANI driver to fix it. I can't get the 2nd onboard HDD controller to work. If I plug an HDD into it all hell breaks loose, so it's not used.

The system draws about 110W. The 14 HDDs spike the 5V line to over 32A on start-up. I found this out the hard way when it refused to power on. The current PSU is rated at 40A on the 5V line.

I have a case I made myself out of wood and steel. It's not pretty but it does the job. The only problem I still have to deal with on the hardware side is vibration: 12x 7200 RPM drives really vibrate...

My experiences with OpenSolaris and ZFS

Installation: OpenSolaris is not a perfectly finished operating system, but it's getting better all the time. For example, it will not install on my main desktop system, even though that machine is similar to the spec above. Driver support really kills OpenSolaris, which is a shame. If OpenSolaris likes your hardware it will install quickly and easily. You're in for a really hard time if it takes a dislike to some of your hardware.

Setting up Server:
There are countless guides to setting up OpenSolaris, and most of them are very good. Setting up is relatively easy. You can share over NFS and CIFS at the same time.

I have both Windows and Linux machines, and they happily talk to the server without any problems, except for ACLs. Having 3 operating systems all doing their own thing with regard to ACLs is where things start to go a bit wrong. According to the ZFS documentation, ZFS can handle all the ACLs. I have no doubt that it can, but the documentation is quite possibly the most confusing piece of literature ever written. It's about as much use as a chocolate fireguard.

This one line gets my ACLs under control. There is a lot more to it than this, but if you're having ACL problems, this is a good starting point.
chmod -R A=owner@:full_set:file_inherit/dir_inherit:allow /YourStorage/yourdirs

The ZFS bit:
The boot drives are mirrored via ZFS. On the current OpenSolaris build, 06/09, setting up a mirrored boot drive involves a bit of work and typing commands. That is a backwards step, because in past builds all you had to do was select the two or more drives to mirror during install and it was all done for you.

The data storage is divided into two zpools, each with 6x 500GB drives in RaidZ.
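
For anyone wondering how 12x 500GB turns into roughly 4.5TB: single-parity RaidZ gives up about one disk's worth of space per pool to parity. Here is a rough sketch of the sum in plain shell arithmetic. The function name is made up for this post, and the sum ignores ZFS metadata overhead, so real numbers come out a little lower.

```shell
# Rough usable capacity of one raidz group: (disks - parity) * disk size.
# Ignores metadata and padding overhead.
raidz_usable_gb() {
    disks=$1
    size_gb=$2
    parity=${3:-1}          # 1 for raidz, 2 for raidz2
    echo $(( ($disks - $parity) * $size_gb ))
}

per_pool=$(raidz_usable_gb 6 500)   # one pool of 6x 500GB, single parity
total_gb=$(( per_pool * 2 ))        # two identical pools

# Drives are sold in decimal GB, the OS reports binary units.
# Work in tenths of a TiB to stay in integer arithmetic.
total_tib_tenths=$(( total_gb * 1000000000 * 10 / 1099511627776 ))

echo "per pool: ${per_pool} GB, total: ${total_gb} GB, about $(( total_tib_tenths / 10 )).$(( total_tib_tenths % 10 )) TiB"
```

That gives 2500 GB per pool and 5000 GB in total, which is about 4.5 in binary units and seems to line up with the figure quoted at the top of the post.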

Zpools are totally independent of everything else. I can take the 6 drives that make up a zpool and plug them into a completely different computer, boot that computer with a ZFS-capable operating system, run “zpool import”, and my zpool will be fully up and running in 2 seconds. No more RAID controller failure or corrupt operating system destroying my data.
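
A minimal sketch of that move, assuming a pool called "tank" (a made-up name) and the disks already plugged into the new machine. The ZPOOL variable is only there so the commands can be previewed with ZPOOL="echo zpool" before running them for real.

```shell
# ZPOOL is left unquoted on purpose so that "echo zpool" splits into two words.
ZPOOL=${ZPOOL:-zpool}

move_pool_in() {
    pool=$1
    $ZPOOL import            # no arguments: list pools found on the attached disks
    $ZPOOL import "$pool"    # import the named pool
    $ZPOOL status "$pool"    # check that every disk shows up as ONLINE
}
```

Usage would be something like `move_pool_in tank`. If the old machine still boots, running `zpool export tank` there first makes the import clean. If the pool was never exported, the import normally has to be forced with `zpool import -f`.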

The 256-bit checksum: this is a shocking eye-opener to the world of hard drive data corruption. Of all the features in ZFS this is the best one by far. After seeing it in action with my own eyes, I will never ever go back to using mass storage without some sort of checksumming.

I have read about silent data corruption, and been the victim of it, but I never realized how bad it was on consumer hardware. I currently don't have any hardware problems on the server and it's all running sweet. The most common checksum errors I have had recently have been down to loose connections somewhere, usually just after I have been working in or around the server. Without the checksumming, I would never have known about these errors.

ZFS and OpenSolaris are not frightened to tell you about hardware errors. In fact I'd say ZFS is extremely good at finding faulty hardware and telling you all about it. When I built the first version of my ZFS box, I used loads of old hardware I had lying around. I used old hard drives that worked great under Windows. I also bought some new parts.

The first server hardware death toll, errors found by that 256-bit checksum:
2x 500GB drives, brand new: ZFS found faults, disks unreadable after halfway, not silent
1x 300GB drive, thought it was working OK: over 300 thousand checksum errors, all silent
1x 250GB drive, thought it was working OK: lots of checksum errors, all silent
1x 250GB drive, brand new, passed the manufacturer's diagnostic test: ZFS failed it after 24hrs, silent errors

ZFS failed all of these drives, however you can force ZFS to continue using a faulty drive. ZFS will complain like hell, but eventually it gets the message that you want to use the drive and just racks up thousands of errors. If you need to use a drive that has been failed by ZFS, use “zpool clear” and “zpool online” multiple times in succession and eventually the drive will stay online, but run up errors.
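
The clear-and-online routine could be scripted roughly like this. It is only a sketch: the pool name, the disk name and the number of attempts are all assumptions, and the ZPOOL variable is there so the commands can be previewed with ZPOOL="echo zpool" first.

```shell
# ZPOOL is left unquoted on purpose so that "echo zpool" splits into two words.
ZPOOL=${ZPOOL:-zpool}

force_online() {
    pool=$1
    disk=$2
    tries=${3:-3}            # how many times to repeat the pair of commands
    i=0
    while [ "$i" -lt "$tries" ]; do
        $ZPOOL clear "$pool" "$disk"     # reset the error counters on that disk
        $ZPOOL online "$pool" "$disk"    # ask ZFS to bring it back into service
        i=$((i + 1))
    done
    $ZPOOL status "$pool"    # see whether the disk stayed online
}
```

Usage would be something like `force_online tank c1t2d0 5`, where both names are hypothetical.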

If one of your drives runs up some checksum errors, don't panic. These are most likely the silent errors that you never knew about before. Just clear the errors and scrub the zpool. Keep a note of the drive and see if it happens again.
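
As a sketch, the note, clear and scrub steps look like this for a pool called "tank" (a made-up name). Set ZPOOL="echo zpool" to print the commands instead of running them.

```shell
# ZPOOL is left unquoted on purpose so that "echo zpool" splits into two words.
ZPOOL=${ZPOOL:-zpool}

recheck_pool() {
    pool=$1
    $ZPOOL status -v "$pool"   # note which disk has a non-zero CKSUM count
    $ZPOOL clear "$pool"       # reset the error counters
    $ZPOOL scrub "$pool"       # re-read and verify every block in the pool
    $ZPOOL status -v "$pool"   # shows scrub progress and any new errors
}
```

The scrub runs in the background, so run `zpool status -v tank` again once it has finished to see whether the same disk has picked up new errors.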

Configuring ZFS zpools
One thing I will say is, if you're building your first ZFS storage, KEEP IT SIMPLE!!
I have read loads of stories about “ZFS destroyed my data”. 90% of these stories have the same root cause: OPERATOR ERROR.

ZFS is simple, powerful and limitless in configuration options, and this is where all the problems start. ZFS is limitless in the ways you can configure your HDDs. The more complicated you make your zpools, the more likely you're going to run into the No.1 cause of data loss: YOU! I am not trying to be nasty here, but read some of the ZFS horror stories. Most of them were going hideously wrong at the planning stage, long before the lost data was ever on the system.

Use raidz and raidz2. They're simple and they give good data protection and good performance. Using plain old raidz will eliminate the main cause of data loss :)

Only use ZFS on raw whole disks, the disks as seen by format. If you don't do this things are going to get complicated, and I've already pointed out what that leads to...
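
A minimal sketch of a keep-it-simple pool: one raidz group built from whole disks. The pool name and the disk names are hypothetical, so use whatever names format lists on your own system. Set ZPOOL="echo zpool" to preview the commands without touching any disks.

```shell
# ZPOOL is left unquoted on purpose so that "echo zpool" splits into two words.
ZPOOL=${ZPOOL:-zpool}

create_raidz() {
    pool=$1
    shift                               # everything after the pool name is a disk
    # Whole-disk names such as c1t0d0, with no slice suffix like s0 on the end.
    $ZPOOL create "$pool" raidz "$@"
    $ZPOOL status "$pool"
}
```

Usage would be something like `create_raidz tank c1t0d0 c1t1d0 c1t2d0 c1t3d0 c1t4d0 c1t5d0`. Swapping the word raidz for raidz2 gives double parity at the cost of one more disk's worth of space.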

System performance: on my current setup, the whole thing runs very fast. I have a gigabit Ethernet switch and cables. The system spec is very high for a NAS box, but I also use it to host virtual machines. Network transfer performance is “interesting”. I can get a sustained 120MB/s (the max speed of gigabit Ethernet), but this is heavily dependent on what is at the other end of the cable. The speeds I usually get are around 40MB/s read/write. I am happy with this speed. I could spend hours or days getting it to go faster, but it's just not a priority.
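
To put those speeds in perspective, here is a back-of-the-envelope sum in shell arithmetic. The 500 GB copy size is just an example figure, and the function name is made up for this post.

```shell
# Seconds needed to move size_gb (decimal GB) at rate_mb MB/s.
copy_seconds() {
    size_gb=$1
    rate_mb=$2
    echo $(( $size_gb * 1000 / $rate_mb ))
}

# 1000 Mb/s over 8 bits per byte, before any protocol overhead.
line_rate=$(( 1000 / 8 ))
echo "gigabit line rate: ${line_rate} MB/s"

for rate in 40 120; do
    s=$(copy_seconds 500 "$rate")
    echo "500 GB at ${rate} MB/s: $(( s / 3600 ))h $(( s % 3600 / 60 ))m"
done
```

The raw line rate works out at 125 MB/s, so 120MB/s is close to the ceiling. Filling one 500 GB drive takes about three and a half hours at 40MB/s against a little over an hour at 120MB/s.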

I am very happy with the whole setup. It runs for months on end without missing a beat, and I have been using it for 18 months now. In the early days I was almost put off by OpenSolaris not being a polished OS. Getting the whole thing working took days the first time around. Recent builds are improving. However, if Linux ever gets ZFS in the kernel, I will be jumping ship to Linux!

If you're building yourself a NAS, take a long look at ZFS. The checksum error checking is its star feature.

Brian.
 
