[BTRFS] - urgent NAS bug - with Raid 5/6 parity calculations

sfx2000

If you have a NAS that offers (and you have chosen) BTRFS as a file system on top of RAID5/6, back up now...

It seems there is a parity striping calculation error that can sometimes corrupt the entire array (and all the data inside).

See the links below:

https://www.mail-archive.com/linux-btrfs@vger.kernel.org/msg55161.html

https://www.mail-archive.com/linux-btrfs@vger.kernel.org/msg55179.html

Better to be safe than really sorry when you need to access that data...

@thiggins - might want to reach out to Synology, as they're one of the primary NAS vendors deploying BTRFS in their NAS solutions...
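
For anyone unsure whether their own array is in the risky category, here is a minimal Python sketch that scans the text printed by `btrfs filesystem df <mountpoint>` for block groups using the native RAID5/RAID6 profiles. The function name and the sample text are made up for illustration, and the line layout ("Data, RAID5: total=..., used=...") is an assumption that should be checked against what your own NAS prints.

```python
import re

# Block-group profiles that rely on the BTRFS-native parity code.
PARITY_PROFILES = {"RAID5", "RAID6"}

# Matches lines such as "Data, RAID5: total=1.00GiB, used=512.00KiB".
_LINE = re.compile(r"^\s*(?P<kind>\w+),\s*(?P<profile>\w+):")


def parity_block_groups(df_output):
    """Return (kind, profile) pairs whose profile is RAID5 or RAID6."""
    hits = []
    for line in df_output.splitlines():
        m = _LINE.match(line)
        if m and m.group("profile").upper() in PARITY_PROFILES:
            hits.append((m.group("kind"), m.group("profile").upper()))
    return hits


# Illustrative sample only, not captured from a real NAS.
SAMPLE = """\
Data, RAID5: total=2.00GiB, used=1.20GiB
System, RAID1: total=32.00MiB, used=16.00KiB
Metadata, RAID1: total=1.00GiB, used=112.00KiB
GlobalReserve, single: total=16.00MiB, used=0.00B
"""
```

Calling `parity_block_groups(SAMPLE)` reports the Data block group as RAID5. An empty list means no block group uses the native parity profiles.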
 
I think this is the right quote...

Regardless, what's /very/ clear by now is that raid56 mode as it
currently exists is more or less fatally flawed, and a full scrap and
rewrite to an entirely different raid56 mode on-disk format may be
necessary to fix it.

And what's even clearer is that people /really/ shouldn't be using raid56
mode for anything but testing with throw-away data, at this point.
Anything else is simply irresponsible.

Does that mean we need to put a "raid56 mode may eat your babies" level
warning in the manpage and require a --force to either mkfs.btrfs or
balance to raid56 mode? Because that's about where I am on it.
 
Whichever system you use, if you only store your data on one device, you have no backup. That's fine for data you don't care about, but for important data, backups are a must.

For our NETGEAR ReadyNAS we've been using BTRFS since 2013 and we are unaffected by the problem you've linked to above.

Since 2013 when we introduced our first devices with BTRFS we've always used mdadm RAID with the BTRFS filesystem on top of that. The mdadm RAID is at a lower level than the filesystem. We've been using mdadm in our NAS units since 2008 (when we launched the ReadyNAS Pro running RAIDiator 4.2.x) and mdadm has been around since 2001. mdadm RAID is great and very mature.

We have never used the RAID 5/6 mode of BTRFS RAID, having never been satisfied that it was production ready. Not long after we announced our first devices with BTRFS, the first version of BTRFS to have experimental inbuilt RAID 5/6 support was released. There's no way we were going to use new highly experimental code, especially for something as important as RAID.

I'd much rather use mdadm + BTRFS than use mdadm + LVM + EXT4.

Not sure what other brands are doing. The one you mentioned only started using BTRFS last year I think.

Other areas of concern, in my view, would be a very old kernel or a very old version of btrfs-progs. It's important to update these regularly to get the latest fixes and, over time, once satisfied with their stability, to start enabling additional features. You can check which version a product uses by looking at the GPL code.

In ReadyNAS OS 6.5.1 we use the 4.1 LTS kernel and btrfs-progs 4.4.1 (a newer version of btrfs-progs will be in ReadyNAS OS 6.6.0). We routinely update to newer kernels within the 4.1 LTS branch and backport additional patches as needed as well.
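
As a rough illustration of the version check described above, here is a small Python sketch that parses a kernel release string and compares it against a minimum. The helper names are hypothetical, and the (4, 1, 0) threshold in the usage note simply mirrors the 4.1 LTS series mentioned in this post; it is not a recommendation for any particular product.

```python
import platform
import re


def kernel_version(release):
    """Parse the leading numeric part, e.g. "4.1.30-custom" -> (4, 1, 30)."""
    m = re.match(r"(\d+)\.(\d+)(?:\.(\d+))?", release)
    if not m:
        raise ValueError("unrecognised kernel release: %r" % release)
    # A missing patch level (as in "4.4") is treated as 0.
    return tuple(int(g) if g is not None else 0 for g in m.groups())


def is_older_than(release, minimum):
    """True if the release string is older than the (major, minor, patch) minimum."""
    return kernel_version(release) < tuple(minimum)


def running_kernel_is_older_than(minimum):
    """Same check against the kernel this script is running on."""
    return is_older_than(platform.release(), minimum)
```

For example, `is_older_than("3.10.35", (4, 1, 0))` is true, while `is_older_than("4.1.30", (4, 1, 0))` is false. On the NAS itself, `running_kernel_is_older_than((4, 1, 0))` performs the same comparison against the running kernel.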

 
Thanks for the clarification and confirmation for ReadyNAS. I agree with you: having a backup plan for the NAS is arguably a requirement, and something the vendors could make clearer in the user documentation ("Welcome to your new NAS unit - here's a backup strategy recommendation...").

The risk is in using BTRFS for the actual striping of the array. If one builds an MD RAID set using mdadm, they should be good to go; as correctly pointed out above, mdadm is stable and mature, and has been for many years.

mdadm + LVM + ext4 (or XFS, which I prefer myself on an md array) is safe. LVM can also stripe the disk set all by itself, and many would consider that unsafe.
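
To see whether the RAID layer is handled by md rather than by the filesystem, one option is to read /proc/mdstat. Below is a minimal Python sketch that pulls the array names and RAID levels out of that text. The function name and the sample are illustrative assumptions, and inactive arrays (which list no level) are simply skipped.

```python
import re

# Matches array lines such as "md0 : active raid5 sdb1[1] sda1[0] sdc1[2]",
# including states like "active (auto-read-only)".
_ARRAY = re.compile(
    r"^(?P<name>md\w+)\s*:\s*\S+(?:\s+\([^)]*\))?\s+(?P<level>raid\d+|linear)\b"
)


def md_levels(mdstat_text):
    """Map each active md array name to its level, e.g. {"md0": "raid1"}."""
    levels = {}
    for line in mdstat_text.splitlines():
        m = _ARRAY.match(line)
        if m:
            levels[m.group("name")] = m.group("level")
    return levels


# Illustrative sample only, not captured from a real NAS.
SAMPLE = """\
Personalities : [raid1] [raid6] [raid5] [raid4]
md127 : active raid5 sdc3[2] sdb3[1] sda3[0]
      7804333568 blocks super 1.2 level 5, 64k chunk, algorithm 2 [3/3] [UUU]

md0 : active raid1 sdc1[2] sdb1[1] sda1[0]
      4190208 blocks super 1.2 [3/3] [UUU]

unused devices: <none>
"""
```

On a live system the input would come from something like `open("/proc/mdstat").read()`. If the data volume shows up here as raid5 or raid6, the parity is being handled by md and not by BTRFS.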
 
Might be worth having primary and backup on different file systems as well to provide safety if / when future issues are discovered.

So primary on ZFS, backup on UFS for example.
 
Personally, I prefer to stay away from esoteric filesystems. A filesystem is too critical to take any chance with it, and I'd rather go with something that has countless hours of usage world-wide such as ext3 or ext4 than something that has a fraction of the mileage, and so can pop up nasty surprises at some point down the road.

Reason being, I got burned by reiserfs back when everyone claimed it was the best thing to ever happen to Linux's filesystems. Since then, nothing but ext2/ext3/ext4 for my Linux boxes.

Story-time: a few months after I got slightly burned (I had backups) by my own Linux server getting file corruption with ReiserFS, there was this customer who had hired a local Linux consulting firm to set up a failover web server. From the start I told my customer I didn't like the amount of amateur hacking involved in this consultant's solution (it even included a small hand-soldered circuit board allowing one PC to reboot the other one), and they chose to use ReiserFS for both servers. I told my customer what I thought about ReiserFS, but the consultant insisted that it was "more reliable and faster".

Deployment time in the datacenter came after a few weeks of testing within their office. That very evening, when we tried to turn on both servers in the datacenter, booting failed: the filesystem was corrupted. The customer eventually decided to listen to me, and scrapped the whole project. They sure dodged a bullet that time...

They "dismissed" that Linux "expert", and we settled on a VMWare-based solution instead, with nightly rsync of the data between a production VM and a hot standby VM. And ext2 (or ext3, forgot what was current at the time).

Today, we have roughly 150 VMs running under XenServer with that customer. In 15+ years, we have experienced one single case of filesystem corruption with ext3, and we suspect it was caused by a RAID controller that had been acting suspiciously at the time.

So, no matter how much praise I hear about ZFS, XFS or BTRFS, I prefer to stick to the tried and (heavily) tested filesystems instead. Simpler is less likely to break.
 
I tend to agree, with a couple of caveats... With Linux, one cannot go wrong choosing EXT4 these days...

ZFS under Solaris works very well, but it's tightly coupled with the Solaris kernel, and there's a lot of QA and active development there... It wouldn't be my first choice with Linux or BSD, as the ZFS implementation there is based on an older branch that Sun released, and then Oracle pulled support.

Same with UFS under the various BSDs: tried and true. XFS under Red Hat is reliable and has stood the test of time, and there are use cases where it's a better choice...
 
The BTRFS user base is continuing to grow. One of the biggest users of BTRFS, perhaps the biggest, is the tech giant Facebook. With their large requirements, they've been able to greatly help development of the filesystem.

EXT4 isn't much older than BTRFS, though EXT4 was a port of an existing filesystem, EXT3, which was released several years before.

The BTRFS on-disk filesystem format is stable.

It is advisable to update the kernel from time to time, to get performance improvements, new features, kernel bug fixes and security updates (this is true with any filesystem). We won't use a new feature until we are confident in its stability.

You can easily check which version of btrfs-progs and which kernel a NAS manufacturer uses by looking at the GPL code, which they are obliged to provide. By comparing the GPL code for different firmware versions, you can see whether much effort is put into updating the kernel.

You can see reviews of different kernel series to see how good reviewers think they are. It is true that kernel patches can be backported and we do include backported patches where needed, but there does come a time where doing a major update to the kernel is advisable. We updated to the 4.1 LTS kernel series in ReadyNAS OS 6.4.0. Regular minor updates to get bug fixes and security fixes should be done as a matter of course. Most ReadyNAS firmware updates have a minor kernel update.
 
