I think I want to use SnapRAID (talk me into or out of it :)

Is SnapRAID the best starting point for my needs?

  • Yes
    Votes: 0 (0.0%)
  • No (please post your alternative below)
    Votes: 0 (0.0%)
  • Total voters: 2
  • Poll closed.

Twice_Shy

Occasional Visitor
Btw I haven't caught up on ANY of the past postings yet. I was out of town and there was a bunch of talk on some threads here if I'm not mistaken... so if something is redundant, advance apologies. I just wanted to set this out to bake for a little while first.


I stumbled across SnapRAID the other week and I think I'm most interested in at least starting there. It may not end up on the only NAS box in the house, but I saw a list of features that I really liked. People here probably know way more than me, so please either talk me into it being a good choice, or out of it because I don't know enough about the other options.

- I like the fact that it works over the filesystem instead of under it, meaning nothing gets in the way of, say, normal drive recovery tools. I can replace a drive by simply mirroring it. I can upgrade to a bigger hard drive by copying all the files to a new drive and resyncing.

- This also lets me apply my desire to have a redundant array of inexpensive tape: a set of LTO Ultrium tapes with one or more parity tapes, capable of restoring all the data in the set even if one or more tapes breaks or becomes irreparable. Yes, I also plan on mirroring, but if both copies of the same tape somehow break in a mirrored set, that data is gone forever. Parity tapes let me recover from that. Since SnapRAID seems to put this all in ordinary files, I can just write the parity files to tape (provided that LTFS supports timestamps with microsecond accuracy, which SnapRAID apparently requires).

- I like the fact that only one disk spins up at a time unless you are syncing. This minimizes power use and makes me less concerned about using a bunch of drives, because they can just sleep most of the time. Since early on the work will mostly be done locally on a PC (with its own RAID1 on the work drive), the day's work can be backed up to the NAS and synced from each PC. I know this is less convenient once work shifts to the NAS; it's just to start.

- Easier to buy small and scale out than FreeNAS ZFS. Easy to upgrade drives, add drives, etc. ZFS wants me to buy all the drives for a pool at the start to set up a vdev and makes upgrades on the fly a PITA; I'm looking for something more like Drobo functionality (upgrade and add drives whenever). Much, much lower system requirements too - a 32TB ZFS system wants 32 gigs of ECC RAM, and I don't see many systems far over that size. 70TB SnapRAID systems are the norm and run in under 8 gigs of RAM from what I can tell.

- My top priority is not running ZFS but having the main advantages I saw in it: an upgradeable storage pool, redundancy, parity, and file integrity verification. SnapRAID doesn't have things like convenient snapshots, undelete, deduplication or any of that, which is why FreeNAS or another system is probably still an upgrade. I will get there eventually. For now I just need to get started with a better system than I had before (dumping everything on external USB drives, only to find multiple things silently corrupted or lost by other means). And that is the system that migrates data to tape storage ASAP.
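
For anyone wondering how a parity tape could bring back a lost one, here is a minimal Python sketch of the general single-parity (XOR) idea. It only illustrates the principle and is not SnapRAID's actual code; the helper names (`xor_blocks`, `pad`) and the three sample "tapes" are made up for the example.

```python
# Illustrative single-parity (XOR) sketch; NOT SnapRAID's real implementation.
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def pad(blocks):
    """Zero-pad every block to the length of the longest one."""
    size = max(len(b) for b in blocks)
    return [b.ljust(size, b"\0") for b in blocks]


# Three hypothetical "tapes" (or disks) holding data of different lengths.
data = pad([b"footage-A", b"mocap-B", b"audio-C!!"])

# The parity block is simply the XOR of all data blocks.
parity = xor_blocks(data)

# Pretend one block is lost: XOR of the survivors plus parity rebuilds it.
lost_index = 1
survivors = [b for i, b in enumerate(data) if i != lost_index]
rebuilt = xor_blocks(survivors + [parity])

print(rebuilt.rstrip(b"\0"))
```

A single XOR parity can rebuild only one missing block at a time; surviving the loss of several at once takes additional parity levels that are computed differently.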

None of this means that my research is complete or over! This is just the first system I think I would like to set up, and which I hope to build and have online before anything else - while simultaneously researching the 'future upgrade options' that might be better, more convenient, or offer a better production workflow, more speed or anything else. Heck, I'm pretty sure a monolithic ZFS system is better than a SnapRAID of the same size - but it costs more and I can't easily write stuff to tape either, so I'm pretty sure SnapRAID will always be on at least one NAS box to ease data migration to Ultrium LTO tape, since it will always be here on the disk-to-tape mirror system anyway.

Left unsaid but open to discussion is when I need to migrate to higher performance systems, up to and including SANs and such. :) Assuming SnapRAID works like I want it to, it gives me a starting point, and makes it less of a holdback for what type of NAS/SAN system to move to in the future.
 
Hi,

I have been using SnapRAID for years on my home server with 6 disks (2x 8 TB and 4x 5 TB). Based on my needs and experience I can definitely say it's the right choice for me! Only once - but that's enough - I needed to recover one 2 TB partition completely due to admin error - worked like a charm! :rolleyes:

For me the biggest advantage of SnapRAID (or any software RAID) is that all disks can be formatted natively under your system (in my case EXT4 on Ubuntu). This allows you to use the disks independently of your original system or RAID controller. Even exchanging disks or changing partition sizes is not a problem!

The biggest disadvantage is that it's a software solution which needs to be run regularly and needs to be checked! To avoid issues, my system runs SnapRAID every night and reports back (via email) if there was any error.
And (of course) you do not get any of the real RAID advantages (like faster reads from multiple disks) with this solution - but this is not what I was looking for.
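
A nightly run like the one described above could be wrapped roughly as follows. This is a hedged sketch and not necessarily how the setup above works: it assumes the `snapraid` binary is on the PATH, that `sync` followed by `scrub` is the wanted sequence, and that a non-zero exit code means trouble. The function names are hypothetical, and actually mailing the report (for example through cron's own mail handling) is left out.

```python
import subprocess

# Assumed nightly sequence: update parity first, then verify part of the array.
DEFAULT_COMMANDS = (("snapraid", "sync"), ("snapraid", "scrub"))


def nightly_check(run=subprocess.run, commands=DEFAULT_COMMANDS):
    """Run each command in order and return a list of failures.

    Each failure is a (command, returncode, stderr) tuple. The run stops at
    the first failing command so a scrub never follows a failed sync.
    """
    failures = []
    for cmd in commands:
        result = run(list(cmd), capture_output=True, text=True)
        if result.returncode != 0:
            failures.append((" ".join(cmd), result.returncode, result.stderr))
            break
    return failures


def format_report(failures):
    """Build the text that would go into the notification email."""
    if not failures:
        return "SnapRAID nightly run: OK"
    lines = ["SnapRAID nightly run: FAILED"]
    for command, code, stderr in failures:
        lines.append(f"{command} exited with {code}: {stderr.strip()}")
    return "\n".join(lines)
```

From a scheduled job this would be called as `print(format_report(nightly_check()))`, with the scheduler taking care of delivering the output.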

With kind regards
Joe :cool:
 
Right, and those are part of my starting positions. I don't need the speed of RAID; I'm more concerned about file recovery and scalability.

Perhaps not everyone is familiar with my other posts, so I guess I'd better redo a short version - I wanted to build a system that could start at maybe around 30TB and eventually scale out to around 300TB in a storage pool. FreeNAS ZFS can't even do that; very few people build systems larger than the 32-64TB range by the sound of it. I'm trying to see if anything else is even in the running. My main plan was to migrate much of the data rapidly to Ultrium tape to be worked on more in the future. Right now I mostly need to store dumb data that won't be worked on all that much, in a way that can scale out with low cost overhead (beyond the individual hard drives and tapes bought as the data piles up) to as much as a petabyte, with a max of 1/3 of that online in the future. (This is for some SERIOUS 8k HDR digital video, multiple HD stream motion capture footage with 8-16 cameras, possibly more in the future, and similar film projects.) In the future data will migrate from tape back to the expanding NAS for more work and then need more performance upgrades. At that point I'd like the primary disk NAS/SAN (in the D2D2T chain) to saturate 4-10gig fibrechannel/infiniband, at least on SSD RAID and ramdisk volumes.

I eventually plan multiple uses for a NAS:

- Preparing files for backup to tape (probably in a disk-to-disk-to-tape system, the primary NAS is backed up by a secondary NAS, which directly writes to the tape at a more leisurely pace). Right now both would be SnapRAID, in the future the first disk may be something else, but the 2nd always has to be SnapRAID for my Redundant Array of Inexpensive Tape data migration plan.

- Network boot images and/or network-based drive imaging every day. This doesn't mean fully diskless; I'm aware that's much more of an IOPS and bandwidth load. I'd like to play with fully diskless later, but to start I just want centralized management and updates. So each workstation may have its own hard drive or, later, SSD. (Option B is backing up dedicated install systems to a single house NAS.)

- Normal NAS duties (the primary disk of the D2D2T system) such as storing workfiles and such to saturate gigabit ethernet which a single drive can do. Eventually faster, this is just to start to be comparable to local single disk performance on any given workstation.

Again, my must-haves include file integrity verification like ZFS (the whole thing that led me to ZFS, only to get frustrated with it) and protection from data loss via failures (of disks or whatever) like any RAID or RAID alternative. Realtime isn't essential for now. With files rapidly ending up on Ultrium tape I'm less worried about some of the file rollback options, since I just plan to store multiple revisions over time so I can go back to a previous work stage in editing or whatever.
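
To make "file integrity verification" concrete, here is a small Python sketch of the general idea: record a checksum per file, then re-check later to catch silent corruption. SHA-256 and the function names are assumptions chosen for illustration; this is not a claim about which hash or on-disk format SnapRAID or ZFS actually use.

```python
import hashlib
from pathlib import Path


def file_digest(path, chunk_size=1024 * 1024):
    """Hash a file in chunks so large video files never sit fully in RAM."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def build_manifest(root):
    """Map each file's path (relative to root) to its checksum."""
    root = Path(root)
    return {
        str(p.relative_to(root)): file_digest(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def verify(root, manifest):
    """Return the sorted relative paths that are missing or no longer match."""
    root = Path(root)
    bad = []
    for rel, digest in manifest.items():
        p = root / rel
        if not p.is_file() or file_digest(p) != digest:
            bad.append(rel)
    return sorted(bad)
```

Detecting a bad file this way is only half of the job; repairing it still needs a second copy or parity data to rebuild from.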

A potential barrier - we may be talking tens of millions of files. I'm already averaging 100k files per disk on the array I had before which included for instance stock images and sound effect wav files and I want future expansion growth. I'm not sure how SnapRAID handles that offhand or if it's an issue.
 
ArchLinux has a wiki entry on SnapRAID - might want to review it.

Thanks for the heads up. Reading up there, I guess it doesn't do storage pooling, though that isn't 'life or death' critical, just a nice-to-have.

It looks like large files are better - has anyone had experience with many small files (a million plus)? Is there some recommended range for the number of files where it starts to bog down, cause problems, bottleneck performance-wise (i.e. bunches of files below X size) or whatever? I can always put large files on one box and smaller ones on another - I'm not against workarounds for now. This isn't the be-all and end-all, just the starting solution.


I'm wondering if I can run more than one software NAS type solution on the same box - literally having some SnapRAID volumes, and maybe some other type of volumes (not sure which yet - open to suggestions) like for the smaller files for instance for the sound/textures library.
 
It looks like large files are better - has anyone had experience with many small files? (million plus) Is there some recommended range of # of files where it starts to bog down, cause problems, bottleneck performancewise (ie bunches of files below X size) or whatever?
My "biggest" volume currently has 249 466 directories with 1 001 619 files - mainly small files: a backup of 3 client PCs with a history of all changes for 1 year - kind of a TimeMachine/continuous backup on change (incl. all Windows config files).
No issue at all with SnapRAID - works like a charm! :D

I'm wondering if I can run more than one software NAS type solution on the same box - literally having some SnapRAID volumes, ...
Do you plan to run SnapRAID on the NAS or on another system? If you run it outside of the NAS you will have performance issues and lose the other advantages of a software RAID...:rolleyes:
 
SnapRAID might be useful as "near-line" storage, backing up the primary online filesystem...
 
That's good to know! Though I will probably be hitting multiple millions at some point... many, many small WAV files and sound effects in a huge library, many texture files and stock photography images, etc.

Doesn't taking snapshots create huge files though, almost like redundant backups? Or I suppose it only handles changed files when they change.

As for SnapRAID... I'm not sure! I'm actually not sure how it works yet! :) But the plan was to run SnapRAID directly on a NAS box (I thought that was the only way it COULD run; is there another way??), though I was considering making that a multipurpose box to start. I.e. my roomie messes with torrents and likes to rip new DVDs into his collection, and would like to serve that up as a separate volume to the smart TVs... so I'm wondering if I could just stick SnapRAID on there since that box wants to be up 24/7 anyway, or I might ask him to build the new box he was talking about and let me throw some drives in it until I migrate out to a separate system. It sounds light enough on CPU/memory that I was hoping it'd be no problem. I suppose it depends more on what the other software is doing.

I don't have to merge my box with his; it just saves a little on power if it's going to be up 24/7 anyway, and it might even turn his volumes into SnapRAID-backed volumes as well!
 
Any reason it can't be both for now? I assumed as long as the bandwidth of a single shared disk was enough it would be fine. I'm starting out single user, and future multiuser may well be set up to access different volumes at the same time to literally stay on separate drives.
 
Give it a try and see...

The only risk I see is recovery if things go awry - mdadm/ext4/LVM is well understood, and there are plenty of ways to recover data from even the most complicated of arrays when things go wrong (and eventually over time, things will need some help here, as probability does indicate that the more physical disks in an array, the higher the odds are that things will fail - dirty little secret of RAID I suppose - and RAID5 is probably the most scary outside of RAID0)
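
The "more disks, higher odds" point can be put into rough numbers. The sketch below assumes each disk fails independently with the same probability over some period; the 3% figure is a made-up placeholder, not a measured failure rate, and real disks in one box are not fully independent.

```python
def p_any_failure(n_disks, p_single):
    """Chance that at least one of n independent disks fails.

    Uses 1 - (1 - p)^n, i.e. one minus the chance that every disk survives.
    """
    return 1.0 - (1.0 - p_single) ** n_disks


# Hypothetical per-disk failure probability for the period of interest.
ASSUMED_P = 0.03

for n in (1, 4, 8, 16):
    print(f"{n:2d} disks: {p_any_failure(n, ASSUMED_P):.3f}")
```

The trend is the takeaway rather than the exact values: the chance of seeing some failure climbs steadily as disks are added, which is why parity and a tested recovery path matter more on bigger arrays.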
 
That's actually one of the secret strengths of SnapRAID - since it works over the top of the filesystem to achieve RAID like functionality, the files themselves are untouched. There's no data to migrate into or out of some specialized filesystem or even disk format - it's not like setting up RAID arrays, you can SnapRAID over the top of existing full disks of data with no free space, and stop using it at any time. If data has to be recovered you just pull a drive and use existing tools.

The only possible downside is that I'm not sure if it's inherently open to all filesystems - like, it works for NTFS on Windows and Ext3 on Linux; I'm less sure about Ext4, Btrfs and similar.

I don't even know what mdadm and LVM are. So in part my question was whether future needs (like network booting from the primary NAS) require different, incompatible kinds of file systems.
 
