Why the DAS? *LONG*
I'm sure this is a real noob question, but why use the DAS front end server? Why not just connect the SAN to the Gigabit network and access it directly?
It seems like if the DAS is connected to the network by Gigabit, then that is going to be the bottleneck. The fact that you can get data from the SAN to the DAS front end faster than it can be pushed over the network seems irrelevant, and overkill.
Not sure what the question here is. Why not run the Old Shuck as a NAS? Why not run iSCSI to each node instead of FC? Or, what is compelling about a SAN at all?
I think I deal with these in the article, but let me try to address them for you here. First, let me rephrase the two questions you might be asking.
Question One: Why not run it as a NAS? Why go to the trouble of running a SAN?
Question Two: Why not run it as a shared SAN using iSCSI, instead of over fibre with a DAS front end?
If what you are asking is why not run it as a gigabit NAS: it wouldn't be a SAN then, it would share filesystems rather than block devices. The primary reason is that I get better performance running < SAN to DAS > as a logical NAS than running the same box as a physical NAS. There are several reasons why. One is simply that I have more asymmetric processor oomph behind the storage (take a look at the benchmark of just a Hitachi drive). Another has to do with reducing the layers between the storage and its delivery. The third is how my SAN is ultimately going to be used.
Generally NASes run Linux, and in most homes they serve data to Windows boxes. That means your wire requests have to be translated to the native filesystem (using Samba); there is a cost in doing that, and you need processor cycles to both serve the storage and serve the requests. A SAN doesn't do that. A SAN is like a Wankel engine: just as a Wankel doesn't need to translate up-and-down motion into rotational energy, a SAN deals with SCSI requests straight off the wire, which need no translation. The DAS server maintains the filesystem, the SAN just serves up storage; the labor is nicely divided.
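To make the file-versus-block distinction concrete, here is a sketch of the two export styles side by side. The share path, device name, and IQN are hypothetical, and the tgtd-style target stanza is just one example of an iSCSI target config for contrast, not what this build uses:

```
# File-level export (NAS): a Samba share. The server owns the filesystem
# and must translate every SMB request into local filesystem calls.
[media]
    path = /srv/media
    read only = no

# Block-level export (SAN, iSCSI shown for contrast): a target definition
# that hands the raw device to the client, which runs its own filesystem.
<target iqn.2010-01.net.example:oldshuck.lun0>
    backing-store /dev/sdb
</target>
```

In the first case the server burns cycles on filesystem and protocol translation for every request; in the second it just moves blocks, and the client does the filesystem work.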
Depending on how you use your SAN this can be a big deal: you have a 4Gb pipe with little protocol overhead (unlike TCP/IP). If I connect my SAN to my HTPC (that is, use my HTPC as my DAS server) I can easily get over three times the performance of a NAS when serving large video files (rip & read). This is how I'll be using my SAN.
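The "over three times" figure squares with simple back-of-the-envelope math. The throughput numbers below are nominal, assumed figures for the sake of illustration, not benchmarks from this build:

```python
# Rough comparison: 4Gb Fibre Channel vs. gigabit Ethernet (assumed figures).
FC_4G_MB_S = 400        # 4GFC usable payload rate, ~400 MB/s per direction
GIGE_RAW_MB_S = 125     # 1 Gb/s divided by 8 bits/byte, before any overhead
GIGE_USABLE_MB_S = 110  # rough figure after Ethernet/IP/TCP framing and stack cost

ratio = FC_4G_MB_S / GIGE_USABLE_MB_S
print(f"FC advantage: roughly {ratio:.1f}x")
```

Which lands in the "three plus times" range before you even count the protocol-translation savings.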
The extra processor muscle of the DAS server, on top of the benefit of fewer layers, means it can serve NAS requests about 10-20% faster than Ol'Shuck could on its own as a NAS.
This is the question about owning a Porsche versus a Ford Focus. Both provide transportation, and reliably get me to work and back. One does it with both performance and style, the other doesn't. My article shows how you can have a Porsche on the cheap. How compelling is a Porsche to you?
Now if the question is the second one, why not just share the SAN with each node over gigabit networking via iSCSI, that is a different question entirely. The reason is that you can't... not easily, anyway.
I'm using NTFS as the filesystem on my LUNs, which are served by the SAN. NTFS is not a shared filesystem, so if I want to share the block storage among multiple machines I need a different FS. With NTFS I can only hand the storage to one server at a time; to share it throughout the network I need a shared block filesystem.
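Why a non-shared filesystem can't simply be mounted by two machines at once can be sketched with a toy model. This is illustrative logic in Python, not NTFS internals: each "host" caches the allocation bitmap independently, so both grab the same "free" block and clobber each other:

```python
# Toy model: two hosts mount the same LUN with a non-clustered filesystem.
disk = {"bitmap": [0, 0], "blocks": ["", ""]}  # shared LUN with 2 blocks

def mount(disk):
    # each host takes a private snapshot of the on-disk allocation state
    return {"bitmap": list(disk["bitmap"])}

def write_file(disk, cache, data):
    # pick the first block the host's *cached* bitmap says is free
    blk = cache["bitmap"].index(0)
    cache["bitmap"][blk] = 1
    disk["bitmap"][blk] = 1
    disk["blocks"][blk] = data
    return blk

host_a = mount(disk)                 # both hosts mount while all blocks
host_b = mount(disk)                 # are still free
a_blk = write_file(disk, host_a, "A's file")
b_blk = write_file(disk, host_b, "B's file")  # stale cache: reuses block 0

print(a_blk, b_blk)                  # both hosts chose block 0
print(disk["blocks"][a_blk])         # A's data has been overwritten by B
```

A clustered filesystem avoids this by coordinating allocation and locking between the hosts; NTFS on its own has no such mechanism, hence one server per LUN.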
Right now, as far as I can tell, there is only one way to share storage with Windows from a SAN running free software (there are a bunch of pay-for solutions; HP and Oracle have two of the largest). That, as mentioned in another post, is via ESXi, which provides a shared block filesystem (a single instance under the OS, shared by each VM). I would need to virtualize an Openfiler install for each client I want to share the storage with. For me that would be four VMs, with the corresponding overhead. I plan on looking at that in another article/series on multipathing (FC and iSCSI) and shared iSCSI.
Even in that scenario I probably wouldn't serve storage much faster than with my logical NAS. I'd incur the VM overhead and the reduced payload of TCP/IP (versus FC), and it would be a pain to admin: new node? new VM. But it would be interesting to see.
Beyond all of these reasons, writing an article about building a NAS has been done, over and over. I'd really have nothing new to say. The new thing here is to show you can do high performance fibre channel on the cheap (which was not a sure thing when I started this series).
As to overkill, a question I anticipated, ever see The Exorcist in an altered state?