Testing New Hard Drives

Shelbrain

Occasional Visitor
Hello,
When you guys purchase new hard drives for your NAS or servers, what do you use to test the drives for errors? Do you use tools built into Windows, Linux, or Mac? I recently purchased a 6TB WD Red drive to put into my home-built server and was researching the best method for testing drives for errors within Newegg's return window.

Right now, the solution I am using is called stressdisk (https://github.com/ncw/stressdisk). Is anyone familiar with this software? It has been running for about 15 hours and so far no errors have been found.
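For anyone who wants to try it, the basic usage per the stressdisk README is roughly the following - the mount point here is just an example, point it at wherever the new drive is mounted:

# stressdisk run /mnt/newdisk

It fills the disk with check files and then reads them back repeatedly, logging any mismatches.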
I would love to hear what everyone else is using! Thanks!
 
I run the QNAP tests continuously every day (quick and thorough), for as long as I can hold off putting any important data on new drives (sometimes, that can be up to a month). That tests a new NAS box and the drives together and has been great at detecting problem drives.

A drive that has only been tested for a matter of hours rather than days is not something I would trust to put into service for myself or my customers. While a month may seem like overkill for testing a new drive, it depends on how important the data you will put on it is, and how important it is to have that data available going forward.
 
When I install new drives, normally I'll let them soak for 48-72 hours before putting them into service - most drives will either die out of the box, within the first couple of days, or they'll run their service life...

I do monitor SMART stats, and if I see predictive failures, I'll replace the drive...
 
When I install new drives, normally I'll let them soak for 48-72 hours before putting them into service - most drives will either die out of the box, within the first couple of days, or they'll run their service life...

I do monitor SMART stats, and if I see predictive failures, I'll replace the drive...

I like that term 'soak'. :)

Will have to borrow it when appropriate.
 
I run the QNAP tests continuously every day (quick and thorough), for as long as I can hold off putting any important data on new drives (sometimes, that can be up to a month). That tests a new NAS box and the drives together and has been great at detecting problem drives.

A drive that has only been tested for a matter of hours rather than days is not something I would trust to put into service for myself or my customers. While a month may seem like overkill for testing a new drive, it depends on how important the data you will put on it is, and how important it is to have that data available going forward.

Are the QNAP tests available to people who don't own the hardware? I would love to try them on my hard drive if they are available somewhere on the Internet. I didn't mean to indicate that the test completed after 15 hours; in fact, it is still running at the moment, and I am not sure when it will finish. I wanted to validate my results using more than one piece of software to test the drive, so it may be several weeks before the drive is put into operation.
 
When I install new drives, normally I'll let them soak for 48-72 hours before putting them into service - most drives will either die out of the box, within the first couple of days, or they'll run their service life...

I do monitor SMART stats, and if I see predictive failures, I'll replace the drive...

I am not familiar with the lingo. Is there a specific process/procedure you follow to "soak" a new hard drive before putting it into operation?
 
plug it in, do a format, lay on a filesystem, and let it sit running before doing any additional work... most drives will, like I mentioned earlier, fail upon first use, or die shortly after...

It's like letting the pots and pans "soak" in the sink to soften up the dried-up/cooked-on stuff...

No need to "stress" a drive, i.e. burn it in, these days... that just puts additional wear on the drive.
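If you want the concrete steps, a minimal soak on Linux might look something like this (assuming the new drive shows up as /dev/sdX and smartmontools is installed - substitute your actual device and mount point):

# parted --script /dev/sdX mklabel gpt mkpart primary ext4 0% 100%
# mkfs.ext4 /dev/sdX1
# mkdir -p /mnt/soak && mount /dev/sdX1 /mnt/soak
# smartctl -A /dev/sdX > /tmp/smart-before.txt

...let it sit powered on for a few days, then compare:

# smartctl -A /dev/sdX > /tmp/smart-after.txt
# diff /tmp/smart-before.txt /tmp/smart-after.txt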
 
And if you really do want to do an end-to-end check of the drive, and you're running Linux...

badblocks can run a destructive read/write test across the entire disk...

# badblocks -c <NUMBER_BLOCKS> -wsv /dev/sdX    (-w destructive write test, -s show progress, -v verbose, -c blocks checked at a time)

or use shred... if shred fails, the drive is likely bad.

# shred --verbose --random-source=/dev/urandom -n1 /dev/sdX

This will write a pass of random data across the entire disk...

just make sure you're pointed at the right disk, or you'll be in for a heck of a surprise...
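Before running either of those, double-check which device node is actually the new drive - /dev/sdX above is a placeholder. Something like:

# lsblk -o NAME,SIZE,MODEL,SERIAL

lists every disk with its size, model, and serial, so you can match it against the label on the drive itself.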
 
Are the QNAP tests available to people who don't own the hardware? I would love to try them on my hard drive if they are available somewhere on the Internet. I didn't mean to indicate that the test completed after 15 hours; in fact, it is still running at the moment, and I am not sure when it will finish. I wanted to validate my results using more than one piece of software to test the drive, so it may be several weeks before the drive is put into operation.

As far as I know, the tests are not available unless you have a QNAP NAS. :(
 
Ok, thanks for the advice. This is the first non-SSD drive I've bought in a while, and I wanted to make sure that it was okay. It will replace an older 1.5TB Western Digital Green drive that is still running after nine years in my computer. I will be getting a second WD Red when funds permit, for redundancy. I'm amazed at the amount of storage available now. It will be used for Plex, two IP cameras, Calibre, and backing up other folders on the network.
 
I agree - put a few days' time on a new drive before using it in a NAS.
It's the "bathtub curve" failure syndrome: failures tend to occur early or late in a product's lifetime.
 
When you guys purchase new hard drives for your NAS or servers, what do you use to test the drives for errors? Do you use tools built into Windows, Linux, or Mac? I recently purchased a 6TB WD Red drive to put into my home-built server and was researching the best method for testing drives for errors within Newegg's return window.
Probably the biggest contributor to out-of-the-box and infant mortality failures on drives is improper packaging / handling in the chain between the manufacturer and you. Some sellers just throw naked drives (hopefully in antistatic bags) in a box with some pillow-pak or similar stuff and hope it makes it to you in one piece.

Manufacturers package drives in one of three ways:
  • "Retail" or "white box" packaging. Normally includes foam or cardboard inserts to hold the drive in the box. May come with additional items (jumpers, installation guides, etc.) not found in bulk or mega bulk packaging.
  • Bulk packaging for distributors (such as Tech Data). You buy a box of 20 drives and you get drives with close-to-consecutive serial numbers, packaged in a nice shipping box. You probably don't get any of the extras mentioned above - the manufacturer assumes you're a system integrator.
  • "Mega" bulk packaging. For shipment from the manufacturer direct to OEM customers like HP, Dell, etc. Usually comes as a shipping container full of drives (drive packaging varies).
Where you run into problems as an end user is when you purchase what you think is a retail / white box drive, but what you actually get is a drive pulled from bulk packaging.

Regardless of where you got your drive(s), you should confirm the warranty status and warranty end date once you have your drive's serial numbers. Some manufacturers sell drives with a variety of warranty options, and the warranty was ticking away between when the drive left the manufacturer and when it got to you. If you purchased through an authorized distributor, you should be able to have the manufacturer update their records to show the warranty starting the day you received the drive(s).
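On Linux, for example, you can read the model and serial number off the drive electronics without opening the case (assuming smartmontools is installed; /dev/sdX is a placeholder):

# smartctl -i /dev/sdX | grep -E 'Model|Serial'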

Back to testing - some drives offer a SMART "conveyance test" specifically designed to look for errors introduced during shipping. Other than that, I'd suggest a SMART long test followed by some full-disk reads and writes.
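With smartmontools on Linux, kicking those off looks roughly like this (run smartctl -c /dev/sdX first to confirm which self-tests the drive supports; /dev/sdX is a placeholder):

# smartctl -t conveyance /dev/sdX
# smartctl -t long /dev/sdX
# smartctl -l selftest /dev/sdX    (shows the results once the tests finish)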

If a drive doesn't fail within a few weeks, it likely won't fail until old age. I occasionally purchase used drives (mostly 15K SAS drives for Dell servers) and I've had drives come in with over 5 years of power-on hours and still have zero grown defects.
 
plug it in, do a format, lay on a filesystem, and let it sit running before doing any additional work... most drives will, like I mentioned earlier, fail upon first use, or die shortly after...

It's like letting the pots and pans "soak" in the sink to soften up the dried-up/cooked-on stuff...

No need to "stress" a drive, i.e. burn it in, these days... that just puts additional wear on the drive.

Agreed. The only thing I do on top of that is check the SMART data for the drive. I'll typically run a short drive test in a SMART tool, or use Seagate's HDD tool, which does good short and long drive tests. I'll then load up whatever data I was going to put on the drive and recheck the SMART data.

Then call it a day. I'll periodically check the SMART data, maybe monthly. Since I run my HDDs in arrays, Intel's RAID utility will report to me if a drive reports a SMART failure (because it'll mark the array as compromised until you unmark it). Most issues that mean creeping death for a drive will show up in the SMART data and will also trigger the Intel RAID utility to mark the array as bad. This includes errors that don't mean the drive is dead, but that it is starting to spiral toward death - like pending sector errors, where the drive failed to read the data from a sector but the sector isn't marked as bad yet. Only when the read fails a certain number of times for that sector will the drive mark it as bad and bring a reserved sector online.
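For anyone who wants to spot-check those attributes by hand, smartctl will print them (the attribute names below are the conventional ones; vendors vary slightly):

# smartctl -A /dev/sdX | grep -E 'Reallocated_Sector_Ct|Current_Pending_Sector|Offline_Uncorrectable'

Non-zero raw values on any of those are worth watching.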
 
Agreed. The only thing I do on top of that is check the SMART data for the drive. I'll typically run a short drive test in a SMART tool, or use Seagate's HDD tool, which does good short and long drive tests. I'll then load up whatever data I was going to put on the drive and recheck the SMART data.

That's always the SMART thing to do - pardon the pun :D

In the data centers, most servers have a service processor (Integrated Lights-Out Management, or iLO for short, for example), and as part of that, we run an IPMI daemon that checks HW status on a periodic basis... so predictive failures are pretty easy to catch, as long as someone is monitoring things...
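As a rough sketch, with ipmitool installed and a BMC present, the day-to-day checks look like:

# ipmitool sensor    (current hardware sensor readings)
# ipmitool sel list  (system event log, where hardware faults get recorded)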
 
My iLO service processor is tapping on the box and seeing if the magic smoke has been released yet or not :)

Otherwise it is whatever I dig up manually, sadly.

I did have an interesting drive the other day where my machine would refuse to POST with it attached. That one went straight back.
 
My iLO service processor is tapping on the box and seeing if the magic smoke has been released yet or not :)

Otherwise it is whatever I dig up manually, sadly.

I did have an interesting drive the other day where my machine would refuse to POST with it attached. That one went straight back.


Huh?
 
Which part? I am joking in the first. The middle one is pretty self-explanatory. The final one is kind of the same - I had an HDD that, when hooked up, kept my machine from POSTing.
 
Then call it a day. I'll periodically check the SMART data, maybe monthly. Since I run my HDDs in arrays, Intel's RAID utility will report to me if a drive reports a SMART failure (because it'll mark the array as compromised until you unmark it). Most issues that mean creeping death for a drive will show up in the SMART data and will also trigger the Intel RAID utility to mark the array as bad. This includes errors that don't mean the drive is dead, but that it is starting to spiral toward death - like pending sector errors, where the drive failed to read the data from a sector but the sector isn't marked as bad yet. Only when the read fails a certain number of times for that sector will the drive mark it as bad and bring a reserved sector online.

SMART tools are handy - and some report more than others... for Linux, the smartmontools package is very handy, esp. for DIY NAS builders - it may not be part of the typical default install, but it's available as a package for both RedHat and Debian (and Ubuntu).
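Installing it is typically a one-liner (package names shown are for the stock repos; exact locations vary a bit by distro):

# apt-get install smartmontools    (Debian/Ubuntu)
# yum install smartmontools        (RedHat/CentOS)

Enabling the smartd daemon then gets you periodic background checks; the stock smartd.conf (location varies by distro) is a reasonable starting point.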

For Macs, it's available in Homebrew, and I suspect also in MacPorts...

If smartctl reports a predictive failure, be proactive and replace the drive - I had this happen recently on a Mac Mini Server, and as luck would have it, it was the top bay, which requires a complete disassembly - so I will be replacing both drives (they are 4+ years old, and 7200RPM laptop drives get pretty warm) as a precautionary move.

sfx
 
