Should You Trust the MTBF on a Disk Drive?

When you read the specifications on most modern disk drives, you’ll likely come across a statistic called MTBF. MTBF is the “Mean Time Between Failures,” or the average time a device is expected to run before it fails.

Disk drives usually carry an MTBF of over 1 million hours. Given that there are 8,760 hours in a year, I suppose you could use simple math to say that an ‘average’ drive should last about 114 years (1,000,000 / 8,760).

Well, it’s pretty obvious from these numbers that nobody actually “tested” a drive to see if it would fail in 114 years…since disk drives haven’t even been around that long!

Rather, MTBF must be a theoretical calculation that an engineer “reasoned” based on some assumptions about the underlying component performance inside a drive.

The excerpts below, from coverage of one recent study, confirm what we all suspected: drives fail far more often than their MTBF suggests.

Customers replace disk drives at rates far higher than those suggested by the estimated mean time between failure (MTBF) supplied by drive vendors, according to a study of about 100,000 drives conducted by Carnegie Mellon University.

The Carnegie Mellon study examined large production systems, including high-performance computing sites and Internet services sites running SCSI, FC and SATA drives. The data sheets for those drives listed MTBF between 1 million and 1.5 million hours, which the study said should mean annual failure rates “of at most 0.88%.” However, the study showed typical annual replacement rates of between 2% and 4%, “and up to 13% observed on some systems.”
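To see where the “at most 0.88%” figure comes from, here is a minimal sketch in Python. It assumes the simplest possible model (a drive running around the clock with a constant failure rate), and the function name `annual_failure_rate` is made up for illustration; it is not how any vendor or the study authors necessarily compute the number.

```python
HOURS_PER_YEAR = 8760  # 24 hours * 365 days

def annual_failure_rate(mtbf_hours: float) -> float:
    """Approximate fraction of drives expected to fail in one year.

    Assumption: drives run continuously and fail at a constant rate,
    so the yearly rate is simply hours-per-year divided by MTBF.
    """
    return HOURS_PER_YEAR / mtbf_hours

# The MTBF range quoted on the data sheets in the study
for mtbf in (1_000_000, 1_500_000):
    print(f"MTBF {mtbf:,} hours -> about {annual_failure_rate(mtbf):.2%} per year")

# The replacement rates the study actually observed, for comparison
for observed in (0.02, 0.04, 0.13):
    ratio = observed / annual_failure_rate(1_000_000)
    print(f"Observed {observed:.0%} is roughly {ratio:.1f}x the 1M-hour figure")
```

Under these assumptions a 1 million hour MTBF works out to about 0.88% per year, so the observed 2% to 4% replacement rates are several times what the data sheets imply.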

Ashish Nadkarni, a principal consultant at GlassHouse Technologies, a storage services provider in Framingham, Mass., said he isn’t surprised by the comparatively high replacement rates because of the difference between the “clean room” environment in which vendors test and the heat, dust, noise or vibrations in an actual data center.

The studies won’t change how Tom Dugan, director of technical services at Recovery Networks, a Philadelphia-based business continuity services provider, protects his data. “If they told me it was 100,000 hours, I’d still protect it the same way. If they told me it was 5 million hours I’d still protect it the same way. I have to assume every drive could fail.”


Bottom Line: Don’t put too much trust in MTBF numbers. A 4% annual failure rate means that, on average, 1 in 25 hard drives will malfunction in the coming year. The physical drive will likely cost less than $200 to replace. However, the information on that drive could be critical to your business and may be irreplaceable.
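If you want to put the 4% figure in terms of your own office, the rough sketch below may help. It assumes every drive fails independently with the same yearly rate, which is a simplification (drives from the same batch or in the same hot server room tend to fail together), and the function names are invented for this example.

```python
def expected_failures(drive_count: int, annual_rate: float) -> float:
    """Average number of drives expected to fail in a year."""
    return drive_count * annual_rate

def chance_of_any_failure(drive_count: int, annual_rate: float) -> float:
    """Probability that at least one drive fails in a year.

    Assumption: drives fail independently of one another.
    """
    return 1 - (1 - annual_rate) ** drive_count

# A small office with 25 drives at the 4% rate from the study
drives, rate = 25, 0.04
print(f"Expected failures per year: {expected_failures(drives, rate):.1f}")
print(f"Chance of at least one failure: {chance_of_any_failure(drives, rate):.0%}")
```

With 25 drives the expected count is one failure per year, and under the independence assumption the chance of seeing at least one failure is well over half.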

Back up your critical business data every day, automatically, with Dr.Backup.

Excerpts from: Robert L. Scheier, Computerworld –