When to Replace Hard Drives


Joe, Jim, and Allan answered a listener question about hard drive longevity on the 2.5 Admins podcast recently, and my experience tracks with their points as well. The episode also made me wonder what my typical drive life span is, and when I consider swapping them out.

Hard drives have long been famous for their bathtub curve of reliability, in which they’re:

  1. Initially perceived as unstable or unreliable, resulting in warranty claims and replacements. It makes sense; if a drive is marginal from the factory, it’ll likely manifest early in the drive’s life.

  2. Reliable for many years, if it passes that initial spike.

  3. Gradually less stable and reliable once they cross a certain age threshold, until they eventually go kaput.
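
For anyone who prefers numbers to metaphors, here is a minimal Python sketch of that bathtub shape. It assumes the failure rate can be approximated as the sum of a decreasing early-failure term, a constant background term, and an increasing wear-out term, with the first and last modelled as Weibull hazards. Every parameter is made up for illustration and none of it comes from real drive data.

```python
def weibull_hazard(t, shape, scale):
    """Weibull hazard rate h(t) = (k/s) * (t/s)**(k-1), valid for t > 0."""
    return (shape / scale) * (t / scale) ** (shape - 1)


def bathtub_hazard(t_years):
    """Sum of three failure modes. All parameters are illustrative guesses."""
    # shape < 1 gives a falling rate: marginal drives failing early
    infant = weibull_hazard(t_years, shape=0.5, scale=20.0)
    # constant background rate during the reliable middle years
    background = 0.02
    # shape > 1 gives a rising rate: mechanical wear-out
    wear_out = weibull_hazard(t_years, shape=5.0, scale=8.0)
    return infant + background + wear_out


for year in (0.1, 1, 3, 5, 8, 10):
    print(f"year {year:>4}: hazard ~ {bathtub_hazard(year):.3f} per year")
```

With these invented parameters the rate starts high, bottoms out in the middle years, then climbs again. Where the climb begins depends entirely on the numbers chosen, so treat it as a picture of the curve and not a prediction.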

The gents talk about a five year lifecycle, especially for consumer drives. Drives may operate acceptably after this point, but they should be relegated to secondary duties or gradually swapped out, because they should be considered suspect. That tracks with our experience and recommendations at work as well, though we’ll come back to the residential angle.

That said, Allan also raised operating environments. Like him, we’ve had drives in arrays at work that have operated in ancillary roles since the company started in 2011, thanks to the clean power, lack of vibration, and cool air they’re fed in our respective data centres. Enterprise drives tend to be built in a more robust fashion, and if they’re running continuously with few mechanically taxing spin up/down cycles, most run indefinitely. Ironically, it’s then their commercial viability that ends up taking them offline… is a 400 GB SCSI drive worth the power it’s using in 2025?

Residential settings are trickier, because of the financial angle. Replacement drives, arrays, or entire servers are baked into business continuity plans and support contracts in commercial settings (or at least, should be), with the cost of the drive amortised over the perceived lifetime of the drives, or at least budgeted for when they go to that drive array in heaven. The drives running in our small homelabs and residential servers are much more expensive relatively speaking, and thus we’re more inclined to extend their useful lives and coddle them, rather than treat them as disposable.

The oldest drives in our home FreeBSD server are a pair of 8 TB WD Reds running in an OpenZFS mirror, which we bought in 2015. As the 2.5 Admins gents suggest, we use this as a secondary backup, and an overflow for our media server for material that would be easy(ish) to replace. Aka, not family photos or our Minecraft server! I am a little squeamish that they’re still spinning in that machine today, but some things give me confidence:

  1. I’m using OpenZFS on them, and have fortnightly scrubs scheduled. I would trust the corrected errors reported back from these scrubs more than SMART data, though I also run smartmontools.

  2. Like hostnames, I give each drive a partition label based on an anime character and/or Starfleet ship, appended with the serial number of the drive so I can reference it in /dev/gpt and easily identify it for diagnosis and potential replacement. This is all definitely (cough) important.

  3. They’re not in a data centre, but they’re still running continuously in a small homelab box that is taken care of. Aka, not kicked while sitting under the coffee table by accident. I hand carried the entire machine when we moved house.

  4. We have a specific envelope budget amount stashed for life emergencies, so in a pinch we can replace the drives.
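
The labelling scheme in the second point is easy to keep consistent with a small helper. This is a rough Python sketch only: the function names, the cleanup rules, and the example nickname and serial are all hypothetical, and the only detail taken from the setup above is that labelled partitions show up under /dev/gpt.

```python
import re


def drive_label(name: str, serial: str) -> str:
    """Join a nickname and a drive serial into one partition label."""
    # Keep only lowercase letters and digits from the nickname
    clean_name = re.sub(r"[^a-z0-9]", "", name.lower())
    # Keep only letters and digits from the serial, preserving case
    clean_serial = re.sub(r"[^A-Za-z0-9]", "", serial)
    if not clean_name or not clean_serial:
        raise ValueError("name and serial must each contain letters or digits")
    return f"{clean_name}-{clean_serial}"


def gpt_device_path(label: str) -> str:
    """Path under which a GPT-labelled partition appears on FreeBSD."""
    return f"/dev/gpt/{label}"


# Hypothetical nickname and serial, for illustration only
label = drive_label("Defiant", "WD-ABC 123")
print(gpt_device_path(label))
```

The helper only builds the string; applying the label to a partition is still a job for your partitioning tool of choice. Having the serial in the device path means the name in a pool status report matches the sticker on the physical drive.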

I still need to get a proper UPS that can assist with delivering cleaner power to this box, but overall is an item of clothing. I will probably be looking at replacing these drives sooner rather than later though, because for all the talk about finances, we’re fortunate that we can place a higher value on our time and data. There’s value in peace of mind.

I should also point out that many of these concerns and constraints differ when we start talking about SSDs, which are more affected by their number of write cycles. SLC and enterprise-grade SSDs have higher write endurance, but you’ll still be wanting to monitor this activity more than you might a commercial hard drive you’ve bought for use at home.
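
As a rough illustration of what monitoring write activity can look like, the Python sketch below compares total bytes written against a manufacturer's rated endurance (often quoted as TBW, terabytes written). It assumes you already have both numbers; how you read bytes written from a given SSD varies by vendor and is left out. The linear extrapolation and the example figures are assumptions, not measurements.

```python
TERABYTE = 1e12  # drive vendors quote decimal terabytes


def endurance_used(bytes_written: float, rated_tbw: float) -> float:
    """Fraction of the rated write endurance consumed so far."""
    return bytes_written / (rated_tbw * TERABYTE)


def years_remaining(bytes_written: float, rated_tbw: float,
                    years_in_service: float) -> float:
    """Years until the rating is reached, assuming a steady write rate."""
    if bytes_written <= 0 or years_in_service <= 0:
        raise ValueError("need positive bytes written and time in service")
    rate_per_year = bytes_written / years_in_service
    remaining = max(rated_tbw * TERABYTE - bytes_written, 0.0)
    return remaining / rate_per_year


# Hypothetical drive: 600 TBW rating, 150 TB written over 3 years
written = 150 * TERABYTE
print(f"used: {endurance_used(written, 600):.0%}")
print(f"years left at this rate: {years_remaining(written, 600, 3):.1f}")
```

A rating is not a cliff, and drives can fail well before or long after reaching it, so a figure like this is better used as a prompt to plan a replacement than as a countdown.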

So I guess to actually answer the question: when do you replace hard drives!? At work, we aim to cycle them out fairly often for customer production. At home, this is extended with more active monitoring and a file system you can trust—OpenZFS!—to keep an eye on issues, though a decade is likely pushing it. That’s just my 20c, adjusted for inflation.
