You might expect that a company that uses 27,134 of a thing would be a pretty fair judge of what makes those things good or bad. That’s what makes a recent series of blog posts by Backblaze so interesting. Basically, on top of its side business of storage design, it now has a side business of storage hardware reviews.
As you may recall, the company’s MO is that instead of using real real big storage, it uses a whole whole lot of commodity storage devices hooked together into “pods,” with as much of the extraneous stuff stripped off as possible. This reduces costs and is more scalable than large storage systems that require forklift upgrades to expand. Companies such as Netflix are using the design as well, and several vendors have started selling storage systems based on the Backblaze designs. While the company occasionally has trouble finding commodity disk drives, in general the system works pretty well.
While the reviews – three of them thus far, on expected drive lifetimes, drive reliability, and “Which hard drive should I buy?” – do have a weensy bit of a Backblaze sales pitch in them, they’re also crammed full of good information, including charts and graphs.
“Why do we have the drives we have?” writes distinguished engineer Brian Beach. “Basically, we buy the least expensive drives that will work. When a new drive comes on the market that looks like it would work, and the price is good, we test a pod full and see how they perform. The new drives go through initial setup tests, a stress test, and then a couple weeks in production. (A couple of weeks is enough to fill the pod with data.) If things still look good, that drive goes on the buy list. When the price is right, we buy it.”
All in all, the review features 15 common models of hard drives, from vendors such as Hitachi, Western Digital, and Seagate. It doesn’t claim to be the be-all and end-all of storage hardware product reviews – simply ‘Of the ones we used, these were our results.’
And Backblaze seems to do a pretty good job of tracking those results. “We have detailed day-by-day data about the drives in the Backblaze Storage Pods since mid-April of 2013,” writes Beach in his drive reliability blog post. “With 25,000 drives ranging in age from brand-new to over 4 years old, that’s enough data to slice the data in different ways and still get accurate failure rates. We have data that tracks every drive by serial number, which days it was running, and if/when it was replaced because it failed. We have logged 14719 drive-years on the consumer-grade drives in our Storage Pods, [and] 613 drives that failed and were replaced.”
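For a rough sense of what those two numbers mean together, you can divide the failures by the drive-years to get an overall annual failure rate. The sketch below is my own back-of-the-envelope arithmetic, not a figure Beach states, and the function name is made up for illustration. It also lumps every model together, which is exactly the averaging the per-model reviews are meant to get past.

```python
def annualized_failure_rate(failures: int, drive_years: float) -> float:
    """Return failures per drive-year of service, as a percentage.

    Hypothetical helper for illustration; not taken from the Backblaze posts.
    """
    if drive_years <= 0:
        raise ValueError("drive_years must be positive")
    return 100.0 * failures / drive_years


# Figures quoted above: 613 replaced drives over 14719 logged drive-years.
afr = annualized_failure_rate(failures=613, drive_years=14719)
print(f"{afr:.2f}% of drives failing per year of service")  # prints 4.16%
```

In other words, taken as one big pool, roughly four drives in a hundred failed per year of running time.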
In addition to the reviews themselves, Backblaze allows people to comment on them, so there’s all sorts of hard-core storage wankery to read, if you’re into that sort of thing. (If you’re really into that kind of thing, check out the Slashdot writeup and those comments.)
Needless to say, some of the computer magazines and websites whose bread and butter is product reviews aren’t quite sure what to make of this. Naturally, the Backblaze data – whether you agree with it or not – is way cool to any reviews nerd, and somebody who has 27,000 disk drives in their shop and full statistics on them can have a little more credibility than someone who’s testing a single device.
“We chronicle Backblaze’s failed attempt to provide credible HDD reliability data,” writes Paul Alcorn in TweakTown, who goes on to criticize the posts as a publicity stunt and to pick at their methodology. “Read on to find out why you should pay no attention at all.”
“I wasn’t impressed last week when I saw Brian Beach’s blog on what disk drive to buy,” concurs Henry Newman in enterprisestorageforum.com, who criticized the blog post because it didn’t account for the different levels of I/O the drives might be experiencing. “I wasn’t impressed due to the lack of intellectual rigor in the analysis of the data he presented. In my opinion, clearly Beach has something else going on or lacks understanding of how disk drives and the disk drive market work.”
Others defended the Backblaze blog post. “I understand a test engineer’s desire for controlled environments and workloads for testing,” counters Robin Harris in ZDNet, criticizing the TweakTown critique. “But that isn’t the real world: some drives are busier; some have higher ambient temps; some come from a bad run; or get banged around in shipment.” He goes on to say, “So yes, as a consumer, I would look at Backblaze’s results. If I were upgrading my arrays tomorrow, I’d make an extra effort to buy Hitachi per the Backblaze experience. What they found squares with what I’ve heard from insiders over the last 10 years.”
Information like this, from mega users, could certainly revamp the entire testing industry. (Similarly, the company took it upon itself to declare in November that the Thailand-flood-caused drive shortage was over, based on what it saw in its own purchasing.) Consumer Reports, with its emphasis on real-world testing, has to be paying attention too. And as content marketing, it can’t be beat.
Now, what would be interesting is if some of the other companies that work by using huge quantities of commodity devices – such as Google or Facebook – followed suit with their information. Facebook is already revealing what it’s learned about server and storage design; it wouldn’t be much of a stretch for it to review the hardware the way Backblaze is doing.
(It turns out that this is a point Harris also made. “But rather than bash Backblaze for giving consumers the benefit of their experience, TweakTown should be asking, as I do, for other major drive users to come clean,” he writes. “I’m looking at you, Google, Amazon and Microsoft.”)
Of course, so could the NSA, but they aren’t talking.
Disclaimer: I am a Backblaze customer.