Posted by: Beth Pariseau
storage technology research, Strategic storage vendors
The storage market is a vibrant one right now, and as social networking concepts like blogs become more popular in the corporate world, the storage industry has a lively, varied blogosphere to match. Below are examples of some of the more interesting commentary I’ve seen lately, in case you missed them.
HDS user hungers for Flash
The blogs I particularly enjoy are those from users who are willing to be outspoken and open about their opinions on technology and vendors’ products. One of the most insightful user blogs out there is Ruptured Monkey, written by a handful of storage admins.
A recent Ruptured Monkey post that caught my eye was one by HDS user Nigel Poulton on solid state drives. Some of you may remember another post on this blog, “If EMC releases solid state drives in a forest…” that sprang from a follow-up interview I had with HDS Chief Scientist Claus Mikkelsen after EMC made solid state drives available for the Symmetrix. Mikkelsen’s statement was essentially that there’s no market for solid state drives right now, but if EMC has created the market, HDS “will just jump right in.”
At the time, I also spoke with a user at a big EMC shop who, I figured, given his company’s size and need for performance, would be among the most likely customers to buy into EMC’s new offering. Eventually, he said, but not right now: solid state drives are just too expensive.
But it might be time for HDS to allocate a few engineering resources to rolling out SSDs. Tape and removable-disk maker Imation saw fit this past week to get into this market with the release of two new SSD appliances; Intel is also reportedly planning a high-capacity SSD. And then I found the post on Ruptured Monkey from Poulton, who really wishes HDS would get on the SSD bandwagon.
You have to wonder what the disk manufacturers and array vendors think of the possible explosion of SSD into the mainstream? Are they eyeing up the current crop of SSD companies with the view to buying some of them up? Surely Seagate will have half an eye on this!
May be now is a good time to buy shares in some of these SSD players!?
So I guess now that Ive said all of that, only one small thing remains – Hitachi to announce support for SSD in the USP-V. Otherwise I will just have watch the rest with wanton eyes.
The post previous to that one on Ruptured Monkey is also worth a gander. Title: Is EMC developing controller-based virtualization?
Data Domain and active archiving
When Data Domain updated its OS to optimize its products for nearline storage, my expectation was that it would primarily function as nearline NAS for less-critical file shares. But I’m starting to hear about how Data Domain is being used for active archiving of emails for e-Discovery initiatives. That’s not the direction I expected users to take it, although it makes sense now that I think about it.
Meanwhile, everyone’s favorite backup guru, W. Curtis Preston, seems to have seen this coming a mile away. I just recently came upon his blog post from last year around the time of the announcement, and I think his observations then are still relevant. Especially interesting is what he points out about Data Domain and CAS:
A CAS system assigns an address to each stored object based on its content. In more technical terms, an object is addressed by a 128-bit MD5 hash or 160-bit SHA-1 hash that is created using an algorithm run against its contents. If two files are exactly the same, they will get the same hash and will be stored only once. If they differ by even one bit, they will not have the same hash and will both be stored in a CAS system.
Data Domain takes this a bit further and attempts to identify redundant data at the sub-file level. This means that if several versions of the same file are all sent to the same Data Domain system, it will identify the blocks/pieces/fragments of the files that are the same (and store them only once) and the blocks that are unique to each version, storing them as well. Depending on the application, this could result in significant space and cost savings. In addition, since Data Domain systems were originally engineered to meet the demanding throughput requirements of backup systems, they should be able to provide higher performance than today’s typical CAS system. (I speak only theoretically. I have not tested their new archive OS against a CAS system. I’ll be curious to see what happens out there.)
Like CAS systems, you’ll be able to replicate a Data Domain system to another Data Domain system for offsite protection. They have also been certified with leading archive software vendors. (It’s a lot easier to qualify your system when you have a filesystem interface, as opposed to the API that a certain CAS system requires you to program to. <ahem>)
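To make Preston's distinction concrete, here is a minimal Python sketch of whole-object content addressing versus sub-file deduplication. It is a toy illustration only: the `ContentStore` class, the `put_chunked` helper, and the fixed 4 KiB chunk size are assumptions made for the example, not a description of how Data Domain or any CAS product actually segments or stores data.

```python
import hashlib


class ContentStore:
    """Toy content-addressed store: each blob is keyed by its SHA-1 digest."""

    def __init__(self):
        self.blobs = {}

    def put(self, data: bytes) -> str:
        address = hashlib.sha1(data).hexdigest()
        # Identical content hashes to the same address and is kept only once.
        self.blobs.setdefault(address, data)
        return address

    def stored_bytes(self) -> int:
        return sum(len(blob) for blob in self.blobs.values())


def put_chunked(store: ContentStore, data: bytes, chunk_size: int = 4096) -> list:
    """Store data as fixed-size chunks (an assumed, simplified scheme)."""
    return [
        store.put(data[i:i + chunk_size])
        for i in range(0, len(data), chunk_size)
    ]


# Two 16 KiB "file versions" that differ only in the final byte.
v1 = b"".join(bytes([i]) * 4096 for i in range(4))
v2 = v1[:-1] + b"\xff"

# Whole-file addressing: one changed byte means a new hash, so both are stored.
whole = ContentStore()
whole.put(v1)
whole.put(v2)

# Sub-file addressing: the three unchanged chunks are shared between versions.
chunked = ContentStore()
put_chunked(chunked, v1)
put_chunked(chunked, v2)

print(whole.stored_bytes(), chunked.stored_bytes())
```

In this sketch the whole-file store keeps both 16 KiB versions, while the chunked store keeps only five 4 KiB chunks, which is the kind of saving Preston describes for multiple versions of the same file.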
NetApp whitepapers on storage subsystem failures
NetApp’s rebranding has it trying to downplay its geekitude a bit, but StorageMojo’s Robin Harris posted recently about some interesting whitepapers NetApp released at FAST on storage subsystem failures. These were in response to a challenge from Harris for storage vendors to respond to last year’s well-publicized research from Google and Carnegie Mellon University on disk-drive failures. NetApp’s response showed some interesting results, according to Harris:
The cynical, myself among them, might be tempted to dismiss the work as an exercise in self-justification. The studies find disk scrubbing useful in eliminating silent data corruption, a result any half-awake SE will use to their advantage.
But in Parity Lost and Parity Regained – nice Milton reference! – they also found that disk scrubbing could spread an error – parity pollution – across multiple disks. In fact,
. . . the tendency of scrubs to pollute parity increases the chances of data loss when only one error occurs.
This is honest research, following the data wherever it goes. It is the difference between science and spin.
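For readers who want to see how a scrub could make a single error worse, here is a small Python sketch using simple XOR parity over one stripe. It assumes a deliberately naive scrub that trusts the data blocks and rewrites parity whenever the two disagree; the block contents and the `xor_blocks` helper are made up for illustration, and this is not a model of NetApp's implementation or of the paper's full analysis.

```python
from functools import reduce


def xor_blocks(*blocks: bytes) -> bytes:
    """Bytewise XOR of equal-length blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# One stripe: three data blocks plus XOR parity.
d0, d1, d2 = b"AAAA", b"BBBB", b"CCCC"
parity = xor_blocks(d0, d1, d2)

# A single silent corruption changes what is actually on disk for d1.
d1_on_disk = b"BXBB"

# Before any scrub, parity still encodes the original d1, so it is recoverable.
recovered_before = xor_blocks(d0, d2, parity)

# Naive scrub (assumed behavior): on mismatch, trust the data and rewrite parity.
if xor_blocks(d0, d1_on_disk, d2) != parity:
    parity = xor_blocks(d0, d1_on_disk, d2)

# After the scrub, parity agrees with the corrupted block: the stripe looks
# consistent, and the original d1 can no longer be rebuilt from it.
recovered_after = xor_blocks(d0, d2, parity)

print(recovered_before, recovered_after)
```

The point of the sketch is only that once parity has been recomputed from bad data, the stripe checks out as healthy and the one copy of the truth that parity held is gone.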
Also interesting was Harris’s comparison between different vendor reactions to his challenge:
Working through the weekend, NetApp’s Val Bercovici [responded]. IBM did so a little later. EMC said semi-nothing.
Two weeks later a not-very-bright EMC’er sent an EMC lawyer to shut StorageMojo up. Some people are so-o-o sensitive.