We interviewed Fusion-io Inc. CTO David Flynn for one of our news stories today–here’s some nitty-gritty bonus footage on how the company’s product goes about protecting data, and how that compares to spinning-disk systems.
Beth: So one ioDrive is 320 GB. Is data striped across all the chips or do you have separate data sets?
Flynn: Each one of the Flash modules looks like a volume and you can either stripe them or mirror them to make them look like one volume. Or if you have multiple cards you can aggregate all of those volumes with RAID 10. We have RAID 5-like redundancy on the chips, then RAID between the memory modules. What we’ve come to realize after we introduced FlashBack is that it actually lets you get more capacity.
Most SSDs are 64 GB at most—32 GB, 64 GB. With this technology we put five to 10 times as many chips within our card. That would increase the failure rate because the individual chip’s failure rates add up. With our ability to compensate, we can get to higher capacities, and with that we can increase endurance, because you can spread the data out.
Internally it’s more like RAID 50 because I have eight die in my redundancy chip. There’s one parity die for each package. It’s 24+1 and then that quantity times eight, because there’s eight of those sets. If you were to line it up like disk drives, it would look exactly like that, 24 disk drives and then an extra one, 8 rows. So when we talk about this as a SAN in the palm of your hand we really mean it, because we’ve taken die within the various NAND packages and arrayed them together just like a disk array. It’s also self-healing in that if you have a fault the system reconstructs the data that otherwise might’ve gone missing and moves it to a different spot and turns off the use of the spot that failed. You don’t have to service it. It automatically just maps it out. Like Xiotech’s ISE product—that’s bleeding edge stuff for disk arrays, and it’s built into the silicon here.
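To make the 24+1 layout concrete, here is a minimal Python sketch of single-parity protection across one row of die, assuming plain XOR parity as in RAID 5. The interview does not spell out Fusion-io's actual scheme, so the function names (`make_parity`, `rebuild`) and the 4-byte block size are purely illustrative assumptions.

```python
from functools import reduce

DATA_DIE = 24  # data die per row, per the 24+1 layout described above


def xor_blocks(blocks):
    """XOR equal-length byte blocks together, column by column."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def make_parity(data_blocks):
    """Compute the parity block for one row of data blocks (assumed XOR)."""
    return xor_blocks(data_blocks)


def rebuild(surviving_blocks, parity):
    """Recover the single missing data block from the survivors plus parity."""
    return xor_blocks(list(surviving_blocks) + [parity])


# Hypothetical 4-byte blocks, one per data die in a single row.
row = [bytes([i, i + 1, i + 2, i + 3]) for i in range(DATA_DIE)]
parity = make_parity(row)

failed = 7  # pretend die 7 has failed
survivors = row[:failed] + row[failed + 1:]
recovered = rebuild(survivors, parity)
assert recovered == row[failed]
```

Each of the eight rows would carry its own parity block in this model, so one failed die per row can be rebuilt independently. A second failure in the same row before the rebuild finishes is the case single parity cannot cover.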
What about double parity protection? That’s all the rage in the disk drive world these days. What if more than one die fails at once?
For us to rebuild and heal takes a split second. Having a second failure during that time is not going to happen. It takes so long to rebuild a disk drive—it can take more than a day now—that the probability of a double failure goes up. The other thing is that disk drive failures are often highly correlated—the drives come from the same batch. They tend to fail randomly but close to each other in time. Our portfolio does cover n+m redundancy as well as N+1 because we anticipate a day when we’re putting not hundreds of these die on the boards but thousands and going into the tens and hundreds of thousands.
At the same time the Flash memory has finite write endurance, so they are all going to wear out at some point. So how do you compensate for that?
We account for how many write cycles it’s been through so we can give somebody a running…like an odometer, for tread wear on a tire. You can go five years or 50,000 miles. We warranty it, and you can swap out the modules without needing a new carrier card. Because we have such high capacity we naturally get a longer lifespan. It’ll last for 5 years even if you’re doing nothing but writing constantly. Wear-out has been overrated I think because most of the failures people are seeing have nothing to do with wear-out, they have to do with internal events that cause chips to lose data.
Here’s the four factors. This is the dirty little secret of the NAND world—it’s the newest fab process, which means it has its kinks. It’s the tightest feature size—they’re going to 32 nm. The density of the array of cells is achieved by sharing control lines. And then, fourth, and the real killer, to move the electrons into the floating gate cell it takes 20 volts internally. Most core voltages are well under a volt nowadays.
These four factors mean having a short-out event on one of these tiny little control lines—if you have just one chip it’s no big deal, it’s 40 out of a million. Which for a thumb drive, nobody would notice—it’s more likely to get shorted out in your pocket. But when you put hundreds of them together, now you have hundreds of those 40 out of a million chances to have something go bad, and that actually adds up to be something like one or two percent of these things fielded would have a data loss event. For a normal SSD the way they compensate is to put fewer chips on it or try to sweep the problem under the carpet—what they say if you talk to them is, ‘Well, we screen it very well, we run it in advance to make sure it’s not going to happen.’ You can screen it up front but there’s still probabilities of failure.
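The arithmetic behind that "one or two percent" figure can be sketched in a few lines of Python. This assumes each chip fails independently with the 40-in-a-million chance Flynn cites. The chip counts are illustrative assumptions, since the interview only says "hundreds."

```python
def p_any_failure(p_chip, n_chips):
    """Probability that at least one of n independent chips has a failure event."""
    return 1.0 - (1.0 - p_chip) ** n_chips


P_CHIP = 40 / 1_000_000  # "40 out of a million", from the interview

# Chip counts below are hypothetical; the interview only says "hundreds".
for n in (1, 250, 500):
    print(f"{n:>4} chips: {p_any_failure(P_CHIP, n):.4%}")
```

Under those assumptions the chance of at least one event is roughly 1% at 250 chips and roughly 2% at 500, which is consistent with the range quoted above.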
Here’s the thing: disk drives wear out, too. The trouble is, it’s unpredictable. One of the strongest motivators to going to solid state technology is the predictability of when you’re going to need to service it. And after a couple of years, you’re going to be able to replace it for a fraction of what it cost initially.
Not a month after an Israeli news source reported that EMC Corp. had been under investigation concerning government contracts in Israel, EMC revealed in its annual report filed with the SEC that it’s under investigation by the Civil Division of Department of Justice (DOJ). The DOJ investigation involves “allegations concerning (i) EMC’s fee arrangements with systems integrators and other partners in federal government transactions, and (ii) EMC’s compliance with the terms and conditions of certain agreements pursuant to which we sold products and services to the federal government, including potential violations of the False Claims Act.”
There’s no relation to the Israeli investigation, according to an EMC spokesperson. In another contrast with that case, in which EMC flatly denied comment, this time the company is flatly denying any wrongdoing will be found by the DOJ. “EMC did not make improper payments to business partners and did not violate the False Claims Act,” wrote the spokesperson in an email to SearchStorage.com. “The matters at issue in this case are historical in nature; some of the allegations relate to events nearly ten years old. We will vigorously defend this case and the many years EMC has spent serving the U.S. Government…”
The SEC filing reads,
The subject matter of this investigation also overlaps with that of a previous audit by the U.S. General Services Administration (“GSA”) concerning our recordkeeping and pricing practices under a schedule agreement we entered into with GSA in November 1999 which, following several extensions, expired in June 2007. We have cooperated with both the audit and the DoJ investigation, voluntarily providing documents and information, and have engaged in discussions aimed at resolving this matter without any admission or finding of liability on the part of EMC.
Storage vendors are announcing new deals in an effort to make their enterprise goods more tempting amid slashed storage budgets. Today, HP confirmed it is extending a 0% financing deal it had previously been offering with its servers to storage.
According to an HP spokesperson, the HP storage products included in this program are:
The move comes after HP reported double-digit revenue declines over most of its lines of business for its first fiscal quarter. The Enterprise Storage and Servers (ESS) group was no exception, with revenue of $3.9 billion, down 18%. Within that, storage revenue fell 7%; overall profit for the group was also down 14%.
HP joined NetApp in reporting earnings declines in a fiscal quarter that included January. (Interesting aside: Dell reported that its storage business, especially its low-end PowerVaults and EqualLogic midrange iSCSI SANs, did relatively well for its first fiscal 2009 quarter, with business up 7% though overall earnings slipped).
But in a recession this deep, some federal interest rates have also been cut to zero in the hopes of getting business moving again. Housing prices are so depressed that theoretically, they should be affordable to a whole new class of buyers. But neither of those things, nor so far all the King’s horses and all the King’s men, has done much for the markets, if only because everyone who still has a job is so afraid they’ll lose it by the end of this year that they aren’t spending, no matter how good the deal is.
Many enterprise storage users seem to be in a similar boat–these financing deals, like low home prices, would be irresistible in better times. Ironically, in bad times, they may not be enough.
Data Domain is bumping up its deduplication speed with an operating system upgrade.
Moving from OS 4.5 to 4.6 will improve the speed of Data Domain systems by 50% to 100%, depending on the protocol and network interface, according to the vendor’s VP of product management Brian Biles. The greater speed comes from code tweaks in the OS that let multi-core CPUs support more parallel streams.
The improvement with OS 4.6 is greatest for systems running 10-Gigabit Ethernet and Symantec’s NetBackup OpenStorage (OST) interface. For instance, max performance for the DD690 – Data Domain’s largest system – goes from 1.4 TB per hour to 2.7 TB per hour for a 90% increase using the new OS, according to Data Domain’s estimates. That’s with 10-GigE and OST.
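As a quick check on those numbers, the Python sketch below computes the percentage increase from the two throughput figures and shows what it could mean for a backup window. The 10 TB dataset is a hypothetical figure chosen for illustration, and because these are maximum throughput numbers the result is a best case.

```python
def percent_increase(old, new):
    """Relative increase from old to new, in percent."""
    return (new - old) / old * 100.0


def hours_to_back_up(dataset_tb, tb_per_hour):
    """Best-case backup time at a sustained throughput."""
    return dataset_tb / tb_per_hour


OLD_TB_PER_HOUR = 1.4  # DD690 with OS 4.5, per Data Domain's estimate
NEW_TB_PER_HOUR = 2.7  # DD690 with OS 4.6, 10-GigE and OST
DATASET_TB = 10.0      # hypothetical dataset size, not from the article

print(f"increase: {percent_increase(OLD_TB_PER_HOUR, NEW_TB_PER_HOUR):.1f}%")
print(f"before:   {hours_to_back_up(DATASET_TB, OLD_TB_PER_HOUR):.1f} h")
print(f"after:    {hours_to_back_up(DATASET_TB, NEW_TB_PER_HOUR):.1f} h")
```

The increase works out to a little over 90%, in line with the vendor's rounded figure.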
Is speed all that important for dedupe? Throughput often gets lost in the debate over dedupe ratios and the inline versus post-processing argument, but analysts and customers say speed is a major selling point for dedupe systems. Speed plays a big role in Data Domain’s inline deduping, which risks slowing backups because it dedupes while backups are taking place.
“Faster equals more data processed, which equals more data reduction,” Enterprise Strategy Group analyst Brian Babineau says. “Performance improvement is a means to other benefits, including storing more data in smaller footprint.”
Rich VanLare, Network Administrator for shopping center developer Regency Centers, has been using a Data Domain DD690 with NetBackup and OST since last October and was blown away by the speed with OS 4.5. VanLare says his goal was to decrease backups to below nine hours, and he is down to five hours since replacing tape with the DD690. VanLare says he’ll upgrade to 4.6, but he’s happy with his current system.
“The box is incredibly fast to begin with,” VanLare said. “Personally I don’t need it [improved speed] because it’s already exceeded my expectations.”
VanLare, who claims to get more than 90 percent compression, said the OST option was the main reason he chose Data Domain over VTLs from NetApp and Overland Storage.
“I have a lot of administrators getting into the interface, and I just wanted things to be simple,” he said. “OST tells an administrator exactly what happens if something fails.”
VanLare’s biggest concern with Data Domain is he won’t be able to add a second box at a DR site until the economy improves. “I wanted to do that this year,” he said. “With budgets as they are, I’m not sure that’s going to be approved.”
Here are some stories you may have missed this week:
As always, you can find the latest storage news, trends and analysis at http://searchstorage.com/news.
Two online data sharing services failed this week — one from a computing giant, and the other a small social bookmarking website.
That’s the trouble in this wild and wooly world of the cloud–especially in its early days. Not every service is going to make it, and then you’re going to have to figure out what to do with your data if your service fails.
Hewlett-Packard pulled the plug on HP Upline, and according to our Australian affiliate, ma.gnolia went under. SearchStorage ANZ reports that “in late January, ma.gnolia experienced a catastrophic data loss event and turned to backups to restore its database of users’ bookmarks. Both the primary and secondary backups failed irrevocably.”
Said a friend of mine who’s a Digg addict (I’m more a del.icio.us woman myself), “Losing my bookmarks would *hurt*.”
In the case of HP’s Upline online backup service, users will at least be able to get their data back. HP confirmed this afternoon that the service will be discontinued as of March 31. In a statement, an HP spokesperson said:
HP continually evaluates product lines and has decided to discontinue the HP Upline service on March 31, 2009.
HP will no longer be backing up customer files to the HP Upline servers as of Feb 26, 2009 at 8 am Pacific time. HP will keep the file restore feature of the Upline service operational through March 31, 2009 Pacific time in order for customers to download any files that have been backed up to Upline.
Blogger AppScout wrote disappointedly, “And so goes the story of one of the slickest online storage and backup services to launch in the past year.” Among Upline’s unique features was the ability for users to tag content for later search and share, and to publish files online using the service through a feature called the Upline Library. However, Upline crashed right out of the gate, drawing opportunistic marketing for competitors.
There are lots of interesting donnybrooks going on in this industry at any one moment, but EMC-NetApp is like the Red Sox-Yankees rivalry: imbued with a sense of historical inevitability, and capable of reaching heretofore undiscovered levels of bickering.
The latest series of skirmishes takes place on one of the most hotly contested battlefields of storage today–VMware. Specifically, the integration with, support of, and general glomming on to the server virtualization giant’s software.
There was a time when I would’ve guessed NetApp was the most-installed storage system with VMware, especially as VMware over NFS took hold at least in some enterprise shops. Server administrators and application admins were already familiar with running databases over NFS, and NetApp’s NFS was generally considered the best. Plus, NetApp has the whole multiprotocol thing going on with iSCSI and NFS in the same system, built in data protection tools, etc.
Not so, says EMC, triumphantly waving a newly released report from Forrester Research:
EMC…has been cited as “the most prevalent storage vendor in their overall environments” according to a survey of 124 global IT decision-makers currently using x86 server virtualization technology. The January 2009 report titled Storage Choices For Virtual Server Environments also revealed that 98 percent of the 124 survey respondents were using VMware ESX in their virtual server environments, and that 78 percent have virtual server technology in use for production application workloads.
According to the survey, 48 percent of respondents chose EMC as their brand of networked storage for virtual servers – nearly two times as many as the next closest vendor, IBM. Additionally, 63 percent of the respondents prefer to buy from a single storage vendor, which illustrates that buyers show a preference for working with a single storage vendor.
Furthermore, the report states:
There is little correlation between vendor and protocol selection. Surprisingly, there is not a strong pattern linking the choice of vendor to the preferred protocol. Even NetApp, with its strong heritage in file storage and ability to offer in-depth best practices for NFS in virtual server environments, still shows a prevalence of FC — NFS is the least common option. This is due to the following: 1) VMware did not add support for iSCSI and NFS until ESX Server 3.0; 2) storage vendors are generally protocol-agnostic — they support and recommend all available protocols; and 3) customers are often unwilling to diverge from what they know and use already.
This is also the case when it comes to storage vendors in general–companies generally don’t buy new storage systems or try new technologies just to support VMware, the report finds. More than half (53%) of the users surveyed were using EMC. That’s a much higher EMC sample than the 28% networked storage market share EMC had in IDC’s most recent quarterly storage report.
Besides the disproportionately high number of EMC customers surveyed and relatively small overall sample of 124 users, the Forrester report also discloses that “in terms of industry, financial services and insurance is the most prevalent, with 41% of respondents.” Both of those verticals tend to favor EMC.
According to Forrester’s survey, 25% were using IBM with virtual servers, followed by NetApp at 24% and Hewlett-Packard at 23%. I’m curious how many of the systems counted as IBM in the study are N-Series, which is NetApp under the covers.
Meanwhile, NetApp is not pulling any punches, continuing its VMware space-efficiency guarantee, this time extending it, as it did its primary storage data deduplication capabilities, to V-Series. That means it’s essentially offering a guarantee on third party storage from EMC, IBM, HP and Hitachi Data Systems fronted by a NetApp head.
Not everybody in the industry was impressed with the VMware guarantee. The guarantee is highly conditional, making it highly unlikely that NetApp would ever have to pay off.
As for applying NetApp services to third-party storage, the Forrester report seems to suggest that most users tend not to have more than one vendor in their environment, let alone attaching one vendor’s system to another’s. Can you imagine the finger-pointing if there were an issue?
So. We’re left with a relatively-small-sample-size report skewed toward one vendor on one side, and hollow guarantees on the other.
Oh! And one spoof of the battle scenes in 8 Mile that has to be seen to be believed.
There’s still a lot more ground to be gained and lost this year in the VMware marketplace, and who knows what events might come along to change the industry completely. In the meantime, though, we know there’ll never be a dull moment between the notorious NTAP and E-squared.
While one vendor’s blogger came to bury SPEC SFS, another came to defend it. The clash of vendors as yet seems unresolved.
The Standard Performance Evaluation Corporation (SPEC) SFS benchmark measures file server throughput and response time. The latest version, SPECsfs2008, was released last year.
But Sun FISHWorks blogger Bryan Cantrill wrote in a post called “Eulogy for a Benchmark” that the workload mix even in the most recent version remains outdated:
The 2008 reaffirmation of the decades-old workload is, according to SPEC, “based on recent data collected by SFS committee members from thousands of real NFS servers operating at customer sites.” SPEC leaves unspoken the uncanny coincidence that the “recent data” pointed to an identical read/write mix as that survey of…now-extinct Auspex dinosaurs a decade ago — plus ça change, apparently!
Moreover, Cantrill argued, the testing parameters for systems lead vendors to design NAS heads to perform well in the SFS test, which he said is at best irrelevant and at worst detrimental to a real-world environment. He also insists that SPEC benchmark results need to come with system pricing disclosures.
Enter NetApp blogger and senior technical director Michael Eisler, who called his response to Cantrill’s post “Chuckle for Today.”
the philosophy of SPEC SFS has always been to model reality as opposed to the idealist…dream where a storage device never has to process a request. P.S., in an earlier blog post, I made the argument that SPEC SFS 2008’s differences from SPEC SFS 3.0, show the caching on NFS clients has improved.
On the pricing disclosure issue:
Like many industries, few storage companies have fixed pricing. As much as heads of sales departments would prefer to charge the same highest price to every customer, it isn’t going to happen. Storage is a buyers’ market. And for storage devices that serve NFS and now CIFS, the easily accessible numbers on spec.org are yet another tool for buyers. I just don’t understand why a storage vendor would advocate removing that tool.
In storage, the cost of the components to build the device falls continuously. Just as our customers have a buyers’ market, we storage vendors are buyers of components from our suppliers and also enjoy a buyers’ market. Re-submitting numbers after a hunk of sheet metal declines in price is silly.
This is where Cantrill appears to take exception to Eisler’s taking exception, responding in a followup post that Eisler’s defense of the pricing non-disclosure is an “Alice-in-Wonderland defense.”
Mike’s argument — and I’m still not sure that I’m parsing it correctly — appears to be that the infamously opaque pricing in the storage business somehow helps customers because they don’t have to pay a single “highest price”! That is, that the lack of transparent pricing somehow reflects the “buyers’ market” in storage. If that is indeed Mike’s argument, someone should let the buyers know how great they have it — those silly buyers don’t seem to realize that the endless haggling over software licensing and support contracts is for them!
It’s not just this benchmark that is being debated in the storage industry–SPC benchmarks have also been a bone of contention between EMC and NetApp and between HP and EMC. Even in the comments on this blog I’ve heard everything from “Take the time to read the full disclosures, read the specifications…You might learn something” from a defender of SPC to a nonplussed “I really hope nobody uses SPC-1 results as any criteria for buying storage.”
‘zilla emphasizes that the Networker VSA is for demo purposes only, going so far as to say “Thou shalt not use this for production backups.”
But the curious part is EMC also offers a VSA for its Avamar ROBO/dedupe software that is meant for use in production.
I know that there are big differences between Avamar and Networker, especially in scale. Performance can also limit scalability in virtual appliances. But other companies have offered VSAs using scaled-back versions of software for use in smaller environments, similar to Networker Fast Start (at least according to how it’s described on the product page).
Update: ‘zilla let me know that you *can* run Networker in production on a VM, just not this particular time-limited VM.
EMC also has a not-for-production Celerra VSA. ‘zilla encourages a combo of the Networker and Celerra VSAs for a “NetWorker Advanced File Type Device.” But that device would still be not-for-production.
FalconStor basically calls this kind of configuration the Network Storage Server (NSS) and it’s available as a virtual appliance, very much for production use. EMC could have a competitor here with Networker and Celerra VSAs, but discourages their use in production. I’m not sure what to make of that.
In the meantime, there are more VSAs on the market now, for production use or otherwise, than you can shake a stick at. User/blogger Martin Glassborow (StorageBod) is putting several through their testing paces over at his place.