With data deduplication in the news today, I recommend checking out the responses to Jon Toigo’s questionnaire for data deduplication vendors. I found his questions about backing up deduped data to tape and the potential legal ramifications of changing data through dedupe especially interesting. The responses from the vendors so far about hardware-based hashing are also interesting, in that they seem to break down according to whether or not their companies offer a hardware- or software-based product.
It would be pretty disappointing if Hifn’s announcement of hardware-based hashing led to a religious war around software- vs. hardware-based dedupe systems. It’s clear (and has been generally accepted, or so I thought) that hardware performs better than software, meaning it’s in users’ best interest to improve the throughput of data deduplication systems by moving processor-intensive calculations to hardware. And the dedupe market is full of enough FUD as it is.
Speaking of which, Data Domain and EMC are getting all slapper-fight about dedupe thanks to today’s product announcement from Data Domain (and attendant comparisons to EMC/Avamar), and the fact that EMC is planning to finally roll out deduping tape libraries at EMC World (based on Quantum’s dedupe).
EMC blogger Storagezilla calls the statement by DD in a press release that its new product is 17 times faster than Avamar’s RAIN grid “nose gold” (props for the phraseology, at least), and then points out that Avamar’s back end doesn’t actually do any deduping, which is something I still don’t quite get.
So Data Domain’s box is faster at de-dup than the Avamar back end which doesn’t do any de-dup.
Since the de-dup is host based and only globally unique data leaves the NIC do I get to count the aggregate de-dup performance of all the hosts being backed up?
Yes, I do!
How does Avamar decide what data is ‘globally unique’? If this is determined before data leaves the host, then that processing must be done at the host. ‘Zilla even says he can count the aggregate performance of all the hosts being backed up in the dedupe performance equation...which brings me back to the first point again: Avamar’s back end doesn’t do de-dupe, but it’s faster at dedupe than Data Domain anyway?
Chris Mellor explored this further:
According to EMC, Avamar moves data at 10 GB/hr per node (moving unique sub-file data only). Avamar reduces typical file system data by 99.7 percent or more, so only 0.3 percent is moved daily in comparison to the amount that Data Domain has to move in conjunction with traditional backup software. This equals a 333x reduction compared to a traditional full backup (Avamar has customer data indicating as much as 500X, but 333X is a good average).
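For readers who want to check the math in that quote, here is a minimal sketch of the arithmetic. The function names are mine, not EMC's, and the "equivalent" rate assumes the quoted 10 GB/hr is entirely unique data, which is my reading of the quote rather than anything EMC has confirmed.

```python
def reduction_factor(percent_reduced: float) -> float:
    """Return the N-fold reduction implied by a percentage reduction."""
    remaining = 1.0 - percent_reduced / 100.0  # fraction of data still moved
    if remaining <= 0:
        raise ValueError("reduction must be below 100 percent")
    return 1.0 / remaining


def equivalent_logical_rate(unique_gb_per_hr: float, percent_reduced: float) -> float:
    """Hypothetical helper: GB/hr of protected data that a given rate of
    unique-data movement would correspond to. This is an assumption about
    how to compare the two vendors, not a vendor-published metric."""
    return unique_gb_per_hr * reduction_factor(percent_reduced)


if __name__ == "__main__":
    print(f"99.7 percent reduction is about {reduction_factor(99.7):.0f}x")
    print(f"99.8 percent reduction is about {reduction_factor(99.8):.0f}x")
    rate = equivalent_logical_rate(10, 99.7)
    print(f"10 GB/hr of unique data covers roughly {rate:.0f} GB/hr of source data")
```

So the 333x figure is just 1 divided by 0.003, and the 500x customer number would correspond to a 99.8 percent reduction. Neither says anything about how fast the hashing itself runs.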
‘An EMC spokesperson’ (should we assume it was, or wasn’t, Storagezilla himself?) further stated to Mellor:
“Remember that Data Domain has to move all of the data to the box, so naturally they’re focusing on getting massive amounts of data in quickly. EMC Avamar never has to move all of that data, so instead we focus on de-dupe efficiency, high-availability and ease of restore. Attributes that are more meaningful to the customer concerned with effective backup operations.”
Again I ask, where does the determination that data is ‘globally unique’ take place? It’s got to be taking up processor cycles somewhere. The rate at which it makes those determinations, and where it makes those determinations, would be the apples-to-apples comparison with DD, which is making those calculations as data is fed into its single-box system.
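To make the question concrete, here is a rough sketch of what host-side dedupe generally looks like. This is not Avamar's actual implementation: the fixed 4 KB chunk size, the SHA-256 hash, and the `DedupeStore` and `backup_from_host` names are all assumptions of mine for illustration.

```python
import hashlib

CHUNK_SIZE = 4096  # hypothetical fixed chunk size; real products vary


class DedupeStore:
    """Stand-in for a backup target that keeps one copy of each chunk."""

    def __init__(self):
        self.chunks = {}

    def missing(self, digests):
        """Return the subset of digests the store has never seen."""
        return {d for d in digests if d not in self.chunks}

    def put(self, digest, chunk):
        self.chunks[digest] = chunk


def backup_from_host(data: bytes, store: DedupeStore) -> int:
    """Hash every chunk on the host, then send only the chunks the store
    lacks. Returns the number of bytes that actually crossed the wire."""
    chunks = [data[i:i + CHUNK_SIZE] for i in range(0, len(data), CHUNK_SIZE)]
    # This hashing pass is the processor work done on the host itself.
    digests = [hashlib.sha256(c).hexdigest() for c in chunks]
    needed = store.missing(digests)
    sent = 0
    for digest, chunk in zip(digests, chunks):
        if digest in needed:
            store.put(digest, chunk)
            needed.discard(digest)  # skip repeats within the same backup
            sent += len(chunk)
    return sent


if __name__ == "__main__":
    store = DedupeStore()
    data = b"a" * 4096 + b"b" * 4096 + b"a" * 4096
    print(backup_from_host(data, store))  # first backup sends unique chunks
    print(backup_from_host(data, store))  # repeat backup sends nothing new
```

Even in a toy like this, every chunk gets hashed on the host before anything is sent, so the dedupe work hasn't gone away. It has moved to the machines being backed up.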
All of that is overlooking that the real meat and potatoes when it comes to dedupe is single-stream performance, anyway — total aggregate throughput over groups of nodes (which is really what both vendors are talking about) doesn’t mean as much. For one thing, Data Domain’s aggregate isn’t really aggregate, because it doesn’t have a global namespace yet. For another, I fail to see how EMC can even quote an aggregate TB/hr figure when talking about a group of networked nodes. Doesn’t network speed factor in pretty heavily to that equation?
Personally, I don’t think either vendor is really putting it on the line in this discussion (c’mon guys, get MAD out there ;)!). And if Avamar really performs better than Data Domain, why isn’t its dedupe IP being used in EMC’s forthcoming VTLs? (EMC continues to deny this officially, or at least refuses to confirm, but there’s internal documentation floating around at this point that indicates Quantum is the partner.)
Meanwhile, according to EMC via Mellor:
EMC says Data Domain continues to compare apples and oranges because it wants to avoid the discussion that there are a number of different backup solutions that fit a variety of unique customer use cases.
I have to admit this made me chuckle. Most of the discussions I’ve had about EMC over the last year or so have involved their numerous backup and replication products and what the heck they’re going to do with them all long-term. Finally, it seems we have an answer: Turn it into a marketing talking point!
I don’t think Data Domain even really wants to avoid that subject, either. They’re well aware that there are a number of different products out there that fit different use cases, given their positioning specifically for SMBs who want to eliminate tape.
At the same time, it’s interesting to watch the EMC marketing machine fire itself up in anticipation of a new major announcement–the scale and coordination are something to behold. This market has already been a contentious one. It’ll be interesting to see what happens now that EMC’s throwing more of its chips on the table.
According to a post on her corporate blog, Cisco’s senior vice president of the Data Center, Switching and Security Technology Group, Jayshree Ullal, is leaving the company after 15 years. Bloomberg reports that Senior Vice President, Internet Systems Business Unit John McCool will be taking over Ullal’s position, as well as her role in an advisory group to Cisco CEO John Chambers.
I’ve spoken with Ullal only once, in a Q&A after the bizarre NeoPath affair. She discussed Cisco’s plans to meld the file virtualization product it immediately discontinued into its data center virtualization products, making file virtualization a network service. Will that come about? It remains unclear, and the media-savvy Ullal would not put a time frame on it in our interview.
It’ll be interesting to see where the influential Ullal ends up. The only clue she gives in her blog is that she hopes “to re-kindle passions” for her next new gig this summer and then decide. Hard to tell right now if there’s a deeper story here, but it may be worth noting, as Bloomberg does:
Ullal’s departure follows the December resignation of Chief Development Officer Charles Giancarlo, who quit to join private-equity firm Silver Lake. Mike Volpi, a senior vice president who left in February 2007, became CEO of Internet television provider Joost in June.
Cisco fell 21 cents to $25.49 today on the Nasdaq Stock Market. The shares have dropped 5.8 percent this year.
I need to start a category on this blog called “Vendorfights.” Today’s squabble comes from two e-discovery players. In this corner: Kazeon, which recently announced that they can do your data collection work for the price of a latte. In the other corner: Clearwell, whose corporate blogger responded to that with snark:
The answer (in press releases, as in politics) lies in definitions. Exactly what sort of processing would you be getting for your four dollars and change?
You’ll have to ask Kazeon to get the answer to that one, but give a venti latte to a bleary-eyed e-discovery service provider who’s just pulled an all-nighter preparing for a meet-and-confer, and they’ll tell you all about the nuances, complexities, and risks inherent in e-discovery processing that may be difficult for enterprise search/information lifecycle management vendors to grasp.
I found out about this from a Kazeon rep (despite how severely Clearwell dissed them). To the contrary, Kazeon sees this as the start of a price war in this space as competitors flood in.
Another e-discovery blogger (Who knew there were so many?) agrees:
Any way you crunch the numbers, position the cost or spin the offering, it is just flat alarming and bordering on unbelievable for both users and technology vendors in the eDiscovery market. Bottom line, whether or not you believe that Kazeon is comparing true eDiscovery apples with the rest of the apples in the market, it doesn’t matter as this is definitely the first shot across the bow of the rest of the eDiscovery vendors.
It’s a draw for me so far, being new to this debate. What do YOU think?
Isilon had mixed financial results last quarter, reporting higher revenue ($24.1 million) than expected while losing more money ($10.1 million) than in any quarter last year. But the most important item on Isilon’s scoreboard these days doesn’t have a hard number affixed to it. That’s confidence among customers and investors.
People clearly lost confidence in Isilon during its 2007 struggles, and they haven’t yet regained it. Isilon CEO Sujal Patel said there were “headwinds” that prevented Isilon from picking up more new customers last quarter. These headwinds came from Isilon’s financial restatement followed by an audit report of questionable sales practices last year. Fortunately for Isilon, it picked up more sales in repeat orders from customers already in the fold last quarter than in any previous quarter.
“Headwinds had some impact, raising uncertainty in customers’ minds,” Patel said. “And some of our competitors may have used it against us as a competitive advantage.”
Patel said these headwinds will “take some time to dissipate,” which means he expects them to continue at least for another quarter.
Winning back confidence among investors will likely take even longer. Although operating expenses declined, Isilon remains a long way from turning a profit. And Isilon executives still refuse to give guidance for this quarter or this year, giving the impression that even they lack confidence in their ability to execute.
“Due to the lack of clarity on the forward business model, current investor sentiment remains subdued,” analyst Tom Curlin of RBC Capital Markets wrote in a note to clients today. “We expect sentiment to remain subdued pending evidence of improving execution in the coming quarters.”
Patel pledged to continue to upgrade Isilon’s clustered storage systems, promising major upgrades this year. Isilon can use any product edge it can get, with Hewlett-Packard and EMC jumping into the clustered storage game and NetApp pushing to integrate its OnTap GX clustered file system into its regular OnTap operating system.
Overall, Patel said he was encouraged by the quarter. “Although it’s still early days, I view this as an important step in our path to profitability.” Investors weren’t quite as enthusiastic, although Isilon’s share price rose $0.08 to $4.85 today.
Is it just me, or is there a bit of a sour mood going around? Must be the economy.
But angst makes for good blogging – it’s a time-honored formula. Below is a grab bag of some edgy IT blog posts from the last week or so.
Two small vendors trying to make their way in markets dominated by storage giants made incremental yet interesting offerings this week.
Mosso, a division of Rackspace, rolled out a cloud computing platform called the Hosting Cloud in February and followed with the release of MailTrust email hosting. Those first two services are intended for users who run websites. The Hosting Cloud includes storage space, backup, patching and security that developers can execute Web code on top of. MailTrust is meant to provide messaging in that website context.
This week, Mosso disclosed that it will branch out a bit later this year with CloudFS, a cloud-based storage-only service more like Amazon’s S3 than not. As with Amazon’s service, CloudFS will provide a place for users to put files and objects on the Web and will require developers to come up with their own interfaces. According to Mosso’s co-founder Jonathan Bryce, one distinction with CloudFS is that it will have packages of supported coding libraries for each major language including .Net, Java, PHP, Ruby and Python.
The company is “committed to fanatical support” and consistency for developers, according to Bryce, and is hoping that some ISVs will write a hosted online backup interface for it the way they have with S3. Target pricing for the service will be about 15 cents per GB per month, plus bandwidth costs for non-Rackspace customers (existing Rackspace hosting customers pay no bandwidth fees for CloudFS). The service is in private beta now.
Meanwhile, Monosphere launched version 3.7 of its StorageHorizon SRM software. This version will allow customers to make fine-grained maps of their storage capacity against VMware deployment–i.e. “the storage relationships between array LUNs, the ESX server, VMware file systems (VMFS), VMware virtual disks (VMDK), guest OSs, and guest OS file systems/raw devices” according to Monosphere’s press materials.
But what’s really getting some play in the market lately is Monosphere’s claim that it can identify not just resource allocations but actual resource utilization, to a fine degree–identifying “dark” storage, which is free for use but unmapped. Monosphere has been making this claim since at least last year (I remember them talking about it with me in briefings long before this week) but it seems they’re getting more attention for it now. Among the blogs commenting on this “dark” approach to SRM is Jon Toigo’s DrunkenData:
I am not sure whether Monosphere came up with this term, but I like it. Dark Storage refers to storage that is unmapped, unclaimed or unassigned…According to [Monosphere], between 15 and 40 percent of the capacity in the corporate storage infrastructures that they have inspected with their software can be characterized as dark storage.
Could you be sitting on capacity that you didn’t know you had?
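As a back-of-the-envelope illustration of that definition, here is a minimal sketch that totals up capacity not mapped to any host. The `Lun` record and its fields are hypothetical and invented for this example. They are not StorageHorizon's data model, and a real SRM tool would pull this inventory from the arrays themselves.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Lun:
    """Hypothetical inventory record for one LUN."""
    name: str
    capacity_gb: float
    mapped_host: Optional[str]  # None means the LUN is not mapped to any host


def dark_storage_percent(luns: List[Lun]) -> float:
    """Percent of total capacity sitting on LUNs with no host mapping."""
    total = sum(lun.capacity_gb for lun in luns)
    if total == 0:
        return 0.0
    dark = sum(lun.capacity_gb for lun in luns if lun.mapped_host is None)
    return 100.0 * dark / total


if __name__ == "__main__":
    inventory = [
        Lun("lun-01", 750, "esx1"),
        Lun("lun-02", 250, None),  # carved out but never mapped
    ]
    print(f"{dark_storage_percent(inventory):.1f} percent dark")
```

The made-up inventory above lands inside the 15 to 40 percent range quoted, but only because I picked the numbers that way.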
Monosphere reports that it’s doing one new installation per week and is looking to make that two in the next few months. Among their claimed customer wins are large companies in networking, insurance, automobiles and business outsourcing, though none of those can be publicly named or interviewed at this point. While there have been some SRM products that have caught on – Novus, for example, which was bought by IBM earlier this year – it’s been a tough market for startups. “Nobody’s making any money on SRM right now,” is what Forrester analyst Andrew Reichman tells me, even though his expertise is SRM. When people do buy SRM, it often comes from their storage hardware vendor. It’s still not clear that even the best independent SRM tools will garner much attention from users – we’ll have to watch Monosphere and see.
While it might seem like we’re about to change our site name to SearchWANOptimizationVendorsFighting.com, I assure you it’s just coincidence. Riverbed and its rivals have had a lot to say lately, and that’s at least in part because of greater competition in their market space, though it’s not getting as much play as other hot markets like SaaS and clustered NAS.
According to Forrester analyst Rob Whiteley, with whom I’ve spoken during the whole Riverbed/AutoCAD debacle, the one incontrovertible point that can be taken out of all the back and forth is that users can no longer just evaluate this gear on price. WAN optimization and wide-area data services have become more strategic markets than when they started, and there’s going to be more fine-grained differentiation between products in this space going forward.
Another trend in this market was identified by Blue Coat’s CEO as well as Brocade officials when speaking about the acquisition of Packeteer and Brocade’s discontinuation of WAFS products, respectively. That is that WAN optimization is growing to encompass a number of fields originally thought of as separate disciplines, whether it’s WAFS being combined with network security or TCP/IP acceleration meeting quality of service. As Brocade’s spokesperson put it, “the WAFS and WAN optimization markets are converging and our customers are looking for a much broader set of functionality beyond just WAFS for remote site IT management.”
Sensing this, it would seem, Riverbed has chosen to partner with other companies to expand its capabilities. It disclosed in February that it would be adding the Riverbed Services Platform (RSP) of services for remote offices on its Steelhead appliances, and this week added network optimization services to the RSP platform. Riverbed’s latest additions are what it calls new “visibility partners” to broaden Steelhead’s WAN optimization features. Partners supplying network visibility features such as traffic monitoring, application performance monitoring and policy enforcement include Opnet, CompuWare, NetScout, Solar Winds and Opsware.
The Steelhead central management console (CMC) was also brushed up with the release of version 5.0 this week. Despite the dot-oh, it’s an incremental upgrade with the addition of the ability to create groups of appliances for policy enforcement and more granular access control roles.
Riverbed’s Alan Saldich pointed out that Riverbed’s going the partnering route because customers might already have another product they want to use with Riverbed. This could be seen as a subtle comment on Blue Coat’s plans for Packeteer, which consist of folding Packeteer’s IP into a platform existing Packeteer customers may not be familiar with. Of course, it remains to be seen which approach will win, but this new, wider context for WAN optimization is something users should consider. If they can hear themselves think over all the bickering, that is.
I have yet another story up today about the AutoCAD issue with WAN optimization products. This time, Riverbed did some testing and had Taneja Group validate it. That story is here.
In the meantime, space limitations made it impossible to include the entire response we got from Silver Peak’s director of marketing Jeff Aaron in the story, but here is more of what he had to say:
From: Jeff Aaron
Sent: Thu 5/1/2008 8:38 PM
To: Pariseau, Beth
Subject: RE: Beth Pariseau’s latest article on AutoCAD issue
Hey, Beth. The numbers that Riverbed quotes in their report for Silver Peak don’t jive [sic] with numbers we’ve seen in-house, in the field, and in tests done with AutoDesk. I am not sure if they configured our box incorrectly or if there are some other factors at play. The fact that we show negative data reduction in some examples and that Riverbed comes in first in EVERY single test should be the first indicator that these results are biased and incorrect…
I don’t see much value in going through the results line by line to point out inaccuracies as that will just continue to propagate the “he said” “she said” scenario. Furthermore, lab results are meaningless as there are dozens of variables that affect performance in live networks – including bandwidth, latency, loss, and whatever other applications are sharing the link with AutoCAD. That having been said, we only care about how we perform in real customer networks, and are comfortable that our claims will stand up if any end user decides to put us to the test (just like AutoDesk did). To that end, we encourage anyone concerned about AutoCAD performance to give us a try and see for themselves.
Just wondering – what exactly was Taneja’s role in this? They have never seen our boxes and have no hands-on experience with WAN optimization, so there is no way that they are capable of “verifying” anything about our product. Who confirmed that our boxes were correctly deployed (we certainly didn’t)? Who verified that the same tests were run on each vendors appliances in the exact same environments? … Make no mistake – this is a Riverbed report with jacked-up numbers – this is by no means a valid 3rd party verification.
A little context for the section below: Aaron had also pointed out to me that Riverbed will struggle with more recent versions of Microsoft Excel files, which he says also do some bit-scrambling. Riverbed responded that he was referring to an issue with overlapping opens of Excel files which was fixed a long time ago.
Re. Excel – we provided a “diff file” that illustrates how the bytes are being scrambled from one save to the next (without any modifications). It is clear from that that there is a scrambling issue that is similar to what is happening with AutoCAD. My point is not to dispute what Riverbed can or cannot do wrt to Excel (even though the problem they said they fixed is a completely different issue). My only goal was to point out that it demonstrates a data scrambling problem that is similar to AutoCAD, and that we don’t seem to be affected by it.
That is also why I keep referring to other applications, like Citrix and video streaming. My point is to show that we handle these applications very differently than other dedupe vendors. What are Riverbed’s thoughts on that? That is the bigger story, in my opinion. AutoCad is just the latest symptom of a bigger problem – that there are different application types that fundamentally behave differently across the WAN, and you need the right architecture to address ALL of them…
Thanks for giving me a chance to comment.
I have to say that cloud computing has made the growing IBM/EMC rivalry that much more interesting. EMC threw one of the first punches with the rollout of Fortress and acquisition of Pi–it seems EMC will probably stick to building its own infrastructure rather than partnering. But then IBM went for a partnership with one of the other most recognizable brand names in the world (aside from its own) in Google, which consumers are already comfortable using in the real world. Meanwhile, PiWorx remains in stealth. It will be interesting to see where the next leapfrog move comes from.
Incidentally, how long before Sun acquires Zmanda? They’ve already acquired MySQL, for which Zmanda offers open-source backup, and now they’re buddying up with Amazon, for which Zmanda offers an interface (Amazon still requires you to roll your own GUI or get one from a partner). It could link Sun’s open storage products–via open source software!–to the cloud with Amazon. It would be just one big happy open-source conflagration...I’m still watching for it.
Meanwhile, the other tizzy lighting up my Google Reader is over the lack of a deal between Microsoft and Yahoo. Rob Enderle has an interesting post up on Google’s role in that situation. I’m wondering, as IT and cloud vendors keep pairing up, if we shouldn’t be looking for familiar faces among those next in line to be Yahoo’s dance partner.
Comments by CEO H.K. Desai on QLogic’s earnings call last week raised questions about the future of QLogic’s SANbox 9000 Fibre Channel director switch.
Several times on the company’s earnings conference call, Desai said QLogic was changing the focus of its switching business to edge and blade switches. That would seem to leave out the director switch that QLogic launched in 2006 as a low-cost alternative to directors from Brocade and Cisco.
“We continue to gain traction with our Fibre Channel edge and blade switches, which is our primary area of focus,” Desai said. Later he said QLogic is refocusing its switch investments on InfiniBand and blade switches.
To financial analysts on the call, that meant QLogic would leave the SANbox 9000 behind as it begins rolling out 8-Gbit/s HBAs and switches and starts development on 16-gig technology. SANbox 9000 sales have been hurt by lack of any OEM deals with major storage system vendors such as EMC, IBM, and Hewlett-Packard that sell Brocade and Cisco switches under their brand. QLogic’s SANbox 5000 edge switches already support 8-gig connectivity.
“You indicated that edge and blades switches are your primary focus now in Fibre Channel,” analyst Clay Sumner of FBR Friedman, Billings, Ramsey & Co., Inc. said to Desai on the call. “Just curious, does that mean you no longer expect the tier one [OEM] win for your 8-gig Fibre Channel director?”
“We never give up on anything,” said Desai, refusing to clarify his position on directors for the curious analyst.
Several analysts expect QLogic to dump the directors. “Our checks indicate that going forward QLGC may not invest further in the FC high end Director-switch space but could continue to develop mid-low end FC switches and blade switches,” Pacific Growth analyst Kaushik Roy wrote in a note to clients. Roy told me he doesn’t expect QLogic to do any more development on the SANbox 9000 or build any other directors. In other words, it is getting out of the director switch business.
Not so, says QLogic marketing VP Frank Berry. “The SANbox 9000 lives on!” Berry wrote to me in an e-mail. “We continue to sell it.”
Berry also said the SANbox 9000 will be upgraded with 8-gig blades that can replace the 4-gig blades in there now. What’s changing, he said, is its go-to-market strategy. Instead of its original target of Global 2000 firms, QLogic now sees the director as a small enterprise product.
“We’ve been successful for several years in the SME market with our SANbox 500 line of stackable switches,” Berry wrote. “And we have learned the SME space is where we have been successful selling the SANbox 9000.”
QLogic will make more noise about SME products this summer. Then we’ll see which SANbox it expects SMEs to play in.