As I continue to delve more deeply into next generation data protection technologies, I continue to talk to users about their experiences. Of these technologies, there are always some that users find more relevant than others and with no technology does that seem more true than with backup deduplication.
Granted, the users I interview for the different columns and articles I write are often supplied by vendors, and vendors are certainly not going to offer up users with failed installs of their products. Users who do agree to interviews also tend to put their best foot forward and cast their experience in the most positive light, since no one wants to go on the record sounding like they made a bad decision. But having worked as an end-user, I can usually tell pretty quickly by the tone and inflection of a user’s voice how much of their experience is genuine and how much is contrived.
And what I am hearing – and maybe more importantly sensing – from those employing deduplication is that it is working as well as vendors advertised – at least in SMB and remote office environments – and in some cases, maybe better. Too often I find vendors exaggerate the benefits of specific technologies but in talking to users employing deduplication I don’t sense that is happening here.
The users I talk to about their deduplication deployments, whether built on backup software or VTL products, seem genuinely content. While admittedly everyone has had some issues, none appear beyond the scope of deploying any new product, and they certainly pale in comparison to the ones users encountered on a daily and weekly basis with their previous tape-based approaches.
Most users simply sound relieved that they have had success in dealing with their daily backups and can now finally begin to turn their focus to more important strategic initiatives like offsite disaster recovery and performing tests to ensure they can recover their data. And while the user experiences and emotions I am discussing here certainly shouldn’t send anyone straight out to buy a backup deduplication product, I think they certainly merit taking a closer look at this class of technology.
It’s a dirty job working in this kind of environment,
but somebody’s gotta do it…
Propellerheads and communication
Among the sessions at a new professional development track being tried out at SNW this year was a talk entitled “Interpersonal Communication Skills for Propellerheads” led by Deborah Johnson, CEO of Infinity I/O. As far as we could tell, said propellerheads, some 30 in all, didn’t object to the moniker.
Johnson talked to attendees about dealing with other humans in the attendees’ own native language, i.e. technical jargon, telling her audience that interpersonal communication requires the befuddled propellerhead to “assess the ‘map’ of the person you’re talking to,” similarly to how they’d map a drive, we surmise. “Understand your goal and the audience context to select the right channel for your communication,” Johnson told the group, approximately one-third of whom were thumbing away busily on Blackberries or typing on laptops.
“Sometimes it takes more than one communication event to get your message across,” Johnson continued, further encouraging attendees to “ask questions to ensure you are decoding messages correctly [from others].”
So was it useful? “I do need to work on my communications skills,” said one self-professed propellerhead, adding that he has recently begun to cut his emails down from several pages to a strict one-page limit. Oy vey!
Deep dive on dedupe
An early session on deduplication turned into a standing-room-only event, with Curtis Preston, VP of data protection services at GlassHouse, a.k.a. Mr. Backup, holding court on the topic. He went through the different products, in-line versus post-process and the different schemes for identifying redundant data. But two points really stood out. First, data deduplication products are currently only appropriate for small to medium-sized environments, he said. “Do not take a 100 TB Oracle database and throw it at a data dedupe.” Second, ignore all the claims about deduplication ratios. “Your data will dedupe very differently to the guy next to you,” he said.
Talking to a couple of users after the session, one thought dedupe could go the way of CDP. “It’s the big topic this year but we’ll see if it’s still around next year,” he said. Another user, with 800 TB to deal with, said the economics were too good for this technology to be a flash in the pan, if, and that was a big if in his mind, the products are robust and scalable enough. He noted that FalconStor’s SIR (single instance repository) doesn’t ship in volume for another couple of months, so it’s still very early days for this technology.
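Preston’s warning about ratios is easy to see with a toy example. The sketch below is ours, not anything shown in the session: it uses fixed-size chunks and SHA-256 fingerprints, which is only one of the possible schemes for identifying redundant data, and the chunk size, function names and synthetic data are all assumptions made for illustration. Shipping products are far more sophisticated.

```python
import hashlib

CHUNK_SIZE = 4096  # hypothetical fixed chunk size in bytes


def synthetic_bytes(n_bytes: int) -> bytes:
    """Build deterministic, non-repeating filler data from SHA-256 digests."""
    digests_needed = -(-n_bytes // 32)  # ceiling division; each digest is 32 bytes
    data = b"".join(
        hashlib.sha256(i.to_bytes(8, "big")).digest() for i in range(digests_needed)
    )
    return data[:n_bytes]


def dedupe_ratio(data: bytes, chunk_size: int = CHUNK_SIZE) -> float:
    """Return logical size divided by stored size under fixed-size chunk hashing."""
    seen = set()   # fingerprints of chunks already in the toy store
    stored = 0     # bytes that actually had to be written
    for offset in range(0, len(data), chunk_size):
        chunk = data[offset:offset + chunk_size]
        fingerprint = hashlib.sha256(chunk).digest()
        if fingerprint not in seen:
            seen.add(fingerprint)
            stored += len(chunk)
    return len(data) / stored if stored else 1.0


# Case 1: ten "nightly fulls" of the same unchanged 64 KiB file.
one_full = synthetic_bytes(64 * 1024)
repeated_fulls = one_full * 10

# Case 2: the same total volume, but with no repeated chunks at all.
unique_data = synthetic_bytes(640 * 1024)

ratio_repeated = dedupe_ratio(repeated_fulls)
ratio_unique = dedupe_ratio(unique_data)

print(f"ten identical fulls: {ratio_repeated:.1f}:1")
print(f"all-unique data:     {ratio_unique:.1f}:1")
```

The same code and the same volume of data give a 10:1 ratio in the first case and no savings at all in the second. Real backup streams land somewhere in between, which is why a vendor’s headline ratio says little about what your own data will do.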
Users feeling out file virtualization
Comcast Media Center manager of server and storage operations Paul Yarborough gave a talk Monday afternoon on his company’s decision to virtualize NetApp 3020 filers with Acopia Networks’ ARX switch. Another presenter on file virtualization, Stephen Warner of Quest Diagnostics, has also deployed Acopia, to virtualize EMC Celerra boxes. Some tidbits that arose out of the presentations:
Yarborough’s company was so strapped for space on the NetApp filers due to the 16 TB filesystem restriction that they were spending dozens of man-hours on a regular basis reingesting digital content that had been deleted from overutilized disk.
Meanwhile, Yarborough said he evaluated NetApp’s OnTap GX as a means to solve the filesystem limit, but he remained pointedly noncommittal on his findings, saying only that it was a very new product when he evaluated it.
Warner, who heads up an EMC shop, said he believes that truly vendor-agnostic virtualization will not come from a large vendor. In Acopia, he said, his company found a startup it could influence (of course, it helps to have just under a petabyte of data under management if you’re looking to influence other companies).
Other users’ questions during Yarborough’s session were as interesting as the presentation itself. During the Q&A users peppered Yarborough with questions about performance impact, how much training it had required to get his staff up to speed on the Acopia product, and whether or not Acopia was truly effective in virtualizing Windows and Linux systems equally. Yarborough answered that there had been no performance impact that he’d seen, that training on the Acopia switch had taken a little longer since it operates on a switch and his admins are not used to managing switches, and that yes, Acopia is effective in virtualizing heterogeneous OSes.
Another user questioned the fact that Comcast had installed an Acopia agent on its domain controller. “That would never fly in our environment,” the commenter said.
First Intel. Now the White House. We’d love to know what, if any, email archiving products these guys are using. And we don’t know what’s worse–if they have implemented archiving procedures, or if they haven’t.
Two of the storage industry’s most prolific (and diametrically opposed) bloggers have posted–as one would expect–lengthy and diametrically opposed blog posts about the announcement this week that storage vendors are proposing a new Fibre Channel over Ethernet standard (FCoE).
Chuck Hollis of EMC writes that Fibre Channel over Ethernet will rectify ongoing concerns with iSCSI related to (you guessed it) reliability and performance. (Hollis also wrote a post a while back questioning whether iSCSI is really going anywhere.)
Meanwhile, on Drunken Data, a blog belonging to DR expert and maverick analyst Jon Toigo, a commenter dismisses FCoE as pure marketing fluff from FC vendors desperate to hang on to market share. “In my view,” the commenter writes, “FCoE [follows] the rule, ‘If you can’t beat them, join them; if you can’t join them, confuse them.'”
You couldn’t find two more different viewpoints–neither writer even shares the same basic premise about the status of FC in the market in general. As always, reality is probably somewhere in between.
FUD reigns supreme when it comes to the various new standards being proposed for Fibre Channel these days–those not participating or opposed to the standards efforts like to paint them as last-ditch ploys by FC vendors to retain market share. Supporters of the standards, however, make compelling arguments for a future of converged networks and multiprotocol systems for all. Unfortunately, aside from one analyst (Brian Garrett of ESG, quoted in our news story on the new FC-SATA spec being touted by Emulex), all the supporters of the specs we’ve talked to so far are Fibre Channel vendors.
The bottom line: the proof is in the pudding. We’re not ready to declare the standards unadulterated marketing fluff, but we’re not seeing multi-lateral industry support for any of them, and we’re certainly not seeing any storage end-users, integrators, resellers or consultants–in other words, anyone who ever actually works with storage products–being asked for their opinion by any of the standards committees.
Like many in the storage industry, I keep wanting to declare tape is dead, or at least on its last legs, when it comes to data protection. I surely thought the triple whammy of emerging technologies like deduplication, asynchronous replication and removable disk cartridges was going to finally drive the nail through the heart of tape.
I couldn’t have been more wrong. Last week, I had an insightful conversation with the president of a records management firm based in New Jersey which has, for years, serviced high-end customers like financial services and pharmaceutical firms in the Northeast. The firm also anticipated more of its clients replicating data to it and, for years, reserved space on its floor for disk for this purpose. Instead, it is installing more racks to hold tape and doesn’t foresee disk happening at all, or at least not nearly on the scale it foresaw years ago.
The firm is finding that even though clients want to replicate data, or already do, those on the high end of the spectrum also want control of the data in its usable format. Once the data is replicated, users spin it off to tape and store it with the firm for long-term archival and for data recovery under potentially catastrophic circumstances. They want the records management company to know as little as possible about the data they are sending, and they want the data stored in a format inaccessible to anyone but the user. This is a trend that bears watching in the tape market, for even as disk cartridge, deduplication and replication technologies take off, new reasons to keep data on tape are emerging.
FAN, or File Area Network, is the latest buzzword for file virtualization, coined by Brad O’Neill, senior analyst at The Taneja Group. (He has some big clients in this space.) In June 2006, O’Neill reported that a Taneja Group survey of global IT decision makers found that 62% of respondents now identify “file management” as either “the top priority” or “one of their top priorities” requiring immediate attention in their data centers. Meanwhile, Tony Asaro, senior analyst at the Enterprise Strategy Group, has just posted an interesting blog on the FAN market, or rather what he sees as the lack of one. So which is it? Where have all the FANs gone?
We received the following comment from a VAR based in Florida on our piece covering the newly proposed Fibre Channel over Ethernet (FCoE) standard:
The concept of FC over Ethernet has very limited value. According to this article, the FCoE consortium is targeting this at low-to-mid range servers, over 10 GbE, and as a convergence technology. While 10 GbE makes sense from the storage array to the switch, it makes little to no sense from the server to the switch for two reasons: first, low-to-mid range servers, where this is targeted, don’t have the I/O requirements to saturate 1 GbE much less 10 GbE, and their PCI busses would not be able to handle anywhere near 10 GbE throughput (do the math), and second, the reason that FCP exists is not because of throughput, but deterministic response time, which is guaranteed by the FC protocol whereas the Ethernet protocol becomes more non-deterministic with high load. This lack of deterministic response time will not be fixed with FCoE.
For these reasons, FCoE to the server does not make sense on the low/mid (don’t need the throughput and couldn’t handle it anyway) or the high end (Ethernet lacks the predictable response time of FCP). So back to the question of FCoE over 10 GbE from the storage array to the switch — if the storage arrays were 10GbE capable, why not just use iSCSI, which is already supported and in wide use in the enterprise despite what some manufacturers’ marketing and media reports say? My personal opinion is that this is an effort on the part of manufacturers who are behind in iSCSI to change the game in an effort to compete, and provides little to no value to consumers.
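For readers who want to take the commenter up on “do the math,” here is a rough back-of-the-envelope sketch of our own. It compares theoretical peak bus bandwidth with raw Ethernet line rate. The commenter does not say which buses he has in mind, so the list below is our assumption, and the figures are nominal peaks that ignore protocol overhead and contention.

```python
# Nominal peak bandwidth in MB/s. Which of these buses a given
# low-to-mid range server actually has is an assumption on our part.
BUS_PEAK_MB_PER_S = {
    "PCI 32-bit/33 MHz": 133,
    "PCI-X 64-bit/133 MHz": 1066,
    "PCIe 1.x x4 (per direction)": 1000,
    "PCIe 1.x x8 (per direction)": 2000,
}


def line_rate_mb_per_s(gigabits_per_s: float) -> float:
    """Convert a raw line rate in Gbit/s to MB/s (decimal units, no overhead)."""
    return gigabits_per_s * 1000 / 8


def can_carry(bus_mb_per_s: float, link_mb_per_s: float) -> bool:
    """True if the bus's nominal peak is at least the link's raw line rate."""
    return bus_mb_per_s >= link_mb_per_s


ONE_GBE = line_rate_mb_per_s(1)
TEN_GBE = line_rate_mb_per_s(10)

for name, peak in BUS_PEAK_MB_PER_S.items():
    print(
        f"{name:30s} peak {peak:5d} MB/s | "
        f"fills 1 GbE: {can_carry(peak, ONE_GBE)} | "
        f"fills 10 GbE: {can_carry(peak, TEN_GBE)}"
    )
```

On these nominal numbers the commenter’s point holds for classic PCI and PCI-X, which top out below the 1,250 MB/s that 10 GbE can deliver, while a wide enough PCI Express slot clears the bar. None of this touches his second argument about deterministic response time, which is a protocol question rather than a bandwidth one.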
We have the feeling this could develop into an interesting discussion in the industry over the next year or so as FC and iSCSI, originally at odds in the market, have increasingly been combined in tiered storage environments and multiprotocol systems. Still, combining the two protocols–especially in the same data stream–could become a thorny issue.
What do you think?
TechTarget’s networking reporter Andrew R. Hickey wrote an insightful piece on WAN optimization technologies embedded in Microsoft Windows Vista and Longhorn Server and how they could make separate WAN optimization boxes obsolete:
Vista and Longhorn contain redesigned TCP/IP stacks, quality of service (QoS) facilities, file systems, security systems, and WAN-friendly presentation layers for applications…TCP flow control and error recovery have been improved while remaining compatible with other TCP implementations… Microsoft has enhanced management control over QoS, meaning that network administrators might be willing to trust QoS markings from Windows machines. In addition, the native Windows file-system access protocol, CIFS, has been improved and will work with most existing applications without requiring program changes. Also, remote application delivery systems, like Windows Server Terminal Services or Citrix Presentation Server, will probably have their performance enhanced when applications are rewritten to use Vista’s Windows Presentation Foundation component.
Before you go chucking your Riverbed box out the window, though, there are a few caveats:
Vista’s security improvements interfere with some VPN clients, and certain security options could interfere with existing WAN performance or optimization products unless they’re disabled. Data-reduction compression done by external WAN optimization tools may still be very useful in some situations.
Enterprises should use caution and examine how compatible Microsoft’s technologies will be within their networks, according to Gartner.
“Windows Vista and Longhorn offer the promise of improved networking performance and security,” Gartner stated. “However, the scope and scale of the changes present significant security and compatibility risks. Most enterprises will delay large-scale deployments until after application compatibility has been verified, which Gartner expects to take 12 to 18 months. This will give networking components time to mature…. As a result, the benefits of the new Windows communications stack will not be broadly realized before 2009.”
For now, maybe try tossing a TV off the roof instead.
Unstructured data is… well, it’s unstructured. Unruly, even. Unbearable? Well, maybe that’s a bit too much, but it’s definitely hard to deal with. Word documents, email, images and MP3s, among other types of files, are crowding storage systems at an exponential rate and classifying this data can be very difficult. Analysts note that 85% of data in a typical enterprise is considered unstructured, and managing all of this data is a growing concern for many enterprises.
In this podcast, Pierre Dorion offers practical answers to the most common questions about unstructured data management he hears from storage pros today.
Download the Unstructured data management FAQ podcast.
It’s not quite a product announcement yet, but it’s worth noting here that the friendly folks at Mimosa say they’re going to use a new $17 million in Series C funding in part to launch two new products before the end of the year. The email archiving company, which claims 160-plus customers, says it’s going to add filesystems to the applications its NearPoint archive will support, which currently include emails, instant messages, and IP-based voicemails. The file archiving will be available this summer, according to TM Ravi, CEO. According to Ravi, SharePoint archiving will follow by the end of the calendar year.
Mimosa has amassed $34.5 million in VC funding since its founding in 2005, and had its first full year of revenue in 2006, but Ravi says he’s already thinking IPO. He was mum on the timing but said “the traditional venture-capital goal is to go public, and that’s our goal as well.” He added, slyly, “But if along the way someone makes an offer we can’t refuse…c’est la vie.”