TechTarget’s networking reporter Andrew R. Hickey wrote an insightful piece on WAN optimization technologies embedded in Microsoft Windows Vista and Longhorn Server and how they could make separate WAN optimization boxes obsolete:
Vista and Longhorn contain redesigned TCP/IP stacks, quality of service (QoS) facilities, file systems, security systems, and WAN-friendly presentation layers for applications…TCP flow control and error recovery have been improved while remaining compatible with other TCP implementations… Microsoft has enhanced management control over QoS, meaning that network administrators might be willing to trust QoS markings from Windows machines. In addition, the native Windows file-system access protocol, CIFS, has been improved and will work with most existing applications without requiring program changes. Also, remote application delivery systems, like Windows Server Terminal Services or Citrix Presentation Server, will probably have their performance enhanced when applications are rewritten to use Vista’s Windows Presentation Foundation component.
Before you go chucking your Riverbed box out the window, though, there are a few caveats:
Vista’s security improvements interfere with some VPN clients, and certain security options could interfere with existing WAN performance or optimization products unless they’re disabled. Data-reduction compression done by external WAN optimization tools may still be very useful in some situations.
Enterprises should use caution and examine how compatible Microsoft’s technologies will be within their networks, according to Gartner.
“Windows Vista and Longhorn offer the promise of improved networking performance and security,” Gartner stated. “However, the scope and scale of the changes present significant security and compatibility risks. Most enterprises will delay large-scale deployments until after application compatibility has been verified, which Gartner expects to take 12 to 18 months. This will give networking components time to mature…. As a result, the benefits of the new Windows communications stack will not be broadly realized before 2009.”
For now, maybe try tossing a TV off the roof instead.
Unstructured data is… well, it’s unstructured. Unruly, even. Unbearable? Well, maybe that’s a bit too much, but it’s definitely hard to deal with. Word documents, email, images and MP3s, among other types of files, are crowding storage systems at an exponential rate, and classifying this data can be very difficult. Analysts note that 85% of data in a typical enterprise is considered unstructured, and managing all of this data is a growing concern for many enterprises.
In this podcast, Pierre Dorion offers practical answers to the most common questions about unstructured data management he hears from storage pros today.
Download the Unstructured data management FAQ podcast.
It’s not quite a product announcement yet, but it’s worth noting here that the friendly folks at Mimosa say they’re going to use a new $17 million in Series C funding in part to launch two new products before the end of the year. The email archiving company, which claims 160-plus customers, says it’s going to add filesystems to the applications its NearPoint archive will support, which currently include emails, instant messages, and IP-based voicemails. The file archiving will be available this summer, according to TM Ravi, CEO. According to Ravi, SharePoint archiving will follow by the end of the calendar year.
Mimosa has amassed $34.5 million in VC funding since its founding in 2005, and had its first full year of revenue in 2006, but Ravi says he’s already thinking IPO. He was mum on the timing but said “the traditional venture-capital goal is to go public, and that’s our goal as well.” He added, slyly, “But if along the way someone makes an offer we can’t refuse…c’est la vie.”
On March 22, we posted a Q & A with Burton Group analyst Guy Creese about potential “gotchas” for companies considering Google Apps Premier Edition. Recently, we heard back from Google Enterprise product manager Rajen Sheth in response to Creese’s analysis.
Storage Soup: Guy Creese pointed out that Google partners with Postini for email archiving, but doesn’t offer archiving for documents and spreadsheets. Do you plan to offer document archiving as well?
Sheth: I think we actually have a better story than what’s typically done with documents. Within an enterprise, documents are all over the place–they might be sitting on your laptop, they might be sitting on a file share somewhere, they might be in your email. What we provide is one central place that people can access [files]. If I want to search through all the documents I’ve ever made or gotten, I can just do a Google search on Docs and Spreadsheets.
In terms of bringing it out of Google’s repository, there are a couple of things that can be done. Number one, we allow the export of files in multiple file formats, and the second thing is we offer an XML API, which means you can access that data through XML and pull it out to a records management system or any other system, really, that wants to leverage that data.
Storage Soup: Does that mean Google has no plans to get into that kind of records management or archival space?
Sheth: No. We consider ourselves a user collaboration package rather than a records management package, but we want to integrate with records management and email archiving systems people already have.
Storage Soup: What about the point Creese brings up about keyword search vs. some of the records management searches with other products–is that something Google might bring into Docs and Spreadsheets for retrieval?
Sheth: On a separate side of the business we offer clustered searches and taxonomies through Google Search Appliance. There has been some talk about offering that through these products, but what we’ve found with user studies, both with consumers and in the enterprise, is that collaborative tagging of documents for future retrieval is more popular. People rarely do more than type in a couple of keywords–we’re working on a variety of things on our search capability where the front end remains the same, but on the back end we do the heavy lifting to figure out what people want.
Storage Soup: Creese said that “[Software as a service (SaaS)] companies will eventually get to the point where they can’t save everything. Even with storage prices dropping, as more and more corporations put their data into software as a service there’s going to be a tipping point coming, where either it starts to become expensive to save everything for the service and the service therefore raises its rates, or it’s just too difficult to find what’s there.” What’s Google’s response?
Sheth: I actually disagree with that. When we provide 10 GB [in Gmail inboxes] we’re doing that with the knowledge that we can serve that and serve that well at the price point that we’ve offered. There are things that are unique about how Google structures its data centers–we use commodity PC servers within our data centers and many of them to bring down storage costs, rather than large back end storage systems. So as a result of that I think we can serve storage for a much lower cost than most organizations are able to. So when we put out a package like this with a certain storage quota we do it with every intention that users will use that storage to the absolute hilt.
We’re already running millions upon millions of consumers on these services–in terms of scalability, we’ve already scaled the services to support many users. We’ve also already tested it with a large company–ourselves. Our users are very, very heavy email and document users, and we use our own systems, the same ones we have for our customers. We essentially battle-tested it to make sure that it could handle the load.
Storage Soup: Yahoo announced yesterday that they’re offering unlimited storage with their email. Is that where Gmail will go eventually?
Sheth: I think we’d go in a little bit different direction. What we’re focusing on is that when we give you a quota, we want to give you a variety of ways to use that quota, including large attachments. We want to offer a variety of ways to use that Gmail quota appropriately–provide very high quotas and provide more tools to use that storage.
Storage Soup: What’s Google’s response on Gmail’s limits on sending emails out to no more than 500 people per day?
Sheth: The issue that we have is that we want to protect our users from spammers. We don’t want spammers to use Gmail, whether it’s the regular edition or the premier edition. We also don’t want Gmail users receiving spam. That said, there is the need for the ability to deliver to multiple sets of users. A lot of organizations already have these mailing lists set up and can use them [with Gmail]. Over time, we see other parts of our product portfolio as candidates to help solve these types of problems. For example, we have a product called Google Groups, which provides the ability to have larger groups and group page, and it’s something people can use right now, and we consider it a good candidate to bring into the [Google Apps] platform down the road as well. We have groups with thousands and thousands of users.
Storage Soup: Creese also points out that Google lacks an equivalent to PowerPoint in your productivity suite.
Sheth: My response to that is we’re definitely not trying to duplicate Microsoft Office. The way I would think of it is that Office is very well designed for individual productivity–an individual preparing something to present to a group of people. We’re focusing Google Docs and Spreadsheets on collaborative use case scenarios.
Here’s an interesting anecdote from a user requesting more storage at his company and likening the process to buying a car.
At least storage guys don’t wear those nasty shiny suits…
According to the New York Times, Yahoo! will offer subscribers unlimited email storage on their free webmail accounts starting in May. The company currently has a 1 GB mailbox limit; the move was attributed to “explosive growth in the size of attachments as people share ever more photos, music and videos via e-mail,” but we think it might also have something to do with rival Google making more and more noise in the email storage space.
As backup software vendors are discovering, being flexible is the name of the game when it comes to incorporating the management of some of today’s hottest storage technologies – CDP, data classification, data de-duplication or integration with VTLs – into their backup software.
I am also finding that when one tries to get updates from these companies, one needs to exercise some flexibility as well. Though I had indicated in my last blog entry that I planned to cover Symantec in this month’s blog, the two of us could not get our schedules in sync. So instead I spoke to CA and CommVault and plan to cover Symantec’s NetBackup in more detail next time – or so I hope.
This month I began by talking to Kelly Polanski, CommVault’s Director of Product Marketing, and during our conversation she gave me a statistic that set me back. She said that nearly 80% of CommVault Galaxy’s customer base already uses disk as their primary backup target – either in the form of a virtual tape library (VTL) or disk-as-disk.
This stat caught me off-guard since it contradicts what I have heard to date. For example, Bocada, an independent data protection management software product which reports on all major backup software products, recently told me they still typically see 75% of their customers using tape as their primary target for backups.
So, I did some checking to see if CommVault was like Superman in the backup software space or if other backup software vendors were seeing similar increases in their percentages of customers using disk as their primary target for backup.
Neither CA nor EMC could provide any definitive numbers as to what percentage of their customers were using disk as a primary target for backup though both know that their numbers are growing. Symantec had some numbers to share as they had recently completed an internal survey of 200 of their customers and found that 63% of them now use some form of disk-based protection.
On a side note – I do have to congratulate EMC on their strategy of boosting (inflating?) their numbers – devilish though it may be. EMC is finding more of their customers switching to disk, but they conveniently ship NetWorker with their VTLs. How much NetWorker functionality and how many licenses EMC includes with each VTL surely varies by how many billions of TBs of storage the customer buys. But, it should come as no surprise to anyone that backup to disk is escalating in new deployments of NetWorker in EMC customer sites.
Sarcasm aside, the rapidly rising number of users backing up to disk increases the urgency for backup software vendors to integrate the management of each of these different technologies. As time-consuming as it is to log in to manage each CDP, replication and backup product, it is even more difficult to create a consistent set of policies across these products that ensures the level of data protection and recovery matches the application’s requirements.
Of course, the difficulty arises from the fact that each of these different products usually makes its own copies of data, has its own database and is driven by its own policy engine. From a global management perspective, this makes it almost impossible to achieve any consistent method of locating the right copy of data, applying policies centrally or really knowing where anything is.
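To make that problem concrete, here is a minimal sketch of what a single policy check across several products' catalogs might look like. It does not model any vendor's actual software; the `Policy` and `CopyRecord` types, their fields, and the sample data are all hypothetical, invented purely for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Policy:
    """Protection requirement for one application (hypothetical fields)."""
    app: str
    max_rpo_minutes: int  # newest copy must be no older than this
    min_copies: int       # minimum number of protected copies


@dataclass(frozen=True)
class CopyRecord:
    """One copy of data, as reported by a single product's own catalog."""
    product: str          # e.g. "backup", "replication", "cdp"
    app: str
    age_minutes: int      # age of this copy


def check(policies, catalogs):
    """Merge the per-product catalogs and list apps that miss their policy."""
    violations = {}
    for policy in policies:
        # Gather every copy of this app's data, whichever product made it.
        copies = [c for catalog in catalogs for c in catalog if c.app == policy.app]
        problems = []
        if len(copies) < policy.min_copies:
            problems.append(f"only {len(copies)} of {policy.min_copies} copies")
        if not copies or min(c.age_minutes for c in copies) > policy.max_rpo_minutes:
            problems.append(f"no copy newer than {policy.max_rpo_minutes} min")
        if problems:
            violations[policy.app] = problems
    return violations


# Made-up sample data: two products, each with its own separate catalog.
policies = [Policy("mail", 60, 2), Policy("erp", 15, 2)]
backup_catalog = [CopyRecord("backup", "mail", 720), CopyRecord("backup", "erp", 720)]
replication_catalog = [CopyRecord("replication", "mail", 5)]

result = check(policies, [backup_catalog, replication_catalog])
print(result)
```

In this made-up data, the mail application passes only because the check can see both the backup and the replication catalog at once. Neither product could answer the question from its own database alone.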
Both CA and CommVault (I know, it took me a while to get here) address these issues but are taking different paths to do so. This month (March 2007), CA is releasing a service pack (SP) for their BrightStor ARCserve backup software that will more closely tie together their ARCserve and WANSync replication software. This SP provides ARCserve with an interface into the WANSync product and allows ARCserve to back up copies of data that WANSync creates. While a step in the right direction, this is more of a patch job than anything really innovative.
CA’s longer term plan is much more intriguing, if they can pull it off. CA is leveraging its acquisition of MDY Group International and its enterprise records management software (soon to be named CA Records Manager) that they completed in June 2006 to lay the foundation for enterprise-wide policy management for any product database.
According to Kristi Perdue, CA’s Product Marketing Director for Information Management products, the CA Records Manager will provide users a centralized policy engine that they can apply to any vendor’s product data repository. Configured modularly with an open architecture, it permits organizations to use a common set of policies for any vendor’s replication or backup product. (I ought to be in marketing for CA, you think?)
Overall, not a bad idea, but unfortunately at this time it is still vaporware. Even though CA’s Perdue describes CA’s integration efforts as “very aggressive” in this area, I wouldn’t expect to see a product release from CA for at least another year.
Of the two, at least CommVault’s technology is real. All of their replication products – Galaxy, QuickRecovery, and ContinuousDataReplicator – use the same underlying database and share a common set of policies. It even extends to setting policies for performing data archiving and data migrations which is great – assuming you are using their product exclusively on all of your servers.
This is my main concern about CommVault: unless you are exclusively using CommVault’s product, you may still have to bring in something like CA Records Manager to manage CommVault along with all of your other backup software products. But, whether that is a flaw in CommVault’s product design or a larger indicator of how enterprises run their businesses or let their businesses run them is a topic for another day.
Hitachi Data Systems CTO Hu Yoshida has an interesting post up on his blog that predicts storage is headed for a “bust” period in terms of petabytes shipped per year. He’s basing this in part on IDC’s recent numbers which show that server shipments are down thanks to virtualization. Yoshida predicts a similar boom in storage virtualization will improve utilization on storage arrays, staving off additional shipments of disk in the coming years.
In 1999 we had 100% growth during the tail end of the dot com boom and the run up to Y2K. In 2000 and 2001, we saw the rate of capacity growth slow down sharply as the industry went through a period of consolidation after the excesses of the boom and Y2K preparation. I believe we are in a boom cycle now and are headed for another bust.
I believe we are ready for another round of storage consolidations, which will drive the growth rate down below 50%.
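For a rough sense of what the gap between those two growth rates means, here is a quick sketch that compounds capacity shipped per year at 100% and at 50% annual growth. The 1,000 PB starting point and the three-year horizon are made-up assumptions for illustration, not figures from Yoshida or IDC.

```python
def project(start_pb, annual_growth, years):
    """Return capacity shipped per year, compounding at a fixed growth rate."""
    capacities = [start_pb]
    for _ in range(years):
        capacities.append(capacities[-1] * (1 + annual_growth))
    return capacities


# Hypothetical baseline of 1,000 PB shipped in year zero.
boom = project(1000, 1.00, 3)  # 100% annual growth
bust = project(1000, 0.50, 3)  # 50% annual growth

for year, (fast, slow) in enumerate(zip(boom, bust)):
    print(f"year {year}: {fast:,.0f} PB at 100% vs {slow:,.0f} PB at 50%")
```

Under these assumptions capacity still grows every year in the slower case; what changes is the pace, and the gap between the two curves widens quickly.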
He’s got a point: the name of the game in storage currently is consolidation and improved utilization, and it’s clear users are serious about finding ways NOT to throw hardware at a problem. It’s also unusual for a major storage vendor to predict any kind of decline in their market, and Yoshida is among the most knowledgeable names in the business–so we’re paying attention.
But having increased storage efficiency as the acknowledged goal and reaching that goal are two different things, after all…and the cynical side of us is tempted to think this is a rationalization of IDC’s recent storage numbers, which showed Hitachi squarely in fourth place in most external disk categories with growth rates for the fourth quarter of ’06 hovering between 2 and 5% (EMC, IBM and NetApp were all consistently showing double-digit growth rates in these same numbers). And the utilization angle doesn’t necessarily explain why HDS fell 37.9% year over year in storage device management software and 24% annually in the IDC software tracker either.
Curiouser and curiouser.
Another software company riding the Google wave came to our attention this week–Datacatch, an Australian company which already markets an indexing tool for offline media written by Windows clients, including tape, CDs, DVDs, Blu-Ray and HD-DVDs, as well as flash drives.
Datacatch has been marketing its Data Librarian product for $39 to small-office and home-office users, as well as small enterprises. This week, the company announced a new free plugin to Google Desktop that will merge Google Desktop Search with Data Librarian, based on Google’s API for developers.
According to Datacatch’s CEO Lindsay Lyon, the product currently is Windows client-based and can’t be run centrally, but updates are planned for the third quarter of this year that will create a networked-storage version of Data Librarian. That could make it possible for midsize organizations to add Google Desktop to their backup clients, a far less expensive proposition than something like Index Engines’ enterprise-level eDiscovery Appliance, which performs similar searches on offline tape media starting at a cool $50,000.
Granted, Lyon admitted, Data Librarian “is not intended to be the panacea for e-discovery and compliance”; if you have strict regulations you’re better off with an enterprise-class indexing product. But one place Lyon said the product could fit in the enterprise is as a personal assistant to enterprise IT pros managing hundreds of CD-ROMs worth of development software licensing subscriptions, for example.
“A lot of software developers also archive code to DVD in test and development environments,” Lyon said. “Most IT pros have dozens of thumb drives, CDs or DVDs with licenses or installation files on them and other removable media in use that aren’t necessarily a part of the company’s main backup workflow.”
The product can be purchased online at http://www.datacatch.com/purchase.html.
EMC and Microsoft announced a partnership today under which Microsoft will integrate EMC’s Smarts network discovery and modeling software into future versions of Microsoft’s System Center Operations Manager. The companies also said they will be working on common models for networking and storage going forward.
In October 2006, EMC said it would integrate its Documentum enterprise content management system with Microsoft’s Office and SharePoint 2007, SQL Server 2005, and enterprise search offerings.
And prior to that, in January 2006, EMC strengthened its Microsoft competencies by acquiring Internosis Inc, a specialist consulting and service provider for Microsoft shops.