It’s been a slow news year for data deduplication. The data reduction technology has yet to make its big splash for primary storage and is taken for granted for backup. But things picked up this week as EMC Data Domain, FalconStor, Hitachi Data Systems and Permabit all either expanded their dedupe products or talked about their plans.
Permabit aims dedupe software at flash arrays
With the adoption rate of dedupe for primary storage slower than anticipated, Permabit this week unveiled Albireo for Flash Technologies, which is really a flashy way of saying it supports solid-state storage with its Albireo Software Development Kit (SDK) and Virtual Data Optimizer (VDO) for Linux.
Permabit does not sell Albireo software directly, but makes its SDK and VDO available for OEM partners.
Permabit founder and CTO Jered Floyd says primary dedupe adoption is slow because the large established storage vendors resist the notion of cutting into disk sales by shrinking data. (The large vendors dispute this, and all either have or are developing some type of dedupe for primary data.) Floyd maintains the benefits of and need for primary dedupe are greater for flash than for disk arrays, and that the startups selling flash systems are more open to incorporating dedupe.
“We believe dedupe will be a basic required feature for any flash platform,” he said. “Permabit makes it so these companies building new flash platforms can easily and rapidly integrate dedupe.”
Does dedupe have to be different for data on flash than on hard disk? Floyd said there are benefits and challenges for dedupe on flash that go beyond dedupe on hard drives. He said dedupe can not only significantly lower the cost per gigabyte of flash, but also improve latency and reliability and avoid wear by reducing the number of writes on a system. Floyd claims Albireo can meet the high demands of flash by handling more than 250,000 IOPS on a single processor core.
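Floyd’s wear argument can be illustrated with a minimal sketch (my own toy code, not Permabit’s Albireo): putting a content-hash index in front of the medium turns duplicate logical writes into index hits, so they never consume a flash program/erase cycle.

```python
import hashlib

class DedupeStore:
    """Toy content-addressed write path: duplicate blocks cost no wear."""

    def __init__(self):
        self.index = {}   # content hash -> location on "flash"
        self.flash = []   # stand-in for the flash medium
        self.writes = 0   # physical writes actually issued

    def write_block(self, data: bytes) -> int:
        key = hashlib.sha256(data).digest()
        if key not in self.index:            # new content: one real write
            self.flash.append(data)
            self.index[key] = len(self.flash) - 1
            self.writes += 1
        return self.index[key]               # duplicate: index hit, no write

store = DedupeStore()
for block in (b"A" * 4096, b"B" * 4096, b"A" * 4096, b"A" * 4096):
    store.write_block(block)
# four logical writes arrive, but only two blocks ever hit the flash
```

Fewer physical writes is exactly the lever Floyd describes: lower cost per usable gigabyte and less wear on the cells.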
Permabit CEO Tom Cook said “a handful” of flash vendors are involved in the early access program for Albireo and he has commitments from a few. He expects to announce deals in the second half of the year.
It will be interesting to see who signs up for Albireo. All-flash startups such as Nimbus Data, GreenBytes, Pure Storage, SolidFire, and XtremIO have dedupe or are promising it for when they begin shipping. Does that mean the market for Albireo is smaller than Permabit anticipates?
“It would be a mistake to assume we’re not working with vendors who have announced dedupe but have not yet delivered,” Floyd said. “Not having dedupe in a flash storage system is going to be a huge liability.”
HDS prepares primary dedupe appliance
Hitachi Data Systems is planning primary data reduction for its newly released Hitachi Unified Storage, as well as a deduplication appliance, according to Fred Oh, HDS’ senior product marketing manager for NAS. He said data reduction for the file portion of the HUS will be available this year and the appliance is expected in the summer. Oh wouldn’t say if HDS is using technology from Permabit, which had an OEM deal with NAS vendor BlueArc before HDS acquired BlueArc.
FalconStor provides inline dedupe option
FalconStor added inline dedupe to its virtual tape library (VTL) product, FalconStor VTL 7.5. FalconStor now supports inline, concurrent and post-processing dedupe as well as its Turbo dedupe option for post-processing.
In the early days of dedupe, the inline versus post-process issue was hotly debated. Inline requires less disk capacity on the back end because it reduces data before moving it to the backup target. Post-processing dedupes at the target, so it requires more capacity but is usually the faster method. Faster processors have alleviated inline dedupe speed concerns, and some of the early post-processing advocates have added an inline option or switched from post-processing to inline.
FalconStor claims its dedupe options are the most flexible.
“We added inline dedupe as a fourth choice,” said Darrell Riddle, FalconStor senior director of product marketing. “We see it as a good fit for smaller systems or systems that need more power up front.”
For a four-node VTL cluster, FalconStor claims its inline dedupe can handle more than 28 TB per hour and post-processing dedupe can back up more than 40 TB per hour.
FalconStor’s concurrent dedupe runs post-process, but does not wait until all backups are completed before deduping on the back end. Riddle said FalconStor VTL customers can also turn off dedupe if they have little or no compressible data.
FalconStor VTL 7.5 software costs from $2,500 to $4,500 per terabyte under management, depending on the configuration.
EMC gives Oracle RMAN a DD Boost
Just because Google Drive is aimed at SMBs and consumers doesn’t mean the cloud storage service will have no impact on enterprises.
Google Drive will almost certainly add to the consumerization of IT that Randy Kerns recently wrote about because it will expand the number of users functioning as their own storage administrators. And the attention it has already sparked will make it more likely that most businesses will at least consider using the cloud for some of their file storage and data protection.
“On the face of it, this topic does not appear to concern the corporate IT manager or CIO, but chances are employees will start using this service to do more than share family photos and recipes,” Ovum principal analyst Richard Edwards wrote in an e-mail about Google Drive’s impact on the enterprise. “Corporate email systems are notorious for their measly storage quotas and message attachment size limitations, and so the sharing and distribution of large corporate files, such as PowerPoint presentations, engineering drawings, and creative content are an obvious use case for Google Drive.”
Edwards said Ovum recommends what he calls “business-grade” cloud collaboration services such as Box and Huddle because of their superior feature management and administration capabilities. Google Drive is seen as a prime competitor to these services as well as other popular file sharing clouds from Citrix, Dropbox, Egnyte, Nomadix, SpiderOak, SugarSync and Syncplicity.
Andres Rodriguez, CEO of cloud NAS vendor Nasuni, said Google Drive can go beyond the file sharing services already on the market because it controls the application stack and a mobile operating system. And while he doesn’t see Google Drive as a competitor to enterprise storage vendors, he does warn that enterprise vendors need to address data on mobile devices in a hurry.
“File storage and synchronization engines are changing storage as we know it,” Rodriguez wrote in an email. “Any large storage vendor that isn’t thinking about how to extend its current data center offerings to mobile is going to be unpleasantly surprised in the next 24 months as more workers shift to accessing data from tablets and smart phones. The pressure on IT is already intense. The control points for much corporate storage today are the Domain Controller (DC) and the CIFS protocol. No one wants to re-architect access control because of mobile users. What we need to figure out is how to extend the access control model we have today to include the new platforms.”
Ranajit Nevatia, VP of marketing for Nasuni rival Panzura, said Google Drive is a long way from becoming an enterprise service because adding features such as a global namespace, file locking and enterprise encryption is “damn hard.” He said there is a big difference between file sharing and project sharing, which is what enterprise storage must support.
“Google Drive, Box, Dropbox, iDrive, these are becoming a dime a dozen now,” Nevatia said. “Everybody’s coming up with file sharing with free amounts of storage associated with them. When you look at the target market and use cases they’re going after, it’s not overlapping with what we’re doing. It will put pressure on consumer level file sharing services, but it’s not meant for large enterprises. Our customers collaborate on projects like architectural engineer design or handle large amounts of research data. We’re not talking about two gigabytes or five gigabytes. We’re talking terabytes of data.”
Tom Gelson, Imation’s director of business development and its cloud strategist, said he has mixed reactions to Google’s entry into cloud storage. Imation’s data protection appliances are used by cloud providers, and Gelson said the vendor plans to launch its own cloud service. Because Imation sells to SMBs, that would make it a Google competitor. But Gelson agrees with Nevatia about the need for security in the cloud.
“Google rubber stamps cloud backup, because everybody knows Google,” he said. “It’s exciting, but we’re all concerned. Imation is focused on SMBs and if you talk to an SMB IT director, the biggest concern is security. That’s Imation’s biggest focus. We want to make sure data is secure once it sits on the cloud.”
Gelson pointed out that Imation acquired three security companies in 2011 – Encryptx, MXI and IronKey. He said Imation encrypts data in flight to the cloud, and also encrypts data on its RDX removable hard drive media.
Ethan Oberman, CEO of online file sharing company SpiderOak, brings up another potential sore spot for Google – privacy. Oberman wonders if Google will try to integrate Google Drive with Google Plus and if it will record users’ activities.
“Google has definitely been one of the more innovative companies since its inception, so the market will have high expectations for how Google Drive might change the way we work within the cloud,” Oberman wrote in an e-mail statement. “There is obviously a very fine line between harvesting consumer data across Google platforms for a ‘richer experience’ versus the potential reality that every step we take on Google’s turf is recorded and analyzed. How Google addresses the 800-pound gorilla knocking on the door – privacy – will define how the company is widely perceived by the public. Google Drive will be a key part of this test.”
Startup Symform has a peer-to-peer cloud storage and backup model that seems a bit whacky at first – its cloud consists of disk space from users’ PCs, servers and NAS devices. But Symform’s execs say they have the security and data distribution figured out, and today they picked up more funding to expand their engineering and sales teams.
Symform is calling it an $11 million B round, but only $8 million is in hand. CEO Matthew Schiltz said he expects the other $3 million to come from a strategic partner. He said he’s still talking to possible partners, but expects the deal to include a business development component as well as funding. When Schiltz was CEO of DocuSign, he secured a business development/strategic funding deal with Salesforce.com.
“We will be doing something similar this year,” he said of his plans for Symform.
Symform president and founder Praerit Garg describes his company’s Global Cloud Storage Network as “a giant RAID system over the Internet. It’s a distributed, decentralized cloud that’s more secure and reliable than a data center. People contribute part of their disks and we aggregate storage across these disks over the internet.”
Garg said the data is encrypted before it leaves customers’ computers. The encrypted data is striped over 96 disks – “we call it RAID 96” – and 32 of the fragments are redundant. That means data can be reconstructed even if up to 32 fragments are lost.
“We encrypt data, chop it up and geo-spread it,” Schiltz said. “We have our own cloud-controlled brain that manages the peer-to-peer network. We don’t have to build a massive data center to store massive amounts of data.”
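Garg’s “RAID 96” is an erasure code: k data fragments plus m redundant fragments, any k of which are enough to rebuild the original. Here is a toy sketch of that idea (my own illustration using polynomial interpolation over a prime field, shrunk to k = 4 data and m = 2 parity fragments so it runs instantly; Symform’s actual 64-plus-32 implementation is not public):

```python
P = 2**31 - 1  # prime modulus; all arithmetic is over GF(P)

def _interp(points, x0):
    """Evaluate the unique polynomial through `points` at x0 (mod P)."""
    total = 0
    for i, (xi, yi) in enumerate(points):
        num = den = 1
        for j, (xj, _) in enumerate(points):
            if i != j:
                num = num * (x0 - xj) % P
                den = den * (xi - xj) % P
        total = (total + yi * num * pow(den, -1, P)) % P
    return total

def encode(data, m):
    """k data fragments + m parity fragments -> n = k + m fragments.
    Each fragment is an (x, y) point; the data sits at x = 0..k-1."""
    k = len(data)
    base = list(enumerate(data))
    parity = [(x, _interp(base, x)) for x in range(k, k + m)]
    return base + parity

def decode(fragments, k):
    """Rebuild the k data values from ANY k surviving fragments."""
    pts = fragments[:k]
    return [_interp(pts, x) for x in range(k)]

data = [10, 20, 30, 40]                  # k = 4 data fragments
frags = encode(data, 2)                  # plus m = 2 redundant ones
survivors = [frags[1], frags[3], frags[4], frags[5]]  # lose any two
```

Scaled up to Symform’s numbers, the same property means any 64 of the 96 fragments reconstruct the data, so up to 32 peers can be offline or lost.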
Symform software comes preconfigured on QNAP NAS devices, and Schiltz said the startup has about 500 resellers for its cloud. QNAP customers pay $20 per month per bay for the Symform cloud if they contribute space from their device.
Customers who download Symform software on their computers get 10 GB of free cloud storage to begin with. They get another GB free for every GB they give up on their hard drive, up to 200 GB. Beyond 200 GB, Symform charges a subscription fee starting at $3.50 per month for an end-user license and $50 per month for a server license.
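The quota rules above reduce to a one-line formula (the function name, and reading the 200 GB cap as applying to the total, are my interpretation, not Symform’s):

```python
def free_cloud_gb(contributed_gb):
    """10 GB of cloud storage free to start, plus 1 GB per GB of local
    disk contributed, capped at 200 GB before paid tiers kick in."""
    return min(10 + contributed_gb, 200)
```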
Symform also has a partnership with SMB backup software vendor StorageCraft for a Business Continuity Suite service that offers rapid data recovery.
Besides its venture funding – new investor WestRiver Capital led the B round with participation from previous investors OVP and Longworth Venture Partners – Symform also launched an advisory board. The board consists of Quantum CEO Jon Gacek, DocuSign VP of Engineering Grant Peterson, and Dimitris Achlioptas, professor of computer science at the University of California, Santa Cruz.
Symantec today said its sales for last quarter came in below expectations, impacted in part by customers waiting for its Backup Exec refresh. But while its storage management and backup products slumped, CEO Enrique Salem said he expects Symantec’s NetBackup and Backup Exec appliances to help return business to normal.
“While we experienced a pause ahead of our Backup Exec product refresh, we continued to see momentum in our backup appliances,” Salem said during a call to address the earnings shortfall.
Symantec CFO James Beers said the vendor anticipates reporting approximately $1.68 billion in revenue when it officially reports earnings May 2. Its original forecast for the quarter was between $1.72 billion and $1.73 billion. Symantec’s revenue was $1.67 billion for the same quarter last year.
Although overall revenue rose slightly from last year, revenue in the storage and server management group declined approximately 5%, with a drop of approximately 8% in storage management and a decline of approximately 3% in backup and archiving.
Symantec recently upgraded both major backup applications, and now sells all of its backup products on integrated appliances. Salem said he expected “meaningful acceleration” of the appliance business over the next year.
“I expect that with the refresh of NetBackup and Backup Exec, that will return to more normal business,” Salem said. “We have the ability to sell software, the media server and deduplication in one device, and that’s something none of our competitors do.”
Earnings reports over the next few weeks should show if Symantec’s problems were limited to the vendor or industry-wide. Last week, EMC said its backup revenue grew. CommVault, FalconStor and Quantum are expected to report in the first or second week of May.
Last week, EMC CEO Joe Tucci repeated the storage giant’s commitment to all types of flash. During the company’s earnings report, Tucci pointed to products such as EMC’s recently launched PCIe-based solid state VFCache card, 100% flash arrays and hybrid systems consisting of flash and spinning disk. He proclaimed “this category of storage will undoubtedly make up the vast majority for years to come.”
Now it appears that EMC may add one of those product types by acquiring all-flash storage array startup XtremIO. Israeli business newspaper Globes today reports that EMC is discussing a buyout of the Tel Aviv-based startup for $400 million to $450 million.
While EMC can offer its traditional arrays with all solid-state drives (SSDs) in place of hard drives, XtremIO is part of a rapidly growing group of startups that engineered their systems from the ground up to take advantage of flash. The XtremIO Flash Array is still in customer trials. The vendor positions it as a way to lift I/O constraints for applications such as Oracle or SQL databases, ERP systems, and virtual desktop infrastructures or other heavily virtualized environments.
One of XtremIO’s founders, Shuki Bruck, also founded file virtualization vendor Rainfinity and sold it to EMC in 2005.
An EMC-XtremIO acquisition could start off a feeding frenzy for traditional storage vendors looking to accelerate their ability to take all-flash arrays to market. Globes reported NetApp executives have also visited Israel to talk to XtremIO (Wall Street rumors also say NetApp is looking at buying Fusion-io). Other all-flash vendors that might make acquisition targets include Violin Memory, Nimbus Data, SolidFire, Texas Memory Systems, Kaminario, GreenBytes, Pure Storage and Whiptail.
EMC executives today said the price increase for hard drives put into place late last year will continue for most of this year. They also confirmed expectations that a new high-end Symmetrix VMAX storage system and the “Project Thunder” flash caching appliance are coming soon.
Despite seven percent revenue growth to $3.7 billion for information storage products last quarter, EMC CFO Dave Goulden said during the vendor’s earnings call that it struggled to meet demand for high-capacity hard drives. Goulden said the drive shortage caused by last year’s Thailand floods is improving, but EMC will keep its 5 percent to 15 percent price increases in place at least into late 2012.
“There were and still are constraints in nearline drives,” he said. “We got the drives we needed to make our numbers, but nearline drives came in late and we had to do some balancing to meet supply and demand. There will be constraints in certain classes of drives the entire year.”
Goulden said he doesn’t think the drive shortage cost EMC any customers because “everybody’s in the same boat when it comes to drive availability.”
His comments were in line with Seagate’s claims during its earnings call earlier in the week that the shortage has eased for some drive types, but high-capacity nearline drives are still restricted.
Despite its revenue growth last quarter, EMC’s high-end storage declined 10 percent from last year. EMC execs said that was largely due to an unusually strong first quarter in 2011, but EMC CEO Joe Tucci agreed with an analyst who asked if it might also be caused by customers waiting for a VMAX product refresh.
Pointing out that the current VMAX platform launched three years ago, Tucci said, “our customers are expecting a new high-end product. We don’t want to ruin our announcement, but customers expecting that will not be disappointed. It’s coming soon.”
Tucci also said more details on Project Thunder will be disclosed at EMC World next month, and that it will go into beta over the next few months. EMC COO Pat Gelsinger added that he considers the Project Thunder shared storage appliance more lucrative than the VFCache “Project Lightning” host-based PCIe flash card launched earlier this year, because Project Thunder is more in line with EMC’s storage background.
“A Thunder-like appliance is an easier product for the EMC sales force,” he said. “There is a lot of interest for the Thunder appliance in many use cases. We’ve accelerated our internal activities for VFCache, Thunder, the use of MLC [multi-level cell flash], and hybrid arrays. A large majority of the industry will be hybrid arrays for the long term.”
Tucci added that EMC is committed to all types of flash – including solid-state drives (SSDs) in storage arrays, 100 percent flash arrays, and hybrid arrays – as well as Fibre Channel and SATA hard drives. “For sure, information storage is not a one-size-fits-all world,” he said.
Tucci also addressed another favorite EMC topic, the cloud. He said private clouds will be the most popular type of cloud for a long time, but “we believe the world [eventually] is going to be hybrid. Customers are working on virtualizing and private cloudizing tier one applications in significant numbers. That’s where the action is. But when customers get to peak times they’ll push some apps out to the public cloud so they don’t have to buy capacity for peak times.”
Other tidbits from the EMC call:
• Isilon revenue nearly doubled from last year, with the help of a 28 PB purchase from a web company.
• VNX unified storage systems have brought EMC nearly 6,000 new customers since the line launched in early 2011.
• Revenue from midrange products (VNX, Data Domain, Avamar, Isilon) grew 26% year over year.
The National Association of Broadcasters (NAB) conference in Las Vegas this week drew a large number of storage vendors vying for the growing media and entertainment storage market. I’ve attended this conference the last five years, and seen more storage vendors every year. The storage vendors who go to NAB include those well-known in the IT space plus others that specifically focus on media and entertainment.
The target audience is different in the media and entertainment space than in general IT. The backgrounds of the people looking to store media content are different from those in traditional IT and their needs are also different. Their titles do not translate directly to mainstream IT, and they use unique terminology that requires knowledge of their business to really understand.
This poses a challenge for storage vendors. To meet their needs, the vendors must understand these differences and speak their customers’ languages.
They need to understand that the applications that store and retrieve information are also different. The workflow in media and entertainment dictates the types of applications used at various times during production and delivery. Another critical consideration is the need for data interchange, a role still handled by removable media in many cases.
The media and entertainment market generates huge amounts of data that are growing exponentially, as improved camera resolutions produce ever-larger files. Special-purpose systems are used to modify (edit) the data, and workflows involve multiple operations and people. Data requirements change during the workflow process. Storage systems must support high performance for post-production, large numbers of streams for broadcast, and high integrity with large capacity for archiving.
Characteristics such as point-in-time copies that are crucial in traditional IT have only nominal value in media and entertainment. Vendors need to promote the right set of features to reach these companies. Without the correct focus, opportunities are missed and the vendor demonstrates a lack of understanding of the customer needs.
(Randy Kerns is Senior Strategist at Evaluator Group, an IT analyst firm).
Two weeks after optimizing its object-based storage software, Amplidata is making its appliance denser to hold more data and use less power.
The AmpliStor XT Storage System now supports 3 TB SATA drives in its new AS30 module, allowing it to hold 30 TB in a 1U box and scale to 1.2 PB in a rack with 40 modules. The AS30 will eventually replace Amplidata’s AS20, which holds 2 TB drives and 20 TB in one appliance.
Amplidata claims the AS30 uses about 30 percent less power than the AS20, requiring 2.2 watts per terabyte when idle and 3.3 watts per terabyte when in use. That’s about the same power as a 60-watt light bulb for the entire 30 TB module.
“The really big thing is the power consumption is just over 65 watts when powered and idle with no disk activity,” said Paul Speciale, Amplidata’s VP of products. “When there is activity, it consumes 3.3 watts per terabyte. But even at that low power, these systems can do tens of gigabytes per system, so you are not giving up on performance.”
Amplidata’s storage platform is designed for cloud archiving of media and entertainment files, and “big data” file storage. Amplidata sees the media and entertainment industry as a key target for the larger drives.
Randy Kerns, senior strategist at Evaluator Group, said erasure code-based technology becomes more important with higher capacity drives because there is a greater probability of drive failures in the larger drives.
“As you get to higher capacity drives, you have a greater exposure to a second drive failure and rebuild times are longer,” Kerns said. “With that exposure, the probability goes up. Two terabyte drives typically take eight hours to rebuild in a normal system, so it becomes more important when you go to three or four terabyte drives in a multi-petabyte system because you have a higher probability of a problem happening. Media and entertainment is very sensitive to these issues and Amplidata is targeting that market.”
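Kerns’ point can be put in back-of-envelope form. In this sketch the 12-drive group and the 2 percent annualized failure rate are illustrative numbers of my own, not Evaluator Group figures; only the 8-hour rebuild time comes from Kerns’ comment:

```python
def second_failure_risk(drives_in_group, rebuild_hours, afr=0.02):
    """Rough odds that one of the surviving drives fails while the
    rebuild is still running: survivors x window x hourly failure rate."""
    hours_per_year = 24 * 365
    return (drives_in_group - 1) * rebuild_hours * afr / hours_per_year

# A 2 TB drive with an ~8-hour rebuild vs. a 4 TB drive taking ~16 hours,
# in a 12-drive group: the exposure window (and with it the risk) doubles.
risk_2tb = second_failure_risk(12, 8)
risk_4tb = second_failure_risk(12, 16)
```

The absolute numbers are small, but in a multi-petabyte system with thousands of drives the exposure multiplies, which is why erasure codes that survive many simultaneous losses matter more as drives grow.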
Amplidata’s AS30 has a starting price of under $0.60 per gigabyte.
Oracle has introduced StorageTek Tape Analytics, software that monitors and manages the health of StorageTek libraries around the world and proactively captures their performance from a single pane of glass.
The software resides outside the library, on a dedicated server database, and captures library, drive and media health metrics through an out-of-band process so that tape drives are never taken offline to collect the data. The metrics are sent to a central collection point, where the data is analyzed for potential problems that could cause media errors. The software looks for capacity limitations and intrusion entry points, while also providing administrators with recommendations to help prevent data loss. It is built on the Oracle Fusion Middleware code base.
Oracle executives said the StorageTek Tape Analytics does granular drill downs into the health specifics of drives and the media. The software connects to each library through a single Gigabit Ethernet connection. StorageTek tape libraries use the SNMP protocol to pass drive and media health information directly to the analytics software through a dedicated IP port. The software can pull more than 100 attributes from the drives.
“We get all the information directly from the library and it’s done from the control path not the data path,” said Scott Allen, an Oracle senior product manager. “This offers a more secure approach.”
One of the ongoing changes in IT is the transition to IT generalists configuring and managing storage in all but the largest enterprises. This was always common in small enterprises, but now is increasingly the case in the mid-tier enterprise, too. Beyond storage, the IT generalist handles server operating systems, networking, and the virtualization hypervisors.
Another dynamic occurring along these lines is called the consumerization of IT. People who use technology such as smart phones or iPads in their daily lives are becoming administrators at the IT generalist level. The general consumer technology user must:
· Know how to set up accounts and security.
· Understand how to protect data in the cloud.
· Know how to migrate data to a new device.
· Understand file sharing options, such as access to photos on Snapfish.
What has happened here? IT operations have become part of many people’s lives. Most are doing these administrative tasks out of necessity with no training other than some interactive guidance. Some do it incorrectly, some struggle through the administration, and others provide services – in my case, I’m the admin for the PCs, etc. for my daughters.
This shift even changes the way midrange enterprise storage is managed. Element managers (the storage vendor’s storage system management software) must be designed with expectations that an IT generalist will manage the storage environment.
Storage vendors should assume the IT generalist using the element manager has a limited base of storage knowledge. They should expect that no manual will be read, on paper or in electronic form. And when there is a complex set of choices, they should assume the wrong one will be tried first and that corrective action or second chances will be necessary.
This leads to the demand for a new GUI that is highly interactive with icons to demonstrate actions and status. The GUI must seem simple, belying the underlying complexity.
Without a plan or real education, we’ve created a massive unpaid workforce of IT generalists. So where does the next generation of storage administrators come from, if no one is planning for it?
(Randy Kerns is Senior Strategist at Evaluator Group, an IT analyst firm).