Posted by: Dave Raffo
It’s been a slow news year for data deduplication. The data reduction technology has yet to make its big splash for primary storage and is taken for granted for backup. But things picked up this week as EMC Data Domain, FalconStor, Hitachi Data Systems and Permabit all either expanded their dedupe products or talked about their plans.
Permabit aims dedupe software at flash arrays
With the adoption rate of dedupe for primary storage slower than anticipated, Permabit this week unveiled Albireo for Flash Technologies, which is really a flashy way of saying it supports solid-state storage with its Albireo Software Development Kit (SDK) and Virtual Data Optimizer (VDO) for Linux.
Permabit does not sell Albireo software directly, but makes its SDK and VDO available for OEM partners.
Permabit founder and CTO Jered Floyd says primary dedupe adoption is slow because the large established storage vendors resist the notion of cutting into disk sales by shrinking data. (The large vendors dispute this, and all have or are working on some type of dedupe for primary data.) Floyd maintains the benefits and needs for primary dedupe for flash are greater than for disk arrays, and the startups selling flash systems are more open to incorporating dedupe.
“We believe dedupe will be a basic required feature for any flash platform,” he said. “Permabit makes it so these companies building new flash platforms can easily and rapidly integrate dedupe.”
Does dedupe have to be different for data on flash than on hard disk? Floyd said there are benefits and challenges for dedupe on flash that go beyond dedupe on hard drives. He said dedupe can not only significantly lower the cost per gigabyte of flash but also help improve latency and reliability and avoid wear by reducing the number of writes on a system. Floyd claims Albireo can meet the high demands of flash by handling more than 250,000 IOPS on a single core processor.
Permabit CEO Tom Cook said “a handful” of flash vendors are involved in the early access program for Albireo and he has commitments from a few. He expects to announce deals in the second half of the year.
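To make the write-reduction point concrete, here is a minimal Python sketch of block-level dedupe using content fingerprints. It is a generic illustration, not Albireo's design: the `DedupeStore` class, the 4 KB block size and the choice of SHA-256 are all assumptions made for the example.

```python
import hashlib


class DedupeStore:
    """Toy fixed-size-block dedupe store (hypothetical, for illustration only)."""

    def __init__(self, block_size=4096):
        self.block_size = block_size
        self.index = {}          # fingerprint -> position in self.physical
        self.physical = []       # blocks that were actually written to media
        self.logical_writes = 0  # blocks the host asked us to write

    def write(self, data: bytes):
        """Split data into blocks and store only blocks not seen before."""
        refs = []
        for off in range(0, len(data), self.block_size):
            block = data[off:off + self.block_size]
            self.logical_writes += 1
            fp = hashlib.sha256(block).digest()
            if fp not in self.index:
                # New content: this is the only case that costs a media write.
                self.index[fp] = len(self.physical)
                self.physical.append(block)
            refs.append(self.index[fp])
        return refs

    def read(self, refs):
        """Rebuild the original data from block references."""
        return b"".join(self.physical[r] for r in refs)


store = DedupeStore()
a = bytes([1]) * 4096
b = bytes([2]) * 4096
refs = store.write(a + b + a + a)  # four logical blocks, two distinct
print(store.logical_writes, len(store.physical))
```

In this toy case the host issues four block writes but only two reach the media, which is the mechanism behind the cost-per-gigabyte and wear arguments. A real product also has to handle index size, hash collisions and reference counting, none of which is modeled here.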
It will be interesting to see who signs up for Albireo. All-flash startups such as Nimbus Data, Greenbytes, Pure Storage, SolidFire, and XtremIO have dedupe or are promising it for when they begin shipping. Does that mean the market for Albireo is smaller than Permabit anticipates?
“It would be a mistake to assume we’re not working with vendors who have announced dedupe but have not yet delivered,” Floyd said. “Not having dedupe in a flash storage system is going to be a huge liability.”
HDS prepares primary dedupe appliance
Hitachi Data Systems is planning primary data reduction for its newly released Hitachi Unified Storage, as well as a deduplication appliance, according to Fred Oh, HDS’ senior product marketing manager for NAS. He said data reduction for the file portion of the HUS will be available this year and the appliance is expected in the summer. Oh wouldn’t say if HDS is using technology from Permabit, which had an OEM deal with NAS vendor BlueArc before HDS acquired BlueArc.
FalconStor provides inline dedupe option
FalconStor added inline dedupe to its virtual tape library (VTL) product, FalconStor VTL 7.5. FalconStor now supports inline, concurrent and post-processing dedupe as well as its Turbo dedupe option for post-processing.
In the early days of dedupe, the inline versus post-process issue was hotly debated. Inline requires less disk capacity on the back end because it reduces data before moving it to the backup target. Post-processing dedupes at the target, so it requires more capacity but is usually the faster method. Faster processors have alleviated inline dedupe speed concerns, and some of the early post-processing advocates have added an inline option or switched from post-processing to inline.
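The capacity side of that trade-off can be sketched in a few lines of Python. This is a simplified model, not FalconStor's implementation: the function names are hypothetical, blocks are compared by SHA-256 fingerprint, and throughput is not modeled at all.

```python
import hashlib


def inline_peak_blocks(blocks):
    """Inline: fingerprint each block on arrival, store only unseen ones.

    Returns the peak number of blocks held on the backup target.
    """
    seen = set()
    for block in blocks:
        seen.add(hashlib.sha256(block).digest())
    return len(seen)


def post_process_peak_blocks(blocks):
    """Post-process: land every block first, then dedupe on the target.

    Returns (peak blocks held before dedupe, blocks held after dedupe).
    """
    landed = list(blocks)  # everything is written at full size first
    peak = len(landed)
    unique = {hashlib.sha256(block).digest() for block in landed}
    return peak, len(unique)


backup = [b"a" * 8, b"b" * 8, b"a" * 8, b"a" * 8]
print(inline_peak_blocks(backup))
print(post_process_peak_blocks(backup))
```

Both approaches end up storing the same two unique blocks in this example, but the post-process path briefly needs room for all four, which is why it requires more back-end capacity.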
FalconStor claims its dedupe options are the most flexible.
“We added inline dedupe as a fourth choice,” said Darrell Riddle, FalconStor senior director of product marketing. “We see it as a good fit for smaller systems or systems that need more power up front.”
For a four-node VTL cluster, FalconStor claims its inline dedupe can handle more than 28 TB per hour and post-processing dedupe can back up more than 40 TB per hour.
FalconStor’s concurrent dedupe runs post-process, but does not wait until all backups are completed before deduping on the back end. Riddle said FalconStor VTL customers can also turn off dedupe if they have little or no compressible data.
FalconStor VTL 7.5 software costs from $2,500 to $4,500 per terabyte under management, depending on the configuration.
EMC gives Oracle RMAN a DD Boost