Compression of data on primary storage has taken center stage in the storage wars with IBM’s release of Real-Time Compression on the Storwize V7000 and the SAN Volume Controller.
Although not the first product to offer data reduction in primary storage, IBM raised the bar by doing compression inline (real-time) and without performance impact. Other solutions in the open systems storage area primarily compress data and sometimes dedupe it as a post-processing task after the data has been written.
Competition for storage business is intense, and inline compression of data for primary storage will be a major competitive area because of the economic value it brings customers. If the compression can effectively reduce the amount of data stored, the reduction amount serves as a multiplier to the amount of capacity that was purchased.
IBM claims a 5x capacity improvement, which gives customers five times as much capacity as they pay for. Even if IBM’s compression comes in at 2x, that would still be significant savings despite an additional license fee for the feature.
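The multiplier arithmetic can be sketched in a few lines of Python. This is an illustration only: the 100 TB purchase size and the function name are hypothetical, and only the 5x and 2x ratios come from the claims discussed above.

```python
def effective_capacity_tb(purchased_tb: float, compression_ratio: float) -> float:
    """Return the effective capacity when data compresses at the given ratio (5.0 means 5x)."""
    if compression_ratio <= 0:
        raise ValueError("compression ratio must be positive")
    return purchased_tb * compression_ratio


purchased_tb = 100.0  # hypothetical purchase size, not a figure from IBM
for ratio in (5.0, 2.0):
    effective = effective_capacity_tb(purchased_tb, ratio)
    print(f"{ratio:g}x -> {effective:g} TB effective")
```

Whether the extra capacity outweighs the license fee depends on pricing that is not given here, so the sketch leaves cost out.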
Doing compression with no performance impact means the compression is transparent to the application and server operating system. The customer gets increased capacity benefits without having to make an accommodation such as installing another driver or version of an application. The effective compression rate will vary with data types, but data compression has a long history, and the rates achievable for common data types are well understood. Vendors usually publish an expected average and sometimes offer a guarantee associated with the purchase.
Compression of real-time data in the mainframe world goes back to the StorageTek Iceberg (later offered as the IBM Ramac Virtual Array) that compressed mainframe count-key-data in the 1990s. That system compressed data at the channel interface and then stored the compressed information on disk.
The use of the Log Structured File system and the intelligence in the embedded storage software allowed the system to manage the variable amount of compressed data (done on a per-track level), and removed the direct mapping to a physical location. That was an effective compression implementation and demonstrated the effect that compression multiplies the actual capacity.
One of the more significant aspects of compressing data at the interface level was the effect that had on the rest of the system resources. With data that was reduced by something like 5x or 6x, the other resources in the system benefited.
• The cache capacity was effectively multiplied by that same amount, allowing for more data to be resident in cache giving higher hit ratios on reads and greater opportunity for write coalescing.
• The interface to the device had the data transfer bandwidth effectively multiplied for much faster transfer of data from the disk drive buffers.
• The disk devices, while storing more data, also would transfer more data over a period of time to the disk buffers and the controller.
Benefits similar to those gained by the StorageTek implementation can be achieved in new systems targeted for primary storage in open systems.
In the case of the StorageTek system, the compression was a hardware-intensive implementation on the channel interface card. With IBM’s Storwize V7000 and SVC, the implementation is done in software, capitalizing on the multi-core processors available in the storage systems. Faster processors with more cores in succeeding generations should provide additional improvement. Because data stays compressed in cache, on the device-level interface and on the devices themselves, the performance gains there offset the time spent in the compression algorithm.
There are other potential areas where transparent compression could be done. Compressing the data in the device such as in the controller for solid state technology is another option.
Customers will benefit from reduction of data actually stored and the inline compression of data that is transparent to operations. The benefits are in the economics and this will be a competitive area for vendors.
There will be a considerable number of claims regarding implementations until this becomes a standard capability across storage systems from a majority of vendors. You can expect a rush to bring competitive solutions to market.
(Randy Kerns is Senior Strategist at Evaluator Group, an IT analyst firm).
EMC’s backup and recovery team says Hewlett-Packard is playing games with its numbers in claiming its B6200 backup system with StoreOnce Catalyst software is significantly faster than EMC Data Domain arrays with DD Boost.
HP said its StoreOnce B6200 disk target with Catalyst can ingest data at 100 TB per hour with the maximum of four two-node pairs, compared to EMC’s claim of 31 TB/hour with its new Data Domain DD990 with DD Boost. However, the B6200’s nodes are siloed. That means an eight-node system actually consists of four separate pools, and the 100 TB/hour figure is the aggregate performance of all four.
In an email, an EMC backup/recovery spokesman pointed out the DD990 would achieve 620 TB/hour if measured the same way that HP measures performance. EMC’s 31 TB/hour claim is for a single storage pool, but 20 pools can be managed from one Data Domain Enterprise Manager console.
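The two ways of counting can be put side by side with a rough Python sketch. The function names are hypothetical, and the per-pool HP figure assumes the four pools contribute equally to the 100 TB/hour total, which HP has not stated.

```python
def per_pool_rate(aggregate_tb_per_hour: float, pools: int) -> float:
    """Split an aggregate ingest rate evenly across separate pools (assumes equal pools)."""
    return aggregate_tb_per_hour / pools


def aggregate_rate(per_pool_tb_per_hour: float, pools: int) -> float:
    """Sum the ingest rate of identical, independent pools."""
    return per_pool_tb_per_hour * pools


# HP B6200: 100 TB/hour quoted across four siloed two-node pairs
hp_per_pool = per_pool_rate(100, 4)

# EMC DD990: 31 TB/hour for a single pool, 20 pools under one management console
emc_aggregate = aggregate_rate(31, 20)

print(f"HP per pool: {hp_per_pool:g} TB/hour, EMC aggregate: {emc_aggregate:g} TB/hour")
```

Measured per pool, the comparison is roughly 25 TB/hour against 31 TB/hour; measured in aggregate, it is 100 TB/hour against 620 TB/hour.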
According to EMC’s e-mail, “As lofty as they sometimes seem, we do make a concerted effort to keep our performance claims reasonable and defensible. This announcement by HP was, by contrast, very much a smoke and mirrors effort.”
The truth is that all vendor performance claims – including benchmarks – should be taken with a grain of salt because they are achieved in optimal conditions, often with hardware configurations that would bring the price up considerably. A smart backup admin knows that performance will vary, and these vendor claims need to be verified in real-world tests.
For all of its talk about smart storage this week at IBM Edge, Big Blue’s storage announcements amounted to mostly cosmetic changes. The lone exception was the addition of real-time inline compression for primary storage arrays.
IBM ported the Random Access Compression Engine (RACE) technology acquired from Storwize in 2010 into its Storwize V7000 and SAN Volume Controller (SVC) virtual storage arrays. This is IBM’s first integration of the compression technology into SAN arrays.
Until now, IBM used the technology only in its Real-Time Compression Appliances, which were re-branded boxes that Storwize sold before the acquisition. Even the Storwize V7000 launched in late 2010 lacked compression, despite its name.
Now IBM is claiming it can compress active primary data with no performance impact on SVC and Storwize V7000 storage, and says it can reduce primary data accessed via block-based protocols by up to 80%.
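A data reduction percentage and a compression ratio are two ways of stating the same thing, so “up to 80%” corresponds to “up to 5x.” The conversion is sketched below in Python; the function name is made up for illustration.

```python
def reduction_to_ratio(reduction_percent: float) -> float:
    """Convert a data reduction percentage (80 means 80%) to a compression ratio (5.0 means 5x)."""
    if not 0 <= reduction_percent < 100:
        raise ValueError("reduction must be at least 0 and below 100 percent")
    return 100 / (100 - reduction_percent)


print(f"{reduction_to_ratio(80):g}x")  # 80% less data stored is a 5x ratio
```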
It turns out that integrating data reduction into primary storage isn’t easy. Dell bought primary deduplication startup Ocarina around the same time that IBM picked up Storwize, and has yet to port primary dedupe onto its Compellent or EqualLogic SAN arrays. Dell did launch a backup appliance using Ocarina dedupe in January, and may have a primary data dedupe announcement next week at its Storage Forum.
Other IBM enhancements include support for Fibre Channel over Ethernet (FCoE) and non-disruptive volume moves between I/O groups for SVC and Storwize V7000, and four-way clustering for Storwize V7000.
IBM added thin provisioning and Enhanced FlashCopy (which allows for more snapshots) for DS3500 and DCS3700 midrange arrays, and a new web-based UI for the IBM Tivoli Storage Productivity Center (TPC) suite. For tape management, it added IBM Tape System Library Manager (TSLM) software that helps manage multiple libraries, and an IBM Linear Tape File System (LTFS) Storage Manager for customers using LTO-5 tape libraries and IBM’s LTFS Library Edition.
IBM also said it plans to extend its Easy Tier automated tiering software to direct attached server-based solid-state drives (SSDs) so customers can migrate data between disk systems and servers.
After seven years of partnering with Montreal-based Watch4net, EMC this week bought the software company to bolster its IT infrastructure management capabilities.
Watch4net describes its APG software as “a carrier-class performance management application that provides real-time, historical and projected visibility into the performance of the network, data centers and cloud infrastructures.”
EMC resold Watch4net software, and the software is already integrated into the EMC IT Operations Intelligence (ITOI) Suite. ITOI provides availability management, correlation and root-cause analysis for storage, networks and compute resources. Watch4net merges performance metrics from ITOI into custom reports and provides ITOI with alert information when performance thresholds are exceeded.
The acquisition gives EMC greater control over Watch4net’s intellectual property. Watch4net CEO Michel Foix and most of the company’s 70 employees will join EMC as part of its Infrastructure Management Group. EMC considers an expansion of its infrastructure management software a key part of its move to provide cloud services.
Flash storage vendor GreenBytes closed a $12 million funding round this week, led by Al Gore’s venture capital firm.
GreenBytes sells two platforms of hybrid arrays combining solid-state drives with hard drives, and in February launched Solidarity, an all-SSD array with a starting price of under $100,000 for 13.4 TB. The startup said it will use its second funding round to expand sales, marketing and channel development.
Generation Investment Management LLP, co-founded by Al Gore in 2004, led the round with a contribution from Battery Ventures and GreenBytes management. Former U.S. vice president Gore is chairman of Generation Investment, which claims to make investments based on a company’s economic, environmental, social and governance sustainability factors.
That means it’s probably more interested in the green than the bytes with its new investment. Flash vendors say they are green because their systems have a smaller footprint and use less power than spinning disk storage.
Regardless of their environmental impact, flash array vendors are pulling in the green. EMC reportedly paid $430 million for XtremIO this month, and Violin Memory, Whiptail and Starboard Storage have closed funding rounds this year.
Last week’s EMC World would have to be viewed as a major success for EMC. Something around 20,000 people attended, counting EMC employees: customers, press, analysts, resellers and even other vendors. The crowd was so large that getting through the corridors of the Venetian was a moving body rub.
The access to EMC executives and staff was a real credit to the event. They provided information and fielded questions and did not just “make an appearance” and then bolt. EMC technical people were available as well, and many attendees took advantage to ask about usage in specific environments. For analysts, there was a program to meet with executives and product owners and find out about directions and new capabilities.
There has certainly been a sea change in storage events over the years. Meeting with vendors and hearing about their products or their latest announcements was done at large, multi-vendor events such as Storage Networking World (SNW) in the past. Now, the information is more available at the vendor events such as EMC World.
Most vendors have also changed their approach to releasing information. The vendor events are when the next generation of products or features are announced, and future capabilities previewed. There was a time when vendors rarely pre-announced products or capabilities. The announcements came when the products were available.
That practice was later relaxed so that announcements coincided with major industry events, covering products or features that would be available within the next quarter. At the major vendor events now, the announcements may be for features or products that will come out over the next three quarters. This is a much longer view. It has positive aspects, such as creating interest and publicity for the vendor, and negative ones, in that it may freeze purchases while customers wait for future releases.
EMC uses its show as what the vendor calls a “mega launch,” with significant announcements or releases involving most of its key products. It creates interest and has turned EMC World into a must-attend event where the amount of information is so great that not attending will leave people feeling they might be missing valuable information. Certainly that is the intention and it works well. The result may be that the industry-wide events such as SNW have less new information and their importance has diminished in the minds of many.
For an analyst, access to the executives and product people, along with the explanations about the announcements, is incredibly valuable. It is also a great chance to catch up with friends that you’ve worked with in the past.
The economics for these mega events are well understood by marketing – they cost a great deal and must result in increased or sustaining revenue to justify the investment. EMC has certainly set the bar for these shows, and we can expect more of the same from their rivals in the storage arms race.
(Randy Kerns is Senior Strategist at Evaluator Group, an IT analyst firm).
Like EMC, NetApp executives say they expect to offer flash in many areas of their storage systems. But NetApp so far only sells flash as a cache in storage arrays (Flash Cache), and solid state drives (SSDs) that can complement hard drives in a hybrid approach.
EMC has a VFCache server-side PCIe card, and its roadmap includes a PCIe-based shared storage appliance and all-flash array — all pieces NetApp lacks. And while EMC predicts a hybrid implementation of SSD and hard drives in the same system will be the most popular, NetApp maintains Flash Cache is the best way to go. NetApp is also more guarded about its flash roadmap than EMC.
“We have consistently said that flash will be deployed throughout the hardware stack, and our flash offerings will be aggressive and multi-faceted,” NetApp CEO Tom Georgens said on the vendor’s earnings call this week. “Our belief is that offering flash primarily in the form of the cache is the most efficient and effective way to deploy this technology in storage arrays, and most of the industry is now following this approach.”
Georgens said all NetApp FAS6000 systems have 500 GB of flash embedded, and most FAS3240 and 3270 arrays ship with flash. “Our R&D pipeline contains projects to further the use of flash on other layers of the stack, and our next release of OnTap [NetApp’s operating system] will contain additional flash-related offerings,” he said.
NetApp executives visited XtremIO’s Israeli headquarters before EMC grabbed the all-flash array startup for $430 million. An analyst on the call suggested EMC outbid NetApp for XtremIO as it did for Data Domain three years ago, but Georgens would not confirm that NetApp wanted XtremIO.
“Obviously, we spend time with a lot of people and visit a lot of people, talk to a lot of people about potential engagements,” he said. “As far as flash goes, I see that as an innovation and a way to promulgate our data management capability. And that’s going to be the key part of our strategy. So I think you’ll see NetApp participating in flash on multiple dimensions, and primarily, it is to expand our data management footprint.”
Hewlett-Packard’s storage team maintains it remains committed to the EVA midrange platform, and continues to upgrade the product. But you have to wonder how long that commitment will remain as the company faces restructuring and EVA sales keep dropping while 3PAR’s soar.
HP CEO Meg Whitman characterized HP’s storage as “a tale of two cities” during the company’s earnings call Wednesday. “3PAR continued to gain strong traction in the marketplace, growing more than 100 percent year-over-year,” she said. “At the same time, tape and EVA are declining. This is an anticipated product transition and we’re effectively managing the shift to our next-generation storage arrays.”
This trend has been ongoing ever since HP acquired 3PAR in late 2010. HP CFO Cathy Lesjak added on the call that 3PAR has replaced EVA as the vendor’s largest revenue storage array. She said the two midrange platforms grew a combined 19% year-over-year compared to one percent overall storage growth. That means without 3PAR, HP’s storage revenue declined.
HP said its StoreOnce deduplication software nearly doubled in revenue, but that product had little revenue a year ago before it was relaunched last November.
LAS VEGAS, Nev. — EMC was stingy with details of its pet flash projects during most of EMC World 2012, but offered glimpses of early versions of “Project X” and “Project Thunder” Tuesday during chief marketing officer Jeremy Burton’s keynote on future technologies.
Burton called up XtremIO’s product manager VP Josh Goldstein for a quick demo of what EMC is calling Project X – XtremIO’s all-flash array. Goldstein showed how the flash array scales out by adding nodes for “unlimited IOPS,” said it takes only 20 seconds to provision 1 PB of data, and claimed a single node can produce 150,000 write IOPS and more than 300,000 read IOPS. He said it can handle a mixed load of reads and writes at 180,000 IOPS. Goldstein said the product is made up of commodity components to keep the price down. EMC execs say Project X is entering pre-beta deployments and will not be generally available until next year.
Project Thunder is an appliance with up to 10 1 TB flash cards and connects to servers through 40-Gigabit Ethernet or InfiniBand switches. Dan Cobb, CTO of EMC’s flash, gave a brief demo on stage and strongly hinted that Thunder will be shipping by the end of the year. “[EMC COO] Pat Gelsinger would like an early Christmas present, and he seems like he’s the kind of guy who gets what he wants,” Cobb said. …
Even with Projects Thunder and X, EMC expects most flash implementations in the near future to be the addition of SSDs to hard drive arrays. Or, as the president of EMC’s unified storage division Rich Napolitano put it, “A little flash goes a long way.”
Napolitano said if a storage array has five percent of flash, that flash can handle 70% to 80% of all IOPS. He also said more than 60% of EMC’s midrange storage systems now include flash, and claimed that all-flash systems are too costly for midrange systems. “Ninety percent of use cases will be covered with a hybrid array,” he said. “The challenge with all-flash arrays is, if they’re not hybrid, they just can’t hit the cost point for the midrange.” …
Syncplicity customers hoping that EMC will improve the online file-sharing service’s enterprise features might get their wish. Jeetu Patel, chief strategy officer for EMC’s information intelligence group, said there are plans to add enterprise features.
“We looked at many products in this space and found that Syncplicity was the best at providing a simpler user experience with the richest set of security, policy and management controls for IT,” Patel said. “It is already enterprise-strength and we have numerous exciting enhancements planned.”
Patel said recent customer surveys indicated more than 80% want an enterprise solution for syncing and sharing files. “It is critical that our solution breaks down the barriers between operating systems, devices, and business apps to allow data to reside everywhere users need it,” he said.
Patel said there are no pricing changes for Syncplicity planned “at this time.” …
EMC CEO Joe Tucci said the vendor will continue to split its product development resources between internal development and acquisitions. He said EMC spends about $2 billion a year each on R&D and M&A.
“We never comment on what technology areas we’re interested in acquiring, that only ends up working against us,” he said during a Monday press conference. “We will do multiple acquisitions a year. XtremIO will not be the last one we’ll do this year.”
He was proved correct on that last claim hours later when EMC announced it acquired Syncplicity. …
Facebook is in the news for other reasons these days, but the social media giant may have set a record for the largest storage purchase in history earlier this year. Computer Weekly’s Jennifer Scott reports Facebook is the customer EMC CFO David Goulden referred to during the company’s earnings call last month when he said a web company bought 28 PB of Isilon storage. “We believe it was the largest capacity single-order in the history of storage,” Goulden said. EMC execs have refused to publicly name the customer. …
A representative for cloud service provider Nirvanix claims EMC’s Atmos upgrades this week show the vendor still doesn’t understand the cloud storage concept. Steve Zivanic, Nirvanix’s VP of marketing, dashed off an email mocking EMC for thinking speed enhancements to its Atmos object-storage platform will help customers who want cloud storage.
Zivanic wrote the Atmos upgrade shows that EMC still views storage as a product instead of a service. “When it’s a storage product, the risk is all on the buyer,” Zivanic wrote. “When it’s a service, the risk is on the seller. Atmos is still a product customers have to buy and manage and operate and maintain, versus a cloud service they can simply access as required.”
Zivanic wrote that customers have to pay up to millions of dollars to build private clouds using EMC technology, and its talk of hybrid clouds is hollow because the vendor has no public cloud of its own. That means it cannot enforce SLAs and security policies between a private and public cloud in a hybrid setup.
“EMC is still stuck in the old paradigm,” he wrote. “Once a customer uploads data to the cloud, it’s the last data migration they’ll ever do. That’s the new paradigm that EMC refuses to fundamentally embrace. It doesn’t matter how much faster Atmos is, EMC is still fighting the disruptive change that cloud storage represents.”
EMC’s new Federated Tiered Storage (FTS) for VMAX arrays allows customers to run other EMC platforms or competing storage systems behind VMAX, much like Hitachi Data Systems (HDS) has virtualized arrays for years behind its Universal Storage Platform (USP) and current Virtual Storage Platform (VSP) systems.
Not surprisingly, EMC claims its virtualization features go beyond those of Hitachi’s, and HDS claims EMC is off base with those claims.
Brian Gallagher, EMC president of enterprise storage, made the case for EMC’s virtualization Monday during the opening of EMC World.
“We’ve extended Symmetrix’s data integrity to non-EMC devices. Hitachi does not do that,” Gallagher said. “Also, our technology is free of charge. You can virtualize any amount of non-EMC storage behind Symmetrix. Hitachi gives you a certain amount of terabytes for free, and then they charge when you go beyond that. Hitachi will also tell you not to use virtualization for databases, we don’t say that. We also extend FAST to other arrays. They [Hitachi] don’t extend auto tiering [to VSP].”
Claus Mikkelsen, chief scientist at HDS, disputed EMC’s points in an e-mail to Storage Soup. He said the HDS VSP supports “full inheritance” on externally virtualized storage devices from HDS or other vendors.
As for EMC’s free claim, Mikkelsen said “EMC states that software license enablement of FTS is a no-charge feature to customers, but fails to mention the future impact on software maintenance costs for the FTS license and any other EMC software license that charges maintenance based on installed capacity. With the Switch It On program from HDS, virtualization is free and third-party capacity is deeply discounted.”
Mikkelsen said HDS offers “prudent” advice on using virtualization with databases.
“We state that Hitachi Dynamic Tiering on VSP will intelligently place data pages based on an application’s I/O access pattern,” he said. “Additionally, we recommend that customers do not immediately place OLTP database environments with high I/O transaction rates and low average response time requirements as externally virtualized storage. This is prudent advice that any intelligent storage vendor would recommend.”
Finally, Mikkelsen said HDS’ technology does extend its Dynamic Tiering to third-party virtualized arrays.
EMC’s FTS will require time for users to kick the tires before it can be accurately judged, but the virtualization features are welcome additions.
“This is at least a step down the path to VSP-style virtualization,” said Ray Lucchesi, president of Silverton Consulting.
You can expect more steps – and more spats with HDS – before EMC’s array virtualization story is finished.