I’ve attended two conferences recently where a speaker talked about storage efficiency and the growing capacity demand problem. The speaker said that a part of the problem is we don’t throw data away. That blunt statement suggests that we should throw data away. Unfortunately, that was the end of the discussion and the rest was promotion of a product.
This raises the question, “Why don’t we delete data when we don’t need it anymore?” When I put this question to IT people, they had several reasons for keeping data.
Government regulation was the most common reason. Many of these regulations are in regard to email and associated with corporate accountability. People in vertical markets such as bio-pharmaceuticals and healthcare have extra industry-specific retention requirements.
Business policy was another top reason for not deleting information. There were three underlying reasons for this category. In some cases, the corporate counsel had not examined the information being retained and had issued orders to keep everything until a policy was developed. Others keep data because their executives feel the information represents business or organization records with future value. (It was not really clear what this meant.) In other cases, IT staff was operating off a policy written when records were still primarily on paper and had not received new direction for digital retention.
Another common response was that IT staff had no time to manage the data and make retention decisions or to involve other groups within the organization. In this case, it is simpler to keep data rather than make decisions and take on the task of implementing a policy.
The other reason was probably more of a personal response – some people are pack rats for data and keep everything. I call this data hoarding.
Rather than only listing the problems, the discussion about data retention should always include ways to address the situation. Data retention really is a project. To be done effectively, it usually requires outside assistance and the purchase of software tools. In every case, an initiative must be undertaken. This includes calculating ROI based on the payback in capacity made available and reduced data protection costs. The project requires someone from IT to:
• Understand government regulations. Most are specific about the type of data and circumstances, and almost all of the regulations have a specific time limit or condition for when the data can be deleted.
• Examine the current business policies and update them with current information from executives and corporate counsel. Present the costs of retaining the data along with the magnitude and growth demands as part of the need to review the business policies.
• Add system tools to examine data, move it based on value or time, and delete it when thresholds or conditions are met.
• Get a grip. Data hoarding is costing money and making a mess. The person who replaces the data hoarder has to clean it up.
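As a rough illustration of the "delete it when thresholds or conditions are met" step, here is a minimal sketch of a retention check. The data classes, retention periods and the `legal_hold` flag are all hypothetical placeholders; real values have to come from the applicable regulations and from corporate counsel.

```python
from dataclasses import dataclass
from datetime import date, timedelta

# Hypothetical retention periods in days per data class. These numbers are
# placeholders for illustration, not regulatory guidance.
RETENTION_DAYS = {
    "email": 7 * 365,
    "financial_record": 10 * 365,
    "scratch": 90,
}


@dataclass
class DataItem:
    name: str
    data_class: str
    last_modified: date
    legal_hold: bool = False  # assumed flag set by corporate counsel


def retention_action(item: DataItem, today: date) -> str:
    """Return 'keep', 'delete' or 'review' for a single item."""
    if item.legal_hold:
        return "keep"  # a hold overrides any time limit
    days = RETENTION_DAYS.get(item.data_class)
    if days is None:
        return "review"  # no policy for this class: escalate, do not guess
    expires = item.last_modified + timedelta(days=days)
    return "delete" if today >= expires else "keep"
```

Returning "review" for unclassified data reflects the point above: where no policy exists, the decision belongs with executives and counsel, not with a script.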
(Randy Kerns is Senior Strategist at Evaluator Group, an IT analyst firm).
Backup Exec 2012, designed for SMBs and Windows-based backups, will support the V-Ray features already integrated into Symantec’s NetBackup enterprise application. Backup Exec 2012 and Backup Exec 2012 Small Business Edition betas have gone out to 45,000 Symantec registered global partners.
Symantec has generated a lot of noise around its V-Ray technology, which makes all of its products better optimized for virtual machines. For Backup Exec 2012, V-Ray lets Symantec customers do a physical-to-virtual conversion. This lets an administrator take a backup copy and convert it from a physical machine to a virtual machine for disaster recovery. Organizations will be able to recover a failed system to a physical server or to a Hyper-V or VM guest.
“The backup to the VM can be done in parallel, in sequence or serially,” said Aidian Finley, senior manager for product marketing for Symantec’s Information Management Group. “The administrator has the choice of when to do that conversion. We call it ‘no hardware physical recovery.’”
The Backup Exec administration console has a new interface so administrators can automatically configure common policies and settings for quicker configuration and data protection.
Backup Exec can be purchased as an appliance, a software application or as a cloud service.
The rollout is the largest partner-only beta program Symantec has ever put into place. The idea of providing the beta product to partners is to give them a chance to shape the application, said Arya Barirani, senior director of product marketing for Symantec’s Information Management Group.
“It’s really a way [for partners and customers] to get their hands on this product and test it,” Barirani said.
EMC Corp. is building up its training and certifications program by adding new levels of curricula for IT professionals who want to deepen their skills in the cloud, virtualization and data analytics.
The new courses fall under the umbrella of the EMC Proven Professional Training and Certification program. EMC launched a Cloud Architect program last January for IT professionals who have a broad and deep understanding of server, storage, security, networking and business application disciplines but want better skills for handling virtualization and cloud computing. This course would likely appeal to storage architects.
Now EMC is adding EMC Cloud Infrastructure and Services and the EMC Cloud Architect IT-as-a-Service training and certifications. The first one is targeted at individuals who manage storage, networking or security as part of a team implementing and managing a cloud infrastructure. It is aimed more at the IT specialist than the storage architect.
Those who have taken the Cloud Architect Training and Certification program can advance their skills by taking the EMC Cloud Architect IT-as-a-Service course. Participants learn how to create service catalogs and self-service portals. Both of these programs have been open since October.
“The (Cloud Architect) certification program was for those with a high-level skill set,” said Chuck Hollis, EMC’s global marketing CTO. “It’s a week-long course but you had to show prerequisites that were very high. You get a deep level lab experience. You come to our facility and you get expert level experience in building a cloud architecture. For the new certifications, the prerequisites are a bit lower. You are working with a team that builds a cloud and you see how it affects you.”
EMC will also offer a foundational data science and big data analytics training and certification in January 2012. It’s a week-long, associate-level course where participants learn how to use big data and analyze it to help make informed business decisions. “Most business leaders realize it’s no longer your father’s business intelligence any longer,” Hollis said.
These training and certification programs are part of EMC’s Education Services. The training can be taken three ways, based on customer needs. IT professionals can attend an instructor-led training course given at an EMC training center. These courses are offered frequently based on demand in more than 70 locations globally. Also, EMC can send an instructor to customers’ locations when big teams need training quickly. The last way to obtain the training is via a video instructor provided on a DVD.
Storage software vendor Aptare is branching out to managing file storage. The vendor today launched Aptare StorageConsole File Analytics, which helps discover and analyze unstructured data. Or, as Aptare CEO Rick Clark puts it, now Aptare “tames the beast of big data.”
The File Analytics app joins the Aptare StorageConsole suite, which previously included Backup Manager for backup reporting, Replication Manager, Storage Capacity Manager, Fabric Manager and Virtualization Manager for managing storage in virtual environments. The suite apps are available as standalone products or part of a package that can be managed through a common console. Hitachi Data Systems sells Aptare software through an OEM agreement, and NetApp licenses the Aptare Catalog Engine as part of its snapshot technology.
Aptare’s File Analytics app collects and aggregates unstructured data on storage arrays. Clark said instead of walking the file system for information – which can be a resource-intensive process – File Analytics runs without agents for minimal impact on the storage system. It uses a compressed database developed by Aptare, code-named “Bantam,” that Clark said can store more than 100 million records in less than 1.5 GB. File Analytics analyzes metadata in Bantam, and uses that metadata for storage tiering, compliance and to recognize and eliminate duplicate files.
HDS hasn’t disclosed that it will sell File Analytics yet, but Clark said it would be a good fit with HDS’ recently acquired BlueArc NAS platform as well as its Hitachi Content Platform (HCP) object-storage system marketed for cloud storage.
“This allows you to figure out which data to put into those respective platforms,” Clark said. “The first thing a customer asks is, ‘What should I put into those platforms?’ It can also be used to determine which information to move to the cloud.”
Aptare can use a friend like HDS to help sell File Analytics. The product will go head-to-head with Symantec Data Insight for Storage, as well as capabilities included in EMC Ionix ControlCenter, Hewlett-Packard Storage Essentials and SolarWinds Storage Manager.
Archiving and storage and device management pushed storage software revenue up 9.7% last quarter over the third quarter of 2010, according to IDC.
IDC said archiving software revenue grew 12.2% and storage and device management increased 11.3% over last year. Data protection and recovery software is still the most popular storage software, with 34.9% of the market. Backup software giants EMC and Symantec were the overall storage software leaders, with EMC generating $847 million and Symantec $530 million. EMC had a 24.5% share of the $3.46 billion market, followed by Symantec with 15.3%, IBM with 14%, NetApp with 8.8% and Hitachi Data Systems (HDS) with 4.4%. HDS grew the most since last year with a 15.3% increase. EMC increased 10.3%, and the other three vendors in the top five lost market share. IBM grew 8.8% since last year, Symantec grew 2.2% and NetApp grew 0.3%.
Customers apparently feel more comfortable buying storage software from smaller vendors than they do buying storage systems. “Others” – those not in the top five – combined for $1.15 billion in storage software revenue last quarter. That was up 15.3% over last year and made up 33.2% of the market – more than any single vendor.
On the hardware side, external disk storage system revenue from “others” dropped 5.2% over last year, according to IDC’s report released last week. The others in hardware had only 18.5% of the market, well below leader EMC’s 28.6% share. Total storage system revenue increased 10.8%, slightly outgrowing the rate of storage software sales.
EMC solidified its lead as the No. 1 external disk storage vendor last quarter, according to the latest IDC worldwide disk storage systems quarterly tracker.
EMC increased its networked storage revenue by 22% from a year ago, more than doubling the industry year-over-year growth of 10.8%. EMC posted third-quarter revenue of $1.65 billion – up from $1.35 billion a year ago – and increased its market share from 25.9% to 28.6%. IBM held onto second with $735 million, followed by NetApp at $700 million, Hewlett-Packard at $651 million, Hitachi Data Systems (HDS) with $505 million, and Dell at $459 million. IDC considers IBM and NetApp, as well as HDS and Dell, to be in statistical ties because less than 1% of market share separates them.
HDS grew the most year-over-year, increasing 22.1% and raising its market share from 8% to 8.8%. Dell (down 2.6% after ending its OEM deal with EMC) and IBM (10.2% increase) grew at a slower rate than the overall market. Revenue from all other vendors slipped 5.2%, mainly because three of the biggest “others” – 3PAR, Isilon, and Compellent – were acquired by larger vendors since the third quarter of 2010.
IDC found that the midrange segment (from $50,000 to $150,000) had strong growth. “The trend to buy modular systems offering enterprise level functionality, such as scale-out architectures, tiering, data deduplication, etc. continues,” Amita Potnis, senior research analyst, Storage Systems, said in the IDC release.
IDC also said the SAN market grew 16.1% year-over-year and the NAS market grew 3.5%. EMC led the SAN market with 25.3%, followed by IBM (15.4%) and HP (14%). EMC also led in NAS with 46.7% share following its Isilon acquisition, with NetApp at 30.9%. iSCSI SAN revenue increased 19.5%. Dell, with its EqualLogic product line, led the iSCSI market with 30.3% followed by EMC with 19.2% and HP at 14%.
Server vendors HP, IBM and Dell have higher market shares in the overall disk storage segment, which includes server and direct attached storage. EMC still led the category, although all of its revenue comes from external storage. EMC’s overall disk market share is 21.7%. HP is next with $1.436 billion (18.9%), followed by IBM with $1.125 billion (14.8%), Dell with $879 million (11.6%) and NetApp with $700 million (9.2%). HDS did not crack the top five. Like EMC, NetApp and HDS are pure-play external storage vendors and get all their revenue from SAN and NAS systems.
The total disk storage market grew 8.5% to $7.6 billion. The external disk storage revenue was $5.8 billion.
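The share figures can be sanity-checked with simple arithmetic. The sketch below recomputes each vendor's share of the total disk storage market from the revenue numbers quoted in this post, expressed in millions of dollars. EMC's figure is its external disk revenue, on the stated basis that all of its disk revenue is external. Because the inputs are rounded, the results are approximate.

```python
# Total disk storage market for the quarter, in millions of dollars,
# rounded as quoted in the post ($7.6 billion).
TOTAL_DISK_MARKET = 7600

# Vendor revenue in millions of dollars, as quoted in the post.
REVENUE = {
    "EMC": 1650,
    "HP": 1436,
    "IBM": 1125,
    "Dell": 879,
    "NetApp": 700,
}


def market_share(revenue: float, total: float) -> float:
    """Return revenue as a percentage of total, rounded to one decimal."""
    return round(100 * revenue / total, 1)


shares = {vendor: market_share(rev, TOTAL_DISK_MARKET)
          for vendor, rev in REVENUE.items()}

for vendor, share in shares.items():
    print(f"{vendor}: {share}%")
```

The recomputed percentages line up with the shares IDC reported, which is a quick way to catch a misplaced unit in figures like these.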
We received a couple of reminders this week about how important backing up virtual machines is in an organization’s data protection strategy.
First, virtual server backup specialist Veeam released Backup & Replication 6. That in itself wasn’t a huge development. Veeam revealed full details of the product back in August, and said it would be shipping by end of year. It even leaked the most important detail – support of Microsoft Hyper-V – six months ago.
The most interesting part of the launch was the reaction it brought from backup king Symantec. Symantec sent an e-mail reminding that it too does virtual backup (through its V-Ray technology) and claimed “point products are complicating data protection.” Symantec released a statement saying “In the backup world, two is not better than one. Using disparate point products to backup virtual and physical environments adds complexity and increases management costs … Organizations should look for solutions that unite virtual and physical environments, as well as integrate deduplication, to achieve the greatest ROI.”
Sean Regan, Symantec’s e-Discovery product marketing manager, posted a blog extolling Symantec’s ability to protect virtual machines.
In other words, why bother with products such as Veeam and Quest Software’s vRanger for virtual machines when Symantec NetBackup and Backup Exec combine virtual and physical backup? But the established backup vendors opened the door for the point products by ignoring virtual backup for too long. Symantec didn’t really get serious about virtual backup until the last year or so.
Randy Dover, IT officer for Cornerstone Community Bank in Chattanooga, Tenn., began using Quest vRanger for virtual server backup last year although his bank had Symantec’s Backup Exec for physical servers. Dover said he would have had to put agents on his virtual machines with Backup Exec and it would have cost considerably more than adding vRanger.
“Before that, we were not backing up virtual machines as far as VMDK files,” he said. “If something happened to a VM, we would have to rebuild it from scratch. That’s not a good scenario, but basically that’s where we were.”
Dover said vRanger has cut replication time and restores for his 31 virtual machines considerably. And he doesn’t mind doing separate backups for virtual and physical servers.
“Using two different products doesn’t concern us as much,” he said. “We generally look for the best performance option instead of having fewer products to manage.”
Quest took a step towards integrating virtual and physical backup last year when it acquired BakBone, adding BakBone’s NetVault physical backup platform to vRanger.
Walter Angerer, Quest’s general manager of data protection, said the vendor plans to deliver a single management console for virtual and physical backups. He said Quest would integrate BakBone’s NetVault platform with vRanger as much as possible. It has already ported NetVault dedupe onto vRanger and is working on doing the same with NetVault’s continuous data protection (CDP).
“We are looking forward to an integrated solution for virtual, physical and cloud backup,” Angerer said. “I’m not sure if either one will go away, but we will create a new management layer. The plan is to have a single pane of glass for all of our capabilities.”
Discussions for buying storage typically begin with determining the company’s requirements, and usually focus on meeting the needs of business critical applications — also known as tier 1 applications.
As the term implies, these applications are the most critical to an organization. In most cases, downtime or interruption to business critical applications causes a significant negative impact to the company. This negative impact can be financial or an embarrassment that could lead to loss of future business.
When companies quantify the business impact of the loss of critical apps, they usually measure it in financial terms such as a material loss of ‘x’ dollars per hour of unavailability. They also look at longer term impacts, such as the number of customers that will go to a competitor because of the downtime. Not only will that business be lost, but the likelihood of the next transaction going somewhere else impacts future business.
A more jarring measurement that some IT professionals use to justify a business continuance/disaster recovery strategy is how long an outage would have to last before it became impossible to recover from, forcing the company out of business. These numbers vary widely by industry, but they certainly get a lot of attention when measured in days or hours.
Storage is a key element in meeting business critical application availability needs, although the amount of management they require on the storage end varies by application. Requirements for storage systems used for business critical applications start with four key areas:
- Data Protection – The potential for data loss due to operational error (from a variety of causes), corruption from the application, or a hardware malfunction is real. A recovery time objective (RTO) and recovery point objective (RPO) need to be established for business critical applications. These will dictate the frequency of protection and the generations retained, the data protection technology needed to meet the time and capacity requirements, and the recovery procedures. The data protection strategy used for a business critical application may differ from that used for secondary, or Tier 2, applications.
- Business Continuance / Disaster Recovery – BC/DR is a storage-led implementation where the replication of data on the storage systems is the most fundamental element. A solid BC/DR plan requires storage systems that can provide coherent replication of data to one or more geographically dispersed locations. This capability is necessary to ensure the operational availability of the critical app.
- Security – Secure environments and secure access to information are implied with business critical applications. From a storage standpoint, the control of access to information is an absolute requirement and is not always addressed adequately when developing a storage strategy. Block storage systems protect access through masking and physical connection limitations, moving the security problem to the servers. File storage for unstructured data uses a permissions set that relies on the diligence of administrators and has potential openings that must be addressed with careful consideration. This area will improve as more investments are made in storage for unstructured data.
- Performance – Most of the time, business critical applications demand high performance. For storage, quality of service and service level agreements are defined to meet minimum requirements for operation that do not degrade or impede the application’s execution. These require measurement and monitoring of the storage to determine impacting events and degradations where actions can be taken. Isolating performance issues is a complex task that requires skilled storage administrators with tools that work with the storage systems and networks.
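To make the data protection point concrete, here is a minimal sketch of how an RPO and RTO constrain a protection scheme, along with the dollars-per-hour outage measure discussed earlier. The `ProtectionPlan` fields and the sample numbers are hypothetical, and the check is deliberately simplified: it treats worst-case data loss as the full interval between protection points.

```python
from dataclasses import dataclass


@dataclass
class ProtectionPlan:
    """Hypothetical description of a protection scheme."""
    backup_interval_hours: float  # time between protection points
    restore_hours: float          # estimated time to restore service


def meets_objectives(plan: ProtectionPlan, rpo_hours: float,
                     rto_hours: float) -> bool:
    """True if the plan satisfies both the RPO and the RTO.

    Worst-case data loss is the time since the last protection point,
    so the interval must not exceed the RPO.
    """
    return (plan.backup_interval_hours <= rpo_hours
            and plan.restore_hours <= rto_hours)


def outage_cost(cost_per_hour: float, outage_hours: float) -> float:
    """Direct cost of an outage at a fixed loss per hour."""
    return cost_per_hour * outage_hours


# Example: a nightly backup cannot meet a 4-hour RPO, hourly snapshots can.
nightly = ProtectionPlan(backup_interval_hours=24, restore_hours=8)
hourly_snapshots = ProtectionPlan(backup_interval_hours=1, restore_hours=2)
```

A real assessment would also account for longer-term effects such as lost customers, which a fixed hourly rate does not capture.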
Organizations must give careful consideration to their storage for business critical apps. There needs to be a process for understanding the requirements, evaluating the choices for systems that can meet the requirements, and a strategy for the overall business of storing and protecting information.
(Randy Kerns is Senior Strategist at Evaluator Group, an IT analyst firm).
NetApp’s failed attempt to buy Data Domain in 2009 brought a lot of speculation that the storage systems vendor would shift its attention to another backup vendor.
NetApp executives played down the speculation. They said they didn’t need a backup platform, but they wanted Data Domain because its leading position in data deduplication for backup was disruptive and driving strong revenue growth. EMC, which paid $2.1 billion to outbid NetApp for Data Domain, has continued to grow that business despite a plethora of competitors.
NetApp has since made several smaller acquisitions – the largest was LSI’s Engenio systems division – but stayed away from backup. But a few rough quarters have caused NetApp’s stock price to shrink, and now the rumors have returned that it is hunting for backup.
A Bloomberg story today pegged backup software vendor CommVault and disk and tape backup vendor Quantum as the main targets. The story was based more on speculation from Wall Street analysts than sources who said any deals were in the works, but such an acquisition wouldn’t surprise many in the industry.
“I think NetApp needs to acquire companies and technologies, and bring in talent from the outside,” Kaushik Roy, managing director of Wall Street firm Merriman Capital, told Storage Soup.
CommVault and Quantum were among the companies believed to be on NetApp’s shopping list in 2009. A few things have changed. NetApp signed an OEM deal to sell CommVault’s SnapProtect array-based snapshot software earlier this year. That deal is in its early stages. NetApp hasn’t sold much CommVault software yet, but perhaps the partnership is a test run for how much demand there is and could lead to an acquisition.
Quantum was EMC’s dedupe partner before it bought Data Domain. If NetApp bought Quantum in 2009, it could’ve been taken as NetApp picking up EMC’s leftovers. But Quantum has revamped its entire DXi dedupe platform since then, expanded its StorNext archiving platform and acquired virtual server backup startup Pancetera. Those developments could prompt NetApp to take another look.
There are also smaller dedupe vendors out there, most notably Sepaton in the enterprise virtual tape library (VTL) space and ExaGrid in the midrange NAS target market.
However, people who suspect NetApp will make a move expect it will be a big one. CommVault would be the most expensive with a market cap of $1.9 billion and strong enough revenue growth to stand on its own without getting bought. Quantum, which finally showed signs of life in its disk backup business last quarter, has a $524 million market cap but most of its revenue still comes from the low-growth tape business.
Storage technology analyst Arun Taneja of the Taneja Group said buying CommVault would make the most sense if NetApp wants to take on its arch rival EMC in backup. While NetApp was the first vendor to sell deduplication for primary data, it is missing out on the lucrative backup dedupe market.
“NetApp needs to get something going in the data protection side,” Taneja said. “They’ve missed millions of dollars in the last two years [since EMC bought Data Domain].
“If they want to be full competitors against EMC – and what choice do they have – CommVault would be better for NetApp to buy. In one fell swoop, CommVault covers a lot of ground against EMC: backup software, dedupe technology at the target and source, and archiving, too.”
Although IT professionals and vendors often think of storage efficiency in different ways, there are usually two main methods of handling it. One is through efficient storage systems that maximize resources. The other is through data management that determines where data is located and how it is protected.
Efficient storage systems control the placement of data within the storage system and the movement of data based upon a set of rules. The systems maximize capacity and performance in several ways:
• Data reduction through data deduplication or compression
• Tiering with intelligent algorithms to move data between physical tiers such as solid state drives (SSDs) and high capacity disk drives
• Caching to maintain a transient copy of highly active data in a high speed cache
• Controlling data placement based on quality of service settings for performance guarantees.
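The tiering idea in the list above can be sketched in a few lines. This is a deliberately simple illustration, not how any particular array does it: it assumes each extent has an access count for some recent window and that the SSD tier holds a fixed number of extents. Real systems use vendor-specific algorithms that also weigh recency, I/O size and the cost of moving data.

```python
def assign_tiers(access_counts: dict, ssd_slots: int) -> dict:
    """Place the most frequently accessed extents on SSD, the rest on HDD.

    access_counts maps a (hypothetical) extent ID to its access count.
    ssd_slots is the number of extents the SSD tier can hold.
    Ties are broken by extent ID so the result is deterministic.
    """
    ranked = sorted(access_counts, key=lambda e: (-access_counts[e], e))
    hot = set(ranked[:ssd_slots])
    return {e: ("ssd" if e in hot else "hdd") for e in access_counts}


# Example: two SSD slots, three extents.
placement = assign_tiers({"a": 100, "b": 5, "c": 40}, ssd_slots=2)
```

Rerunning the assignment periodically as access counts change is what turns a one-time placement into tiering.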
Efficient data management requires dynamically changing the data’s location. This may involve moving data beyond a single storage system. The initial data placement and subsequent movement is based on information about the data that determines its value. This information determines performance needs and frequency of access, data protection requirements including disaster recovery and business continuance demands, and the volume and projected growth of the data. Most importantly, the process takes into account that these factors change over time.
Managing data efficiently presumes that there are classes of storage with different performance and cost attributes, and a variable data protection strategy that can be adapted according to requirements.
When data value changes, it must be moved to a more optimal location with a different set of data protection rules. The movement must be seamless and transparent so the accessing applications are not aware of the location transitions.
Data protection changes must also be transparent so that recovery from a disaster or operational problem always involves the correct copy. Efficient data management must be automated to operate effectively without introducing additional administration costs.
This type of data management existed in the mainframe world for a long time as IBM’s Data Facility Storage Management Subsystem (DFSMS) before moving into open systems.
An interesting area that should be watched closely is migration capabilities built into storage systems that can move data across systems based on policies administrators set up. The IBM Storwize V7000 Active Cloud Engine, Hitachi Data Systems BlueArc Data Migrator and EMC VMAX Federated Live Migration are a few examples of these. The EMC Cloud Tiering Appliance also does this, but is not built into the storage system.
This will be a competitive area because there is great economic value in managing data more efficiently. Watch this area for significant developments in the future.
(Randy Kerns is Senior Strategist at Evaluator Group, an IT analyst firm).