Recent developments point to a change in how we protect against the loss of data on a failed disk. RAID is the venerable method used to guard against damage from a lost disk, but RAID has limitations – especially with large-capacity drives that can hold terabytes of data. New developments address RAID’s limitations and provide advantages that are not specific to disk drives.
The new protection technology goes by several names. The name most associated with university research is information dispersal algorithms, or IDA. The more accurate term for the way it has been implemented is probably forward error correction, or FEC. Another name, based on implementation details, is erasure codes.
The technology addresses the loss of a disk drive, the failure RAID was designed to protect against. It can also prevent the loss of a data element when data is distributed across geographically dispersed systems. Implementations let you select how much protection to apply across the data. A commonly used example is a 12-of-16 setting, which means only 12 of the 16 data elements are needed to recreate the data from a lost disk drive.
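To make the 12-of-16 idea concrete, here is a minimal sketch of k-of-n erasure coding in Python, using polynomial evaluation over a prime field (the principle behind Reed-Solomon codes). It is illustrative only; the field, symbol handling and performance are nothing like the vendors’ production codecs, and every name in it is made up for the example.

```python
# Minimal k-of-n erasure coding sketch (Reed-Solomon principle) over a prime field.
# Any k of the n encoded shares are enough to rebuild the original k data symbols.
# Illustrative only; production codecs work over GF(2^8) with optimized arithmetic.

P = 2_147_483_647  # prime modulus for the field arithmetic


def encode(data, n, p=P):
    """Treat the k data symbols as polynomial coefficients (lowest degree first)
    and evaluate the polynomial at n distinct points; each (x, y) pair is a share."""
    shares = []
    for x in range(1, n + 1):
        y = 0
        for coeff in reversed(data):  # Horner's rule, highest-degree coefficient first
            y = (y * x + coeff) % p
        shares.append((x, y))
    return shares


def _mul_by_linear(poly, a, p):
    """Multiply polynomial poly (coefficients, low to high) by (x - a) mod p."""
    out = [0] * (len(poly) + 1)
    for i, c in enumerate(poly):
        out[i] = (out[i] - a * c) % p
        out[i + 1] = (out[i + 1] + c) % p
    return out


def decode(shares, k, p=P):
    """Recover the k data symbols from any k shares via Lagrange interpolation."""
    shares = shares[:k]
    coeffs = [0] * k
    for i, (xi, yi) in enumerate(shares):
        basis, denom = [1], 1
        for j, (xj, _) in enumerate(shares):
            if i != j:
                basis = _mul_by_linear(basis, xj, p)
                denom = (denom * (xi - xj)) % p
        scale = (yi * pow(denom, -1, p)) % p  # divide yi by denom in the field
        for idx, c in enumerate(basis):
            coeffs[idx] = (coeffs[idx] + c * scale) % p
    return coeffs


if __name__ == "__main__":
    data = [17, 42, 99, 7]            # k = 4 data symbols
    shares = encode(data, 6)          # n = 6 shares, so any 2 can be lost
    surviving = shares[1:5]           # pretend the shares at x=1 and x=6 were lost
    assert decode(surviving, len(data)) == data
```

In a 12-of-16 configuration the same idea applies with k = 12 and n = 16: the data survives the loss of any four elements.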
Vendors with products that use FEC/erasure codes include Amplidata, Cleversafe, and EMC Isilon and Atmos. Each uses a slightly different implementation, but they are all a form of dispersal and error correction.
The main reason to use erasure codes is protection from multiple failures: multiple drives in a disk storage system can fail before any data is lost. If data is stored at different geographic locations, several locations can be unavailable without losing data. This makes erasure codes a good fit for cloud storage.
Other advantages include shorter rebuild times after a data element fails and less performance impact during a rebuild. A disadvantage of erasure codes is that they can add latency and require more compute power for small writes.
One of the most potentially valuable benefits of erasure codes is the reduction in service costs for disk storage systems. With a protection ratio that gives long-term coverage (meaning enough failures to cause data loss are unlikely over a long period), a storage system may never need a failed device replaced during its economic lifespan. That reduces the service cost, and for a vendor it reduces the warranty reserve.
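The long-term coverage probability can be estimated with simple arithmetic. The sketch below assumes independent device failures and a hypothetical annual failure rate, and computes the chance that more devices fail over the system’s lifespan (with no replacements) than the layout can tolerate; the numbers are illustrative, not vendor figures.

```python
# Rough estimate of the chance of data loss over a system's lifespan when no
# failed device is ever replaced. Assumes independent failures and a hypothetical
# per-device annual failure rate; a real model would also account for correlated
# failures and unrecoverable read errors.
from math import comb

def p_data_loss(n, k, annual_failure_rate, years):
    """Probability that more than (n - k) of n devices fail within `years`,
    leaving fewer than k data elements, so the data cannot be rebuilt."""
    q = 1.0 - (1.0 - annual_failure_rate) ** years  # per-device lifetime failure probability
    tolerated = n - k
    loss = 0.0
    for failed in range(tolerated + 1, n + 1):
        loss += comb(n, failed) * q**failed * (1.0 - q) ** (n - failed)
    return loss

# Example: a 12-of-16 layout, a 2% hypothetical annual failure rate, a 5-year lifespan.
print(f"{p_data_loss(16, 12, 0.02, 5):.2e}")
```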
This form of data protection is not prevalent today and it will take time before a large number of vendors offer it. There are good reasons for using this type of protection and there are circumstances when it is not the best solution. Storage pros should always consider the value it brings to their environment.
(Randy Kerns is Senior Strategist at Evaluator Group, an IT analyst firm).
In case you weren’t paying attention to the storage world for most of 2011, don’t worry, we have you covered.
Our series of stories looking back at the highlights of 2011 will catch you up on what you may have missed:
• Solid state, cloud make mark on storage in 2011
• Hard drive, SSD consolidation highlights 2011 storage acquisitions
• Top cloud storage trends of 2011
• Compensation rose for storage pros in 2011
And if you want to get a jump on 2012, we have your back there too. These look-ahead stories highlight the key storage issues and technologies for the coming year:
• What 2012 has in store for storage
• 2012 preview: more flash with auto tiering, archiving, FCoE
We’ve also gathered useful information to help you do your job more efficiently if you work with data:
• Popular storage tips of 2011
• Top data deduplication tips of 2011
• Top remote data replication tips of 2011
• Top disaster recovery outsourcing tips of 2011
• Top SMB backup tips of 2011
I read all the reports on how the storage industry is doing. These include many segments in storage hardware and software, sometimes going into great detail. These reports are often based on data self-reported by vendors on how they’ve done in shipping products.
They draw comparisons with the previous quarter, the same quarter of the previous year and through the calendar year. These give us an idea of where we’ve been and how the different segments have fared.
But, these results look in the rear view mirror. They do not tell us how any of these vendors or the industry will do in the future. Determining future performance requires looking out the windshield.
A forecast is usually based on a projection of the trends that have occurred in the past. These projections are used in planning and estimating for investments, ordering, staffing and other elements critical to business decisions with tremendous financial implications.
Even forecasts that are meant to look through the windshield are usually based on past trends. One technique for projecting future trends is to look at what occurred in recent years and assume that pattern will continue. That may be a bad assumption, and it can bring serious consequences.
Others use surveys to predict the opportunity, but surveys can also mislead. A survey’s accuracy depends on how the questions are asked and who is responding to them. There is another factor I can relate to from personal experience: the quality of the answers depends on when the questions are answered. There can be bad days…
I’ve found that conversations with IT professionals lead to a deeper understanding of what their problems are, and what they are doing. With enough of these conversations, a general direction emerges that can be used as guidance in a particular area with much greater confidence. There’s no sure-fire means, however. The best that can be done is to understand the limitations of the input you receive and use multiple inputs.
Another measure for me is gauging what the vendors believe the storage market is doing. This is much easier because the briefings, product launches and press releases represent investments that are evidence of their belief in the opportunity. Lately, the briefings and announcements have increased – even as the holidays and year-end distractions approach. Things do look good in the storage world – out through the windshield.
(Randy Kerns is Senior Strategist at Evaluator Group, an IT analyst firm).
Many new storage technologies show great promise. They have useful capabilities such as transferring data faster, storing information for less cost, migrating data with less disruption and administration, and utilizing storage resources more efficiently. Technologies get evaluated on the value they bring and are introduced into the mainstream over time.
New technologies also spawn start-up companies with investments from venture capital companies. Invariably there is promotion of the technology, paid for by the developing companies and others with a stake in the technology’s success. There is even an industry that grows up around educating people on the new technology.
But few new technologies actually have staying power beyond five years or so. A transient technology that had such great promise and promotion often leads to great disappointment, especially for anyone involved in a start-up or the supporting industries. Investors may lose their money completely or have it reduced to an asset sale. The technology’s failure also colors the industry for a period, making new investments and career plans more limited.
Everyone in the storage industry seems to have their favorite technologies that crashed and burned. Remember bubble memories? Remember IPI-3? Many of us, me included, invested much of our time (way more than we should have) in developing products in the exciting new technology areas.
There are currently two technology trends to highlight that have staying power and are likely to continue evolving, justifying the investment and commitment. One is solid-state technology used as a storage medium. The other is the use of server technology (multi-core processors, bus technology, interface adapters, etc.) as the foundation for storage systems.
Solid-state usage has created many opportunities with more than one form of storage device. Currently, NAND Flash is the solid-state technology of choice. Eventually, that will be replaced with the next solid state technology, which is expected to be phase-change memory (PCM). The timing of the evolution to the next generation of solid state will probably be determined by the first company that has the technology and wants to achieve a greater competitive position than it has today. Developments in these areas may be held back by the successes and profits from NAND flash. When the price for NAND flash becomes more competitive, there may be more motivation to deliver on the next generation.
For storage systems, the transition from custom designed hardware to use of standard platforms has been underway for some time. The economics of the hardware and development costs have driven most vendors to deliver their unique intellectual property in the form of a storage application that runs on the server-based hardware. The next turn in this technology progression is utilizing the multi-core processors in the storage system more effectively by running I/O intensive applications on the storage system itself.
When a technology is in the early stages, it’s difficult to determine if it will have the staying power to justify the financial and personal investment. But, solid state and server-based storage are past those early stages and have relatively clear paths for the future. What comes next is an interesting conversation, though.
(Randy Kerns is Senior Strategist at Evaluator Group, an IT analyst firm).
One of two pending multibillion dollar hard drive vendor acquisitions closed today when Seagate Technology wrapped up its $1.4 billion transaction with Samsung Electronics.
Seagate is acquiring Samsung’s M8 product line of 2.5-inch high-capacity hard drives and will supply disk drives to Samsung for PCs, notebooks and consumer devices. Samsung will supply Seagate with chips for enterprise solid-state drives (SSDs). Seagate already uses those chips for its SSD and hybrid drives. Seagate is also acquiring employees from Samsung’s Korea design center.
The deal was first disclosed in April, but the companies had to clear regulatory hurdles. Even with the close of the deal, Seagate and Samsung face a long transition. Seagate will retain the Samsung brand for some of its hard drives for a year, and establish independent sales, product and development operations during that period.
The other blockbuster hard drive deal in the works is Western Digital’s proposed $4.3 billion takeover of Hitachi Global Storage Technology (HGST). The European Union last month gave its blessing to that deal, but ordered Western Digital to sell off some assets. The Western Digital-HGST acquisition is expected to close next March, one year after it was first announced.
EMC has notified partners and customers that it will raise the list prices of its hard drives by up to 15% beginning Jan. 1 due to shortages caused by the flooding in Thailand. The increases are expected to be temporary, depending on how long it takes damaged hard drive manufacturing plants to recover.
EMC vice chairman Bill Teuber sent an email to customers and partners stating the vendor has eaten price increases so far, but will begin to pass them along to customers after this month. He also wrote that EMC does not expect supply problems because it is the largest vendor of external storage systems, but it has to pay more for the available drives.
“EMC has absorbed the price increases that have been passed on to us and will continue to do so through the end of the month,” Teuber wrote. “Unfortunately we will not be able to sustain that practice. Beginning in Q1 2012 we will be increasing the list price of hard disk drives up to 15% for an indefinite period of time. While we hope that this increase is temporary, at this time we cannot forecast how long the flooding in Thailand will impact HDD [hard disk drive] pricing.”
Another email Teuber sent to EMC personnel said the price increases will be from 5% to 15%. He also wrote the increases will apply to all EMC product lines.
Teuber referred to NetApp indirectly in his email, stating “Many of our competitors have already announced drive shortages and price increases and have stated that this will have a material impact on their ability to hit revenue expectations now and in the future.”
An EMC spokesman today said the vendor would give a full update on the supply chain issue during its earnings call in January.
The shortages are affecting vendors throughout the IT industry. Hard drive vendors Seagate and Western Digital have major manufacturing facilities in the flooded areas. Last month IT research firm IDC forecasted hard drive prices should stabilize by June of 2012 and the industry will run close to normal in the second half of next year. According to IDC, Thailand accounted for 40% to 45% of worldwide hard drive production in the first half of this year, and about half of that capacity was impacted by floods this November. Intel this week reduced its fourth quarter revenue forecast by $1 billion because the drive shortages will drive down PC sales.
Hitachi Data Systems added a few more lanes to its cloud storage on-ramp today.
HDS brought out the Hitachi Data Ingestor (HDI) caching appliance a year ago, calling it an “on-ramp to the cloud” for use with its Hitachi Content Platform (HCP) object storage system. Today it added content sharing, file restore and NAS migration capabilities to the appliance.
Content sharing lets customers in remote offices share data across a network of HDI systems, as all of the systems can read from a single HCP namespace. File restore lets users retrieve previous versions of files and deleted files, and NAS migration lets customers move data from NetApp NAS filers and Windows servers to HDI.
These aren’t the first changes HDS has made to HDI since it hit the market. Earlier this year HDS added a virtual appliance and a single node version (the original HDI was only available in clusters) for customers not interested in high availability.
None of these changes are revolutionary, but HDS cloud product marketing manager Tanya Loughlin said the idea is to add features that match the customers’ stages of cloud readiness.
“We have customers bursting at the seams with data, trying to manage all this stuff,” she said. “There is a lot of interest in modernizing the way they deliver IT, whether it’s deployed in a straight definition of a cloud with a consumption-based model or deployed in-house. Customers want to make sure what they buy today is cloud-ready. We’re bringing this to market as a cloud-at-your-own-pace.”
Nasuni is in the business of connecting its customers’ cloud NAS appliances to cloud service providers in a seamless and reliable fashion. So the vendor set out to find which of those service providers work the best.
Starting in April 2009, Nasuni put 16 cloud storage providers through stress tests to determine how they handled performance, availability and scalability in real-world cloud operations.
Only six of the initial 16 showed they are ready to handle the demands of the cloud, Nasuni claims, while some of the others failed a basic API functionality test. Amazon Simple Storage Service (S3) and Microsoft Azure were the leaders, with Nirvanix, Rackspace, AT&T Synaptic Storage as a Service and Peer 1 Hosting also putting up passing grades.
“You won’t believe what is out there,” Nasuni CEO Andres Rodriguez said. “Some had awful APIs that made them unworkable. Some had some crazy SOAP-based APIs that were terrible.”
Nasuni did not identify the providers that received failing grades, preferring to focus on those found worthy. Amazon and Microsoft Azure came out as the strongest across the board.
Amazon S3 had the highest availability with only 1.43 outages per month – deemed insignificant in duration – for a 100% uptime score. Azure, Peer 1 and Rackspace all had 99.9% availability.
Rodriguez described availability as the cloud providers’ ability to continue operations and receive reads and writes, even through upgrades. “If you can’t be up 99.9 percent, you shouldn’t be in this business,” he said.
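Translating outage counts into an uptime percentage is straightforward arithmetic, as the short sketch below shows; the 30-second outage duration is a hypothetical figure, since the report described S3’s outages only as insignificant in duration.

```python
# Convert an outage rate and an assumed mean outage duration into an uptime
# percentage. The 30-second duration is hypothetical; the report said only that
# the outages were insignificant in duration.
def uptime_pct(outages_per_month, mean_outage_seconds):
    month_seconds = 30 * 24 * 3600
    downtime = outages_per_month * mean_outage_seconds
    return 100.0 * (1.0 - downtime / month_seconds)

print(round(uptime_pct(1.43, 30), 4))  # 99.9983, which rounds to a 100% uptime score
```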
For performance testing, Nasuni looked at how fast providers could write and read files. The providers’ systems were tested with multiple simultaneous threads, varying object sizes and different workload types, and were measured on read and write speed for large (1 MB), medium (128 KB) and small (1 KB) files.
The tests found S3 provided the most consistently fast service for all file types, although Nirvanix was fastest at reading large files and Microsoft Azure wrote all size files fastest.
Nasuni tested scalability by continuously writing small files with many concurrent threads for several weeks or until it hit 100 million objects. Amazon S3 and Microsoft Azure were also the top performers in these tests. Amazon had zero error rates for reads and writes. Microsoft Azure had a small error rate (0.07%) while reading objects.
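The shape of these tests can be sketched in a few lines of Python against any S3-compatible endpoint. This is not Nasuni’s harness; the bucket name, object counts and thread counts below are hypothetical, and it assumes the boto3 library with credentials already configured. It simply shows a timed, multi-threaded write/read loop with basic error counting.

```python
# Hedged sketch of a timed, concurrent write/read test against an S3-compatible
# service. Not Nasuni's harness; bucket, object counts and thread counts are made up.
import os
import time
from concurrent.futures import ThreadPoolExecutor

import boto3  # assumes AWS credentials are configured in the environment

s3 = boto3.client("s3")
BUCKET = "example-stress-test-bucket"  # hypothetical bucket name
SIZES = {"large": 1024 * 1024, "medium": 128 * 1024, "small": 1024}  # 1 MB, 128 KB, 1 KB

def write_then_read(key, size):
    """Write one object of `size` random bytes, read it back, and time both phases."""
    body = os.urandom(size)
    t0 = time.time()
    s3.put_object(Bucket=BUCKET, Key=key, Body=body)
    t1 = time.time()
    data = s3.get_object(Bucket=BUCKET, Key=key)["Body"].read()
    t2 = time.time()
    return t1 - t0, t2 - t1, data == body  # write seconds, read seconds, integrity check

def run(label, size, objects=100, threads=10):
    errors, write_s, read_s = 0, 0.0, 0.0
    with ThreadPoolExecutor(max_workers=threads) as pool:
        futures = [pool.submit(write_then_read, f"{label}/{i}", size) for i in range(objects)]
        for f in futures:
            try:
                w, r, ok = f.result()
                write_s, read_s = write_s + w, read_s + r
                errors += 0 if ok else 1
            except Exception:
                errors += 1  # count failed requests as errors
    print(f"{label}: avg write {write_s / objects:.3f}s, avg read {read_s / objects:.3f}s, errors {errors}")

for label, size in SIZES.items():
    run(label, size)
```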
The report stated: “Though Nirvanix was faster than Amazon S3 for large files and Microsoft Azure was slightly faster when it comes to writing files, no other vendor posted the kind of consistently fast service level across all file types as did Amazon S3. It had the fewest outages and best uptime and was the only CSP to post a 0.0 percent error rate in both reading and writing objects.”
Anobit might be the next solid-state storage vendor to get scooped up. Israeli publication Calcalist reports that Apple is in “advanced negotiations” to acquire Anobit for around $400 million to $500 million.
Apple uses Anobit’s mobile flash chip in the iPhone, iPad and MacBook Air laptop. Anobit also sells an enterprise-grade multi-level cell (MLC) flash controller, the Genesis SSD. Anobit launched the second-generation Genesis product in September. The startup claims its proprietary Memory Signal Processing (MSP) controllers can boost endurance levels so that consumer-grade MLC can be used in the enterprise.
While Apple would likely focus on Anobit’s mobile flash controller, it might also use Anobit’s enterprise flash to enter that competitive market. The acquisition continues a trend of consolidation in the SSD market this year. SanDisk acquired Pliant in May for $327 million to move into the enterprise SSD market, and LSI bought flash controller startup SandForce for $322 million in October.
“We believe this yet again highlights the importance of controller technology in the SSD market,” Stifel Nicolaus Equity Research analyst Aaron Rakers wrote in a note to clients today. “While it appears that a potential acquisition of Anobit … would likely leave investors primarily focused on Apple’s ability to leverage the MSP controller technology across its product portfolio, we believe Anobit’s enterprise-class controller capabilities must also be considered/watched with regard to competition against Fusion-io (albeit there have yet to be any signs of Anobit playing in the PCIe SSD market).”
Anobit, which came out of stealth in 2010 with its first Genesis product, has raised $76 million in funding.
I’ve attended two conferences recently where a speaker talked about storage efficiency and the growing capacity demand problem. The speaker said that a part of the problem is we don’t throw data away. That blunt statement suggests that we should throw data away. Unfortunately, that was the end of the discussion and the rest was promotion of a product.
This raises the obvious question: “Why don’t we delete data when we don’t need it anymore?” When I asked IT people this question, they gave several reasons for keeping data.
Government regulation was the most common reason. Many of these regulations concern email and are associated with corporate accountability. People in vertical markets such as bio-pharmaceuticals and healthcare have additional industry-specific retention requirements.
Business policy was another top reason for not deleting information. There were three underlying reasons in this category. In some cases, corporate counsel had not examined the information being retained and had issued orders to keep everything until a policy was developed. Others keep data because their executives feel the information represents business or organization records with future value. (It was not really clear what this meant.) In other cases, IT staff was operating from a policy written when records were still primarily on paper and had not received new direction for digital retention.
Another common response was that IT staff had no time to manage the data and make retention decisions or to involve other groups within the organization. In this case, it is simpler to keep data rather than make decisions and take on the task of implementing a policy.
The other reason was probably more of a personal response – some people are pack rats for data and keep everything. I call this data hoarding.
Rather than only listing the problems, the discussion about data retention should always include ways to address the situation. Data retention really is a project. To be done effectively, it usually requires outside assistance and the purchase of software tools. In every case, an initiative must be undertaken. This includes calculating ROI based on the payback in capacity made available and reduced data protection costs. The project requires someone from IT to:
• Understand government regulations. Most are specific about the type of data and circumstances, and almost all of the regulations have a specific time limit or condition for when the data can be deleted.
• Examine the current business policies and update them with current information from executives and corporate counsel. Present the costs of retaining the data along with the magnitude and growth demands as part of the need to review the business policies.
• Add system tools to examine data, move it based on value or time, and delete it when thresholds or conditions are met (a minimal sketch of such a tool follows this list).
• Get a grip. Data hoarding is costing money and making a mess. The person who replaces the data hoarder has to clean it up.
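As a starting point for the system tools mentioned in the list above, the sketch below walks a directory tree and reports (or deletes) files older than a retention threshold. The path and the seven-year threshold are hypothetical examples; a real retention project would also work from data classification, the regulatory schedules noted above and legal-hold status rather than file age alone.

```python
# Minimal retention-sweep sketch: find files older than a threshold, report them,
# and optionally delete them. The path and threshold are hypothetical; real policies
# also consider data classification, regulatory schedules and legal holds.
import os
import time

RETENTION_DAYS = 7 * 365   # hypothetical: keep files for roughly seven years
ROOT = "/data/archive"     # hypothetical directory tree to scan

def expired_files(root, retention_days):
    """Yield (path, size) for every file whose modification time is past retention."""
    cutoff = time.time() - retention_days * 86400
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                if os.path.getmtime(path) < cutoff:
                    yield path, os.path.getsize(path)
            except OSError:
                continue  # file vanished or is unreadable; skip it

def sweep(root=ROOT, retention_days=RETENTION_DAYS, delete=False):
    """Report (or delete) expired files and total the capacity that would be freed."""
    total_bytes = 0
    for path, size in expired_files(root, retention_days):
        total_bytes += size
        if delete:
            os.remove(path)
        else:
            print(path)
    mode = "deleted" if delete else "report only"
    print(f"{total_bytes / 1e9:.1f} GB past retention ({mode})")

if __name__ == "__main__":
    sweep()  # report-only by default; pass delete=True to actually remove files
```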
(Randy Kerns is Senior Strategist at Evaluator Group, an IT analyst firm).