Companies tend to focus on the positive aspects of using SATA disk drives for a growing portion of their enterprise storage needs, but as some companies are finding out, managing thousands or tens of thousands of SATA disk drives can take on a life of its own.
Recently, I spoke to Lawrence Livermore National Laboratory (LLNL), which is a huge DataDirect Networks user. By huge, I mean they use multiple DataDirect Networks storage systems with the total number of SATA disk drives in production numbering in the tens of thousands, possibly even up to a hundred thousand SATA disk drives. More impressive, LLNL uses these storage systems in conjunction with some of the world’s fastest supercomputers, including the BlueGene/L, currently rated #1 among the world’s fastest computers.
The issue that crops up when companies own tens of thousands of disk drives, SATA or FC, is the growing task of managing failed disk drives. Companies such as Nexsan Technologies report failure rates of less than half of 1% of all SATA disk drives that they have deployed out in the field. Those numbers sound impressive until one begins to encounter environments like LLNL that may have up to a hundred thousand SATA disk drives in their environment. Even with a failure rate under 0.5%, companies in that scenario can statistically expect a SATA disk drive to fail about every other day, which is in line with LLNL’s experience.
This is in no way intended to reflect negatively on DataDirect Networks. If users were to deploy a similar number of disk drives from any other SATA storage system provider, be it Excel Meridian, Nexsan Technologies or Winchester Systems, they could expect similar SATA disk drive failure rates.
The cautionary note for users here is twofold. First, be sure your disk management practices keep up with your growth in disk drives. Replacing a disk drive may not sound like a big deal, but consider what is involved with a disk drive replacement:
- Discovering the disk drive failure
- Contacting and scheduling time for the vendor to replace the disk drive
- Monitoring the rebuild of the spare disk drive
- Determining if there is application impact during the disk drive rebuild
- Physically changing out the disk drive
Assuming a 0.5% annual failure rate, companies with hundreds of disk drives will repeat this process about once a year, those with thousands of disk drives about once a quarter and those with tens of thousands about once a week. Once a company crosses the 10,000-drive threshold, it needs to seriously contemplate dedicating a person at least part-time just to monitor and manage the task of disk drive replacements, regardless of which vendor’s storage system it selects.
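The arithmetic behind those intervals can be sketched in a few lines of Python. This is only a rough illustration: it treats the failure rate as half of 1% per year (0.005 as a fraction) and assumes failures are spread evenly across the year, neither of which the vendors' numbers spell out. The function name is made up for the example.

```python
def days_between_failures(drive_count, annual_failure_rate=0.005):
    """Average days between drive failures, assuming failures are evenly spread."""
    # Assumption: the rate is annual and applies equally to every drive.
    expected_failures_per_year = drive_count * annual_failure_rate
    return 365 / expected_failures_per_year


for drives in (200, 1_000, 10_000):
    interval = days_between_failures(drives)
    print(f"{drives:>6} drives: roughly one failure every {interval:.1f} days")
```

Under these assumptions, 200 drives works out to about one failure a year and 10,000 drives to roughly one a week.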
The other cautionary note is that the more disk drives one deploys, the more likely it becomes that two or even three disk drives in the same RAID group will fail before a recovery of an existing failed disk drive is complete. Companies, now more than ever, need to ensure they are using RAID-6 for their SATA disk drive array groups and, when crossing the 10,000 disk drive threshold, should consider the new generation of SATA storage systems from companies such as DataDirect Networks and NEC. These systems give companies more data protection and recovery options for their SATA disk drives.
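A rough way to see why scale matters here is to estimate how often a rebuild is interrupted by a second failure in the same group. The Python sketch below is an illustration only: it assumes drives fail independently at a constant rate of half of 1% per year, and the group size of 10 drives and the 24-hour rebuild window are hypothetical values chosen for the example, not figures from any vendor.

```python
def second_failure_probability(group_size, rebuild_hours, annual_failure_rate=0.005):
    """Chance that at least one surviving drive in a RAID group fails during a rebuild.

    Assumes independent failures at a constant rate, which real drives
    (same batch, same enclosure, same workload) often violate.
    """
    hours_per_year = 365 * 24
    # Probability that a single drive survives the rebuild window.
    survive_window = (1 - annual_failure_rate) ** (rebuild_hours / hours_per_year)
    survivors = group_size - 1
    return 1 - survive_window ** survivors


def expected_double_failures_per_year(drive_count, group_size, rebuild_hours,
                                      annual_failure_rate=0.005):
    """Expected number of rebuilds per year that see a second failure in the group."""
    first_failures = drive_count * annual_failure_rate
    return first_failures * second_failure_probability(
        group_size, rebuild_hours, annual_failure_rate)


# Hypothetical example values: 10-drive groups, 24-hour rebuilds.
for drives in (1_000, 10_000, 100_000):
    expected = expected_double_failures_per_year(drives, group_size=10, rebuild_hours=24)
    print(f"{drives:>7} drives: {expected:.4f} expected double failures per year")
```

Under these assumptions the chance of a second failure during any single rebuild stays the same, but the number of rebuilds grows with the drive count, so the expected number of double failures per year grows in step. Drives from the same batch or enclosure tend to fail together more often than the independence assumption suggests, which is part of the case for RAID-6.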
The Storage Networking Industry Association (SNIA) released the results of a survey of its members on long-term archiving, which indicated that yes, there is an archiving problem in the industry: of 276 “long-term archive practitioners” who responded to the survey, 80 percent said they have information they must keep over 50 years, and 68 percent said they must keep this data more than 100 years. 30 percent said they were migrating information at regular intervals. Around 40 percent of respondents are keeping e-mail records over 10 years. 70 percent said they are ‘highly dissatisfied’ with their ability to read their retained information in 50 years.
One caveat is that the survey, which was sent to more than 10,000 members of the SNIA and associated organizations, was “self-selecting”, and though 80 percent sounds like a high number, the number of SNIA members concerned enough about long-term archiving to take a half hour to complete the survey was a much smaller percentage.
Perhaps more interesting were the vertical markets where the survey drew the most responses; according to SNIA they include education, manufacturing and chemical processing, power and energy companies and banking in addition to the obvious ones like libraries and government.
In response, SNIA’s 100-Year Archive Task Force (a name which conjures up an image of SNIA Ninjas crashing in through the window…) will be creating an archiving-infrastructure reference model and adding an archiving information field based on the Open Archival Information System standard into its own standards including SMI-S and XAM.
Speaking of which, XAM proof of concept demos will be taking place at fall SNW, based on version 0.6 of the standard. This doesn’t jibe with what SNIA told us last October about the timing for the standard in a story for Storage Magazine; back then, those demos were supposed to be happening in San Diego this past April. Now, SNIA says that at the time of that story’s publication, the XAM Initiative hadn’t been officially launched. This, according to a SNIA spokesperson, has now officially, formally occurred as of Spring SNW 2007, and since that announcement, at least, the timing on proof of concept demos hasn’t changed. Hence, “there has been no delay” in XAM’s progress. (Clears things right up, doesn’t it?)
At any rate, whenever it actually gets here, XAM, which stands for eXtensible Access Method, would define a single access method for archiving devices like EMC’s Centera, Hewlett-Packard’s (HP) Reference Information Storage System (RISS) and the HDS HCAP product. The SNIA’s also calling for “self-healing systems” available today to be modified for use in archiving and storage, automating the migration process between old and new data formats.
We don’t necessarily need the SNIA to tell us that archiving’s a big deal right now, and they’re certainly not alone in pointing out that, hey, all this technology’s great, but even a cave drawing kicks its butt when it comes to readability longer than a few months after its creation. However, there’s one aspect of long-term data preservation that isn’t addressed by XAM and other software-based standards: storage media itself.
SNIA reps say software’s a bigger concern now with new optical media that’s claiming a 30 to 50 year shelf life, but other experts feel that most storage media is further behind the long-term viability curve than that. From my point of view, I can’t help but wonder why more attention isn’t being paid to the development of new storage media in this industry, given the double-whammy of power and cooling and long-term preservation issues that have lately become hot topics. After all, even if every vendor adopted the XAM standard tomorrow, without better long-term media, there won’t be any data available for it to access in a couple of decades, anyway.
Greg Reyes, the former CEO of Brocade, was found guilty of securities fraud yesterday in the first criminal case involving the backdating of stock options.
For IT users, the result should offer some peace of mind that when you spend millions of dollars with a company, it is not going to get away with being so dysfunctional as to end up like Enron, leaving its customers and employees out on the street.
The conviction is also a signal to the rest of the industry that the buck stops with the CEO, no matter how busy he or she is. Reyes’ defense (if you believe it) was that he was too busy to know everything going on at the company and that others were responsible. Clearly the jury didn’t buy it.
Reyes’ lawyer also argued that Reyes did not benefit personally from backdating options, which is absurd. A CEO’s compensation is inextricably linked to the financial success of the company on Wall Street. Hence the SEC’s unanimous vote last year in favor of tighter regulations around executive compensation.
Reyes is, of course, appealing the verdict, but it seems unlikely that the courts will take much notice. His sentencing is scheduled for Nov. 21 and he could get as much as 20 years in jail. It’s more likely he’ll serve a fraction of this time, but will face hefty fines.
If it were up to me, CEOs convicted of this type of offence would be restricted from ever running a public company again. I personally had several meetings with Reyes in the course of covering the storage industry, and he commonly gave the impression of being beyond reproach. He once abruptly ended a meeting as he was “too busy” and didn’t like my questions. I don’t get the impression from news of his trial that any of that has changed.
As a journalist, it’s been interesting to see what went on in the HBA market today. Yesterday, Emulex was briefing press about its announcement of 8 Gbps Fibre Channel components scheduled for today. This morning when I checked the wires, however, there was a very similar announcement from QLogic, by whom I had not been briefed.
Turns out they beat Emulex to getting the press release on the wires by a couple of hours. Today Emulex officials are trying to figure out how they got “scooped” on the announcement; it’s a common occurrence among competitors in my line of work but relatively unusual for IT companies.
In the long run it’s not a big deal, and it’s probably not going to substantially affect either company, except probably to tweak Emulex a little and give some QLogic execs a chuckle today.
Bottom line is, the release of these components is at this juncture a fairly moot point for end users, since 8 Gbps FC HBAs are only really going to come into play when there are 8 Gbps FC SAN switches, which isn’t slated to happen until next year. Both of these components in turn also would need 8 Gbps FC SAN storage systems for 8 Gbps performance, and that’s also at least another six months away, and probably longer.
Emulex is releasing these products in test quantities to system vendors for testing. The HBAs are also available directly to end users who want to be extra, ultra-prepared for the advent of 8 Gbps FC, since they are backward-compatible with 4 Gbps and 2 Gbps FC systems. Emulex also pointed out that users’ technology refreshes don’t always jibe with vendors’ product cycles, so if the vagaries of budgeting dictate outfitting your servers with 8 Gbps HBAs now, you can go for it.
Linux is currently used in about 20% of the medium to large sized data centers, and according to some reports, it will be in some 33% of data centers before the end of the year. By 2011, it is expected that most data centers will have at least half of their environment running some flavor of the Linux OS. As this platform really begins to settle in, it is important to consider the ramifications that it will have on storage, data protection and disaster recovery.
When I look at how a supplier handles coverage of a platform, I compare it to the games checkers and chess. When a supplier has “checker coverage”, that means they have just enough support of the platform to be able to get a check mark. When I say they have “chess coverage”, that means they have deep coverage, including specific databases that are popular on the platform.
Looking at the foundation of data protection, backup software is a good place to start. Most of the major suppliers certainly have “checkers” type coverage of the Linux environment. Most have Red Hat and maybe Suse variants covered, but some still only support Linux as a client, meaning that the Linux servers cannot have locally attached tape. As your Linux environment grows, this can be a real problem. A handful of the backup software suppliers have also ported over their Oracle hot backup modules, and while Oracle on Linux is significant and growing, the MySQL install base seems to be growing faster. And, while in the past the size of the MySQL data set was not nearly as large as the Oracle data set, it seems to be catching up there as well. A little farther behind is PostgreSQL, but it still has a significant install base and it too seems to be growing. So it is important that your backup application supports more than just Oracle and can do more than just hot database backup; for example, it should be granular to the tablespace level to speed up backups and recoveries.
There are backup applications that support Linux completely, and there is no longer a need to sacrifice. This may mean supporting two backup applications in the enterprise: one for Windows and one for Linux. But, as I have said in past articles, while that is not ideal, it is not unacceptable, especially if it means you significantly improve your level of data protection on the second platform. You may find that your new product provides as good as or even better support than your original one.
When looking at core storage the situation is equally interesting. For block-based storage or SAN storage, basic support or “checker coverage” seems to be there across the board. Most of the SAN vendors support fibre attaching Linux servers to their SAN storage, and there is growing support for iSCSI connections. There is not much support beyond this basic connectivity though. There is limited support for boot from SAN.
Interestingly, when it comes to SAN-based storage the manufacturers have created modules for specific applications that allow their SAN arrays to better interact with them. For example, they might have a module for Exchange that will quiesce the Exchange environment, take a clean snapshot and then mount that snapshot to a backup server for off-host backup. Despite the increased growth of the Linux install base, and especially the growth of MySQL and PostgreSQL in that environment, we have not seen many specific tools to protect these increasingly critical applications. You can write scripts to accomplish the above, and in many cases now you have to. But it would be better to have this integrated into the storage solution, so you can avoid all the issues that surround homegrown scripts.
With Linux and NAS based storage, you have to be equally careful. Linux file systems follow Unix conventions, so working with a Windows Storage Server based NAS can often be problematic. In all fairness, a Linux based NAS often has problems with Windows clients. There are two options here. You could focus on the Tier 1 NAS providers that have the Unix and Windows file system differences mostly resolved. This has challenges in cost, but provides comfort and reliability. Another option is to use a virtualized network file management tool. With a network file management product you can have both a Windows NAS and a Linux NAS and have data directed to the appropriate NAS based on data type, allowing for seamless support of both file systems. Of course, a network file management product delivers far more than this. For example, it can enable a migration of data as it ages to a disk-based archive or it can help with migration to a new NAS platform altogether.
Disaster recovery is another point of consideration. If you are replicating at the SAN level, then the SAN storage controller itself can cover most of this. But if not all of your Linux data is on a SAN, then you may have issues with replication of disaster recovery data. With the available replication software applications, you have some very Linux focused applications but not many that can cover the enterprise. Replication is an area where you don’t want to have too many different tools to monitor and manage. Focus on finding a solid multi-platform tool that can replicate Linux, Unix and Windows data.
Linux is going to be increasingly important in enterprises of all sizes and it seems that the traditional market leaders in storage are going to either ignore the platform or give it just “checker” type coverage. The new players on the market are taking advantage of this and are moving quickly to fill the void. It is interesting to note that most of the manufacturers that have a strong Linux solution also have an equally strong Windows and Unix solution. So, in providing only the most basic support for Linux, the market leaders may end up ceding the entire enterprise.
For more information please email me at firstname.lastname@example.org or visit the Storage Switzerland Web site at: http://web.mac.com/georgeacrump.
Since my blog entry about CA posted yesterday, CA representatives and I have had a number of conversations about what I wrote and what the company has delivered in product functionality. In doing so, both of us have come to the realization that there were some misperceptions and missteps on both of our parts as to what I was asking about, and what products they actually delivered.
In terms of the context for the interview and the update I expected from CA, I was looking for what CA was doing to pull different data protection components together to manage them under one umbrella, whether its own components or those of competitors. Maybe I was unclear in articulating those expectations or they did not understand them – probably some of both.
I don’t for one second believe that integration is a trivial task. In fact, this may be one of the greatest challenges backup software vendors face this decade, and possibly the next, but that is also one of the reasons I am covering it. Archiving, virtual tape libraries (VTLs), compliance, continuous data protection (CDP), synchronous and asynchronous data replication and retention management are now all part of the data protection mix. Frankly, I’d be concerned if CA claimed it had fully integrated all of these components because analysts probably would have had a field day verifying, and likely debunking, that claim.
On the other hand, in conversations I have had with CA’s competitors, on and off the record, I sense that CA is starting to lag behind. There is nothing tangible I can point to, just a sense of the depth and quality of conversations I have had.
That is not to say CA is doing nothing, as yesterday’s blog post could have incorrectly led readers to conclude. In data protection, CA has focused on adding new features and integration to the XOsoft CDP and Message Manager email archiving products. To CA’s credit, they did bring up a good point, that companies still internally manage data protection and records management separately. So they first sought to bring out new features and functions in those products based on internal customer demand before tackling the overarching integration problem.
For example, since I last spoke to CA in February, a second integration service pack was released in March containing two features that I believe administrators will find particularly useful. Through the ARCserve interface, administrators can browse replicated jobs set up in WANSync and see the sources being selected for replication and the target replication servers. Then, when they need to restore jobs backed up from the XOsoft replica, the restore view in ARCserve provides a view of the production servers rather than the XOsoft WANSync Replica server.
The ultimate question remains, “Is CA doing enough and doing it fast enough?” Someone older and wiser than me once told me that it takes about 8 years for changes in storage practice and technology to work their way into the mainstream. Whether that holds true in the rapidly changing space of data protection remains to be seen.
This last week, I had a chance to catch up with CA on what integration has occurred in their Recovery Management product line since I last spoke to them in February. Based upon what they told me in the first interview and the little progress they had made, I spoke to them again to make sure I didn’t miss something.
“Scrambling” is the word that Frank Jablonski, CA’s Product Marketing Director, used to describe CA’s efforts to pull together and offer customers some level of integration between their XOSoft and ArcServe product lines. To that end, they have released two service packs to start to integrate these products.
The first service pack enabled ArcServe to use a script to create backups from CA’s CDP product — XOSoft WANSync. The script does the following:
- Periodically stops the replication on XOSoft WANSync.
- Takes a point in time copy of the data on the WANSync server.
- Resumes the replication.
- Backs up that point-in-time copy.
ArcServe then centrally manages that point-in-time backup, which companies can use for longer term retention. The second service pack provided that same functionality but added a GUI.
However, I know many good system administrators who could write that script in their sleep. That leaves the integration of XOSoft WANSync and ArcServe at little more than a rudimentary level. Though it demonstrates progress, CA needs to accelerate their efforts in light of the announcements that CommVault and Symantec have made in the last couple of months.
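For readers who want to picture it, the four steps listed above amount to something like the following Python sketch. It is not CA's script. The function and the four step names are placeholders invented here, and each would wrap whatever commands the replication and backup products actually expose. The "periodically" part is assumed to be left to a scheduler such as cron.

```python
def backup_from_replica(stop_replication, take_point_in_time_copy,
                        resume_replication, back_up):
    """Run the four steps in order, resuming replication even if the copy fails."""
    stop_replication()
    try:
        copy = take_point_in_time_copy()
    finally:
        # Replication must not stay paused just because the copy step failed.
        resume_replication()
    # The slow part runs after replication is already flowing again.
    back_up(copy)
    return copy


if __name__ == "__main__":
    # Stand-in steps that only record what happened; a real script would
    # call the replication and backup products here.
    log = []
    backup_from_replica(
        stop_replication=lambda: log.append("stop replication"),
        take_point_in_time_copy=lambda: log.append("take copy") or "copy-001",
        resume_replication=lambda: log.append("resume replication"),
        back_up=lambda copy: log.append(f"back up {copy}"),
    )
    print(log)
```

The only design decision worth noting is the `try`/`finally`: the one thing a homegrown script of this kind should never do is leave replication stopped after an error.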
To catch up, CA is planning major version upgrades of XOSoft in January 2008 and ArcServe in the spring of 2008. Jablonski promised users will see both product upgrades and more integration across CA’s different data protection products at that time. However, CA will likely not complete their integration efforts until about 2010 or 2011.
CA definitely has the potential and the software to offer users a robust data protection and management package. To CA’s advantage, backup software is generally not a product that users are apt to rip and replace. However, CA is currently trailing some other backup software vendors in their integration and delivery of new features. But, this will likely only begin to matter if I am still making blog entries similar to this one about their product a year from now.
Over the past month, I’ve been working on putting together podcast tips with some of our experts. Pierre Dorion, certified business continuity professional for Mainland Information Systems Inc., recently contributed a podcast called “Outsourcing backup: Get the right service level agreement”.
In this tip, Pierre discusses the questions that can help you ensure that a service level agreement (SLA) meets your requirements when outsourcing backup, such as:
- What are your data recovery needs?
- How fast can your data be restored?
- What are the contractual obligations of the SLA?
- Does your service provider have a solid disaster recovery plan in place?
Pierre also offers practical advice on making sure these questions get addressed. Check it out below.
Elsewhere on the Web, check out www.sla-zone.co.uk. It’s got a bunch of useful SLA information, broken down into topics such as services, performance, problem management, customer requirements, termination, and so on. Also, www.itil-itsm-world.com has a series of documents that are used to help build a framework for service management, including information on service level agreements and IT outsourcing.
It is being widely reported that HP is in “advanced talks” to buy Bull SA, a French IT integrator. The reports, originated by a French website, capital.fr, contain detailed information about the potential price of the deal (approximately $1 billion US) and have been reprinted by sources including CNNMoney.com and Reuters. HP declined comment on the rumors.
Bull, whose major shareholders include the French government, has a storage business unit, though it’s mostly a channel/storage integration play. The company also deals in other IT products, including servers and networking equipment, and has a customer footprint mostly in French government as well as a few overseas state and local government agencies, according to storage industry analysts.
Otherwise, analysts said, they’re mystified at the potential merger. “HP would have an interesting job on its hands getting a sleepy company to wake up,” said Arun Taneja, founder and analyst for the Taneja Group.
The company underwent a restructuring at the turn of the millennium, refocusing itself on channel sales and systems integration. However, despite attempts to penetrate US markets since, 80 percent of its revenues come from Europe, with a full 40 percent from France alone. Prior to its restructuring, Bull had gotten into the business when it purchased Honeywell, a mainframe and minicomputer manufacturer that was ultimately left behind by the advent of the PC. (It’s a story similar to Digital Equipment and Wang, which went the way of the dinosaur when they couldn’t compete with IBM, Sun et al).
“I can’t see what’s in [a potential acquisition] for HP other than the acquisition of a customer base for servers and storage,” said John Webster, principal IT advisor with Illuminata, Inc.
Following EMC Corp.’s storage announcements last week, which included the introduction of a new Symmetrix array, the industry has been buzzing with the claims and counterclaims of EMC and high-end disk array rival Hitachi Data Systems (HDS), as well as debates over the merits of each company’s products.
In the past week, two storage consultants in the UK have dug into the technical specs of Hitachi’s USP and the new Symmetrix DMX-4. Nigel Poulton over at Ruptured Monkey takes a close look at the pros and cons of Hitachi’s external virtualization vs. EMC’s internal tiered storage. Meanwhile, storage consultant Chris M. Evans discusses the “green” claims being made by both vendors in their recent array announcements.
Nigel concludes that there are pros and cons to both the HDS and EMC approaches, depending on a user’s particular environment, which leads him to ask a very pertinent question:
There is certainly a demand for both [approaches to tiered storage]…When compared to something like Thin Provisioning, which both vendors are working on, implementing the above features would be a comparative walk in the park.
So if it’s not that hard to implement, and by doing so you potentially hang on to your customers, why not pinch your nose and take the plunge?
Too much Kool-Aid might be the answer.
As for Evans, his conclusion is that “neither vendor can really claim their product to be ‘green’.” HDS’s USP, he concludes, still has a higher power cost per-drive than EMC’s Symmetrix. However, he doesn’t gloss over the weakness of using higher-capacity drives (to which every systems vendor has the same access) to make a “green” claim, saying, “customers choosing to put some SATA drives into an array…[will] see only modest incremental power savings.”
Evans is not the first to bring up the need for big vendors to step up their efforts further around power consumption, particularly when mushrooming data retention and compliance archiving requirements mean data management strategies for reducing storage growth are losing their effectiveness. Users at this year’s Storage Networking World conference in San Diego also called on storage vendors to invest in better silicon rather than pushing the issue back onto users and, in essence, blaming them for their storage management practices. Elsewhere, server and PC makers have already begun moving to more efficient power designs within systems, and users, like Evans, are looking for a similar commitment from storage manufacturers to built-in reductions in power consumption, rather than lip service about the latest SATA drives.