Traditional storage solutions tend to waste space by their very design: RAID groups stacked in a box, then carved into provisioned LUNs. I have seen recent reports that somewhere between 60 and 70% of all disk space purchased is wasted. Wasted on an old methodology that carves out a “canister” into which you can then put data. When you provision a 10GB LUN and put 1GB of data into it, you have 9GB of wasted space. Then look at the mere fact that of all the “real” data, only 20% is generally active. Yet most clients purchase expensive 15K disks to ensure performance for what is only 5% of their entire storage environment. No wonder the costs are out of hand.
Real Thin Provisioning (only provided by 3PAR and Compellent) solves the first element of this problem. Automated tiered storage (Compellent only) solves the second. This type of environment allows you to create a pool of 15K drives to handle the IOPS you need, while everything else is automatically moved to SATA when it becomes inactive. This type of architecture reduces the overall problem (footprint, power consumption, runaway space, outrageous expenditures).
Now, you apply deduplication where needed, and you find that your important data can remain on-line indefinitely. (This does not preclude intelligent management and discussion with end users as to what data should be kept and which should be discarded.) When we utilize intelligent tools to fix the underlying problems, then when we talk about methods to restore data, we begin with a clean slate.
Being the provider of backup software that has been device agnostic for over 23 years – we’ll even back up to your monitor – I’m a bit concerned by both sides of this story. As Jon Toigo mentions above, the marketing that is occurring currently is designed for one purpose – to sell either more disks or more tapes. The problem is that both industries disguise the facts through claims of improved backup performance for the user in each case; I say that they are both right AND wrong.
First, the myth of deduplication – I believe that even the venerable Mr. Preston has stated that if you only have one copy of data you don’t have a backup. However, deduplication is just that – the creation of ONE copy of given data. And, not only is this one copy of given data for ONE system, it’s actually one copy of given data for potentially THOUSANDS of systems. While I do not try to lessen the impact of thousands of copies of the same file over the life of a system’s backup, I do not believe that one copy of the corporate financial reports from 2005 in a single, locally stored disc array is very smart. A smart compromise must be reached. Additionally, if I’m using incremental backups with tape or disc, deduplication adds nothing as the only files backed up in my incremental backups are the files that have changed – instant deduplication and it cost me nothing extra.
As to performance, in 95% of environments the network bandwidth is going to be the gating factor (unless we’re just backing up a locally attached volume). Otherwise, LTO-3 and LTO-4 tapes writing at 120MB/sec or better are quite capable of keeping up with the network stream. The only time disc is a real winner is in environments where the backup software and network infrastructure allow multiple, simultaneous client backups – if the software is writing to disc (vs. virtual tape), then the only limit to the number of concurrent streams coming into the backup server is the network performance; tape and VTL implementations are limited by the number of physical or virtual tape drives available. Also, with software that understands QFA and tape management, even restoring the last file on an LTO-4 tape cartridge will only require 4-5 minutes.
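To put rough numbers on the bottleneck argument, here is a minimal sketch. The 120MB/sec tape rate comes from the comment above; the 1 Gb/s link (about 125MB/sec at best), the 400MB/sec disc target, and the 1 TB data set are assumptions picked purely for illustration, and the function name is hypothetical.

```python
def backup_hours(data_gb, network_mb_s, target_mb_s):
    """Hours to move data_gb when the slower of network and target sets the pace."""
    rate_mb_s = min(network_mb_s, target_mb_s)  # the gating factor
    return data_gb * 1000 / rate_mb_s / 3600

NETWORK_MB_S = 125.0  # assumption: 1 Gb/s link, theoretical maximum
TAPE_MB_S = 120.0     # LTO-4 rate quoted in the comment
DISC_MB_S = 400.0     # assumption: hypothetical disc target, much faster than the link

tape_hours = backup_hours(1000, NETWORK_MB_S, TAPE_MB_S)
disc_hours = backup_hours(1000, NETWORK_MB_S, DISC_MB_S)

print(f"single stream to tape: {tape_hours:.2f} h")
print(f"single stream to disc: {disc_hours:.2f} h")
```

Under these assumptions the faster disc target barely changes the backup time for a single stream, because the link is the limit. The sketch does not model multiple simultaneous clients, which is where the comment says disc pulls ahead.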
Finally, for long term storage, even high-end server discs are not designed to be used for a period of time and then stored in an unspun state for 15 years. An LTO tape, however, will happily sit in a vault for 30+ years and still give up its data when asked. Plus, you can drop an LTO tape off the back of a FEDEX truck and all the bits will still be there. Try dropping a disc drive and see how long it lasts.
Rodney King said it best – “Why can’t we all just get along?” A combination of disc (including deduplication) for short term, near-line storage, and tape for long term secure storage and archival offers the best solution for any organization – even a home user. Plus, as mentioned above, if your backup software supports QFA on tape, even the restore performance gains touted by the disc vendors are just so much smoke.
There are a lot of flawed numbers in this industry, both from the disk community and the tape community. What is needed is for both groups to get beyond their self-serving sales and marketing antics and begin working on addressing, in a cooperative way, the problems of backup. A group called the Advanced Backup Solutions Initiative was started to do just this about five years ago, and it received a lot of consumer and vendor buy-in. But it was squashed by the Storage Networking Industry Association, which preferred that the money being spent on such an effort be spent within SNIA instead.
I couldn’t agree more with most of Curtis’ points. There are many deeply flawed assumptions in the report. Nor is it exactly the first time that Clipper has released such flawed material. (I blogged about another one at: http://thebackupblog.typepad.com/thebackupblog/2008/05/7.html). My analysis is perhaps less thorough than Curtis’, but I didn’t really need to be thorough to see huge gaps in the analysts’ logic.
Here is another way of looking at this, just to prove that assumptions are everything. Assume you do a full backup every day for 180 days. Assume 10 TB of disk, and a 2% change rate. Let’s assume the data compresses 2:1. With tape, I would need 900 TB of tape. With disk that is deduplicated, I would need 23 TB of disk. That amounts to slightly more than 2 trays of disk. With LTO4, a small SL8500, or two L700s, or a mid-sized 3584 would be required to hold all that tape. You would also need 6 or so LTO4 drives to back it up every night. So, without going through the numbers, I would say there is (intuitively) a pretty big disconnect when an analyst claims that 2 trays of disk and a server are 23 times more expensive than 1000 tapes, 6 drives, and a library that would be 4 racks long.
But here is the assumption we made: full backups every day of everything. You would never do this with tape (it costs too much!). But with deduplicated disk, you require no more capacity to do this than a traditional backup rotation. So you have 180 recovery points rather than 15 (5 incrementals plus 4 weeklies plus 6 monthlies). Not only is the cost analysis flawed, but they deliberately ignore the strengths of disk.
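The capacity figures in the example above can be reproduced with a short calculation. This is only a sketch: it assumes the deduplicated store holds one full copy plus each later day's changed data, that every day's 2% change is unique, and that the same 2:1 compression applies to both media. The function names are hypothetical.

```python
def tape_capacity_tb(primary_tb, days, compression):
    """Every daily full backup is written to tape in its entirety."""
    return primary_tb * days / compression

def dedup_capacity_tb(primary_tb, days, change_rate, compression):
    """Assumed model: one full copy, then only changed data for each later day."""
    unique_tb = primary_tb + primary_tb * change_rate * (days - 1)
    return unique_tb / compression

tape_tb = tape_capacity_tb(10, 180, 2.0)
disk_tb = dedup_capacity_tb(10, 180, 0.02, 2.0)

daily_full_points = 180         # one recovery point per daily full
traditional_points = 5 + 4 + 6  # incrementals + weeklies + monthlies

print(f"tape: {tape_tb:.0f} TB, deduplicated disk: {disk_tb:.1f} TB")
print(f"recovery points: {daily_full_points} vs {traditional_points}")
```

With these inputs the sketch gives 900 TB of tape against roughly 23 TB of deduplicated disk, matching the figures quoted in the comment. A different change rate or compression ratio shifts the result considerably, which is the point about assumptions being everything.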