“Big data” is a topic that’s getting a lot of ink these days. Even for companies not in the typical big data verticals (media and entertainment, oil and gas, genomics, scientific research, etc.), the accumulation of reference-based data sets is becoming a problem. Tape’s economics and physical density are well established, and most big data use cases involve tape in one way or another. But at the Tape Summit in San Francisco last week, there were two other points made about tape that are worth repeating.
Tape is significantly more reliable than enterprise disk, based on uncorrectable bit errors. According to information presented by Spectra Logic, a collection of 100 disk drives will experience an uncorrectable (hard) error once every 315 hours of use. That’s roughly once every 13 days. By comparison, the same number of LTO tape drives can go 22,000 hours (more than 900 days) without a hard error. The reason for the relatively high number of hard errors on disk is the bit density that disk drives have achieved in efforts to expand capacity over the years (more on that later).
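To make the arithmetic behind those figures explicit, here is a minimal sketch that converts the two hour counts into days and compares them. It assumes continuous, around-the-clock use, which is how the "13 days" and "900 days" figures appear to be derived; the variable and function names are illustrative only.

```python
HOURS_PER_DAY = 24  # assumption: drives are in use around the clock

# Hours between uncorrectable errors for a population of 100 drives,
# taken from the figures quoted above.
disk_hours_between_errors = 315
tape_hours_between_errors = 22_000


def hours_to_days(hours):
    """Convert hours of continuous use into days."""
    return hours / HOURS_PER_DAY


disk_days = hours_to_days(disk_hours_between_errors)
tape_days = hours_to_days(tape_hours_between_errors)

# How many times longer the tape population runs between hard errors.
ratio = tape_hours_between_errors / disk_hours_between_errors

print(f"100 disk drives: one hard error every {disk_days:.1f} days")
print(f"100 tape drives: one hard error every {tape_days:.1f} days")
print(f"Tape goes about {ratio:.0f} times longer between hard errors")
```

On these numbers, the tape population runs roughly 70 times longer between hard errors than the disk population.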
Now this doesn’t mean that every set of 100 disk drives will have a drive rebuild every two weeks, but statistically, it could. The problem with these larger drives is that rebuilds can take days or longer, leaving the array vulnerable to another drive failure in the meantime. And as is the case with archives, data just continues to pile up. Keeping 100 TB of data on spinning disk was always an expensive proposition, but now it looks riskier as well.
The primary method for increasing capacity in data recording is to increase bit density, or store more data in the same amount of physical recording space. According to an IBM report, disk, tape and flash have produced 40% annual increases in this “areal bit density” each of the past six years. But as bits are packed closer together, the probability of read errors increases. In digital recording, what matters are uncorrectable read errors, and error correction code (ECC) techniques keep advancing as well (increases in CPU power are helping this). But at some point, bit density becomes a gating factor to capacity increases.
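As a rough illustration of what that growth rate adds up to, the sketch below compounds a 40% annual increase over six years. It assumes the rate held steady every year, which is a simplification of the report's claim, and the variable names are illustrative only.

```python
annual_growth = 0.40  # 40% increase in areal bit density per year
years = 6             # the six-year span cited above

# Compound the yearly increase: density multiplies by 1.4 each year.
cumulative_factor = (1 + annual_growth) ** years

print(f"Density after {years} years: about {cumulative_factor:.1f}x the starting point")
```

Under that assumption, areal bit density ends up about 7.5 times where it started, which gives a sense of how quickly bits are being packed together.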
Tape’s bit area, the amount of space each bit consumes on the recording medium, is actually 200 to 300 times larger than that of disk or flash. This would be a problem except that tape has an enormous advantage over disk in recording space. The recording surface area available in a tape cartridge is thousands of times larger than the combined area of the platters in a disk drive.
The point here is that disk drives are space-constrained, and increases in bit density are becoming harder to achieve. The IBM report stated that only tape is in a position to keep delivering the increases in areal bit density mentioned above. Essentially, disk technology has a lot less headroom for development, and each successive increase in capacity will be harder for disk drives to accomplish. This puts tape in a much better position to fulfill its technology roadmap, which is currently laid out through LTO-8 at a capacity of 32 TB per cartridge (a compressed figure). It also means that the bit error differential discussed above will only widen in tape’s favor.
I don’t think a VAR’s job is to pick a technology winner and ignore the other solutions available. A better strategy is to know the options, and the pluses and minuses of each, and present them accurately to the customer. Does this mean that VARs should tell their customers to ditch their disk arrays and convert to tape? Of course not. But they should understand where these two technologies are today and how well they’re fulfilling the use cases they’re being chosen for. Tape is the choice from an economics perspective, but as archives grow in size, reliability may also be a factor in the decision.
Follow me on Twitter: EricSSwiss