With the falling per-terabyte cost of disk and the advance of data deduplication technologies, there has been growing speculation about what the future holds for linear tape.
Historically, and perhaps stereotypically, tape has been seen as the cheaper alternative to disk: it consumes no power when not in use and is easily moved to an offsite location. Disk, on the other hand, has often been considered more expensive, although it requires less administration, is unaffected by slow or small data streams and offers much improved restore times, particularly when restoring individual files.
Technological advancement, however, means that we should challenge these long-held beliefs. Perhaps the biggest change to backup strategies has come with the widespread adoption of data deduplication. Few backup vendors recommend (or support) deduplication to tape, so deduplication remains predominantly a disk technology.
Tape has long been held as the storage medium with the densest footprint, but data deduplication is changing that. For example, take a medium-sized tape library with, say, 600 LTO4 tapes each holding 1.2 TB. You’d expect this library to consume the majority of a rack. But, if this data is deduplicated at a ratio of 20:1 the library could be replaced by just a couple of disk shelves. Since space has become as precious to data centre managers as power and cooling capacity the potential savings of moving from tape to high capacity deduplicated disk cannot be ignored.
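The footprint comparison above can be worked through with some simple arithmetic. This is a rough sketch only: the shelf capacity and the 20:1 deduplication ratio are illustrative assumptions, and real-world ratios vary widely with data type and retention.

```python
# Rough footprint comparison: tape library vs. deduplicated disk.
# All figures here are illustrative assumptions, not vendor specifications.

TAPE_COUNT = 600          # LTO4 cartridges in the library
TB_PER_TAPE = 1.2         # capacity assumed per cartridge
DEDUP_RATIO = 20          # assumed deduplication ratio (20:1)
TB_PER_DISK_SHELF = 24    # assumed usable TB per disk shelf

logical_tb = TAPE_COUNT * TB_PER_TAPE            # 720 TB of backup data
physical_tb = logical_tb / DEDUP_RATIO           # 36 TB after deduplication
shelves = -(-physical_tb // TB_PER_DISK_SHELF)   # ceiling division

print(f"Logical capacity: {logical_tb:.0f} TB")
print(f"Deduplicated footprint: {physical_tb:.0f} TB (~{shelves:.0f} shelves)")
```

On these assumptions, the entire 720 TB library reduces to roughly 36 TB of physical disk, which is indeed only a couple of shelves.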
One key advantage of tape is the ease with which data can be moved to a remote location, but again data deduplication is allowing organisations to work smarter. Historically, in all but the largest companies, the cost of the network infrastructure needed to replicate large volumes of backup data between physically remote sites has been prohibitively high.
Now, because optimised deduplication only replicates unique data chunks between storage devices, smaller organisations that could not previously afford it can copy backup data between sites over low bandwidth networks or networks with high latencies.
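The principle behind optimised replication can be sketched in a few lines. This is a simplified illustration, assuming fixed-size chunking and SHA-256 fingerprints; real appliances use variable-size chunking and exchange fingerprints rather than a shared dictionary, but the bandwidth saving comes from the same idea: only chunks the remote site has never seen cross the WAN.

```python
# Minimal sketch of deduplicated replication: only chunks the remote
# site does not already hold are transferred. Chunk size and the
# fingerprinting scheme are simplifying assumptions.
import hashlib

CHUNK_SIZE = 4096  # fixed-size chunking, for illustration only

def chunk_fingerprints(data: bytes):
    """Split data into chunks and fingerprint each with SHA-256."""
    for i in range(0, len(data), CHUNK_SIZE):
        chunk = data[i:i + CHUNK_SIZE]
        yield hashlib.sha256(chunk).hexdigest(), chunk

def replicate(data: bytes, remote_store: dict) -> int:
    """Send only unseen chunks; return the bytes actually transferred."""
    sent = 0
    for digest, chunk in chunk_fingerprints(data):
        if digest not in remote_store:   # does the remote hold this chunk?
            remote_store[digest] = chunk
            sent += len(chunk)
    return sent

remote = {}
backup = b"A" * 8192 + b"B" * 4096     # three chunks, only two unique
first = replicate(backup, remote)      # transfers the two unique chunks
second = replicate(backup, remote)     # nothing new to send
print(first, second)                   # 8192 0
```

A second run of the same backup transfers nothing at all, which is why daily backups with small change rates can be replicated over low-bandwidth links.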
Indeed, as WAN performance increases and data deduplication technologies mature, smaller organisations may look to send backup data to the cloud, although the fear of long recovery times will still deter many IT managers.
Long-term retention of data is a cornerstone of many organisations' compliance policies, and while many studies show disk can be cheaper than tape for short- to medium-term retention, backup data that requires long-term retention (a year upwards) is still far better placed on tape. Over these longer durations tape consumes less power, incurs no ongoing maintenance cost, and the data stored on it is less likely to become corrupt.
Disk is becoming increasingly attractive as a replacement for tape in the backup arena, and some of the long-held advantages that tape has had over disk are being challenged. However, tape still has a place in the data centre. Backups with long retention periods and archived data are still best placed on tape, and this is unlikely to change. Indeed, the roadmaps provided by tape and tape library vendors suggest that the future of tape in large organisations remains strong – technology investments of this scale would not be happening if tape were doomed.
By Ian Lock, GlassHouse Technologies (UK), storage & backup service director
Recently I have been asked by several clients about the security of shared storage and backup environments, and in particular whether any element of their storage infrastructure should be shared between internal production and external DMZ servers.
For many years the consensus among most of my clients has been a definite ‘no’ to this question; the only link between external and internal networks should be a firewall and nothing else. Such rules are normally written in stone and policed by the security team, with draconian penalties for anyone who dares to disobey.
I have, up to now, agreed wholeheartedly with these rules; they’re there for a very good reason, right? They limit the risk of nasty things or people getting to your production data from the outside.
However, during the course of recent conversations I began to wonder whether there might be an argument for some carefully managed sharing of storage resources.
The question seems to have started to crop up a lot more frequently as storage arrays become more and more ‘unified’ and servers become more and more ‘virtualised’.
Companies have realised the benefits of consolidating and virtualising previously separate physical systems to drive down costs, so it goes against the grain to keep discrete storage arrays for production and DMZ.
Most centralised backup systems are, after all, allowed to protect servers in the DMZ, as long as the backup data passes through the firewall. And many clients allow virtual machines residing on the same physical hosts to be provisioned for both production and DMZ use.
As long as all storage management interfaces and software tools are kept carefully locked down inside a secure internal VLAN, what are the actual risks of presenting a LUN to DMZ and production hosts from the same array?
Perhaps the answer is to allow sharing of storage resources, but only with better end-to-end security, including tighter intrusion detection systems and maybe encryption of data at rest embedded into storage arrays. That way you get the best of both worlds.
By Ian Lock, GlassHouse Technologies (UK), storage & backup service director
Two of my clients are having differing experiences with deduplicating backup appliances they have purchased in the last year. Neither is having issues with the technology, but rather with their storage management processes.
Client A purchased a pair of appliances just under a year ago, having gone through a well-defined sizing, design and proof-of-concept process to ensure not only that the solution would function as planned, but also that they had a good handle on likely deduplication ratios in their environment.
Client A’s environment is based on Symantec NetBackup and backs up a growing estate of virtualised Windows and Linux servers. The new appliance-based backup environment is performing excellently, with great stability, a tiny failure percentage and automatic replication of backups to the DR site. After a year in operation, Client A’s appliances are running at 50% capacity and, as they have visibility of new projects requiring backup capacity over the next two to three months, they are already planning the purchase of fresh capacity to ensure they don’t run out and crash the backup environment.
Client B purchased a similar, albeit smaller, appliance a year ago, based on sizing figures originally produced a year earlier, and put it straight into a production Backup Exec environment. The new backup system has also been a great success, with vastly improved success rates and the amount of time spent fixing backups drastically reduced.
However, Client B has migrated backup clients from legacy systems and added new clients with minimal planning or control. Annual data growth was not factored into their initial sizing calculations. As a result they now stand at 99% capacity utilisation with month-end backups approaching. Their only options are to make an emergency purchase, delete valid backups or stop backups altogether – not a good place to be.
Deleting valid backups may not even help immediately: the backup data stored in the appliance is deduplicated, and so by definition is chopped up into small chunks, many of which are likely to be shared between several different backup images – a point NetWorker blogger Preston de Guise makes very well.
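The effect of chunk sharing on space reclamation can be illustrated with a toy example. The image names and chunk identifiers below are entirely hypothetical; the point is that only chunks referenced by no surviving image are actually freed when a backup is deleted.

```python
# Why deleting one backup image may free very little space: deduplicated
# chunks are shared between images, so only chunks that no remaining
# image references can be reclaimed. Purely illustrative data.

images = {
    "mon_full": {"c1", "c2", "c3", "c4"},
    "tue_incr": {"c2", "c3", "c5"},
    "wed_incr": {"c3", "c4", "c5"},
}

def reclaimable(images: dict, victim: str) -> set:
    """Chunks freed by deleting `victim`: those no other image references."""
    others = set().union(*(c for name, c in images.items() if name != victim))
    return images[victim] - others

print(reclaimable(images, "mon_full"))  # only {'c1'} is actually freed
```

Deleting the Monday full backup here reclaims just one chunk out of four, because the rest are still referenced by the incrementals.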
You’d rather be in Client A’s shoes wouldn’t you? So make sure you keep a very careful eye on the capacity usage of your backup de-duplication target, and make your plans early.
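Keeping that careful eye can be as simple as a headroom calculation. The sketch below assumes linear monthly growth and an 80% planning threshold, both of which are placeholder figures you would replace with your own trend data and purchasing lead times.

```python
# Simple capacity forecast for a dedup appliance: given current usage
# and an assumed linear monthly growth rate, estimate the months of
# headroom before a planning threshold is crossed. Figures illustrative.

def months_until(threshold_pct: float, used_pct: float,
                 growth_pct_per_month: float) -> int:
    """Whole months until usage reaches the threshold (0 if already past)."""
    if used_pct >= threshold_pct:
        return 0
    return int((threshold_pct - used_pct) / growth_pct_per_month)

# Client A: 50% used, assumed 4% growth per month, plan purchases at 80%
print(months_until(80, 50, 4))   # 7 months to run the purchase cycle
# Client B: 99% used -- no runway left at all
print(months_until(80, 99, 4))   # 0
```

The exact numbers matter less than running the calculation regularly: a purchase cycle started at 50% utilisation is routine; one started at 99% is an emergency.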
By David Boyd, GlassHouse Technologies (UK), principal consultant
Disasters occur. Water pipes burst, roofs leak, electricity supplies fail, network paths get dug up, and much worse. Fortunately, most IT managers will never get to see their disaster recovery plans put into practice for real. But some will and all too often it is only at that point that their shortcomings are realised.
Once the dust has settled and normal service has resumed IT managers will be under pressure to fix what went wrong with the original plan. Knee-jerk reactions and over-engineered solutions are too often implemented when a measured approach may be more efficient and much less costly.
I have a customer who is in exactly this situation. A water leak resulted in the failure of the entire storage environment. Following the disaster our customer rapidly acquired a new storage array and set about configuring it and allocating storage back to the clients.
Fortunately, the backup environment was unaffected and all services were resumed within a couple of weeks. It could have been much worse. If the disaster had happened at month end or if their hardware suppliers had not been able to mobilise so quickly the impact of the disaster could have been catastrophic.
Valuable lessons can be learned from a disaster. As a result of this exercise my customer has a detailed knowledge of the relative importance of each application. They know exactly what components link together to provide service; in my experience something that a lot of organisations don’t have a handle on. They also know exactly how long it takes to recover a service – including build, restore and configuration times. Despite being armed with that information, senior management have decreed that going forward a recovery time objective (RTO) of 24 hours be implemented for all services, including test and development.
There is no doubt that the outage caused the organisation severe pain, but reacting by stipulating a single recovery tier across the whole estate is going to cost a lot of money. Replacement hardware can rarely be sourced within such a timeframe, so an exact replica of the environment must be purchased, to sit idle until the event of a disaster.
A better way would be to take the hard lessons learned and align applications to defined disaster recovery service offerings. If you have an understanding of your disaster recovery requirements then you can start building a service catalogue which reflects that. Create distinct service tiers, based upon the RTO and RPO (recovery point objective) and align a technology configuration to those tiers.
For example, the “platinum” tier might include application level clustering and synchronous data replication whereas your “bronze” tier would involve data recovery from tape after the procurement of replacement hardware. In this way, applications that have little impact on the organisation’s daily activities can be assigned to a lower tier which gives sufficient time to procure replacement hardware, negating the need to have everything duplicated.
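A service catalogue of this kind can be captured as a simple lookup structure. The tier names, hours and technology descriptions below are illustrative assumptions, not recommendations; the useful part is the rule that each application gets the cheapest tier that still meets its recovery requirement.

```python
# Minimal sketch of a tiered DR service catalogue: each tier maps an
# RTO/RPO target to an assumed technology configuration. All names and
# figures are hypothetical examples.

catalogue = {
    "platinum": {"rto_hours": 1,   "rpo_hours": 0,
                 "technology": "application clustering + synchronous replication"},
    "gold":     {"rto_hours": 24,  "rpo_hours": 4,
                 "technology": "asynchronous replication to the DR site"},
    "bronze":   {"rto_hours": 120, "rpo_hours": 24,
                 "technology": "tape restore after hardware procurement"},
}

def tier_for(rto_required_hours: float) -> str:
    """Cheapest tier whose RTO still meets the application's requirement."""
    eligible = [(name, spec) for name, spec in catalogue.items()
                if spec["rto_hours"] <= rto_required_hours]
    # the eligible tier with the most relaxed (largest) RTO is the cheapest
    return max(eligible, key=lambda item: item[1]["rto_hours"])[0]

print(tier_for(48))    # a 48-hour RTO requirement lands in "gold"
print(tier_for(200))   # a 200-hour requirement only needs "bronze"
```

Assigning tiers this way stops low-impact test and development systems from being silently promoted to the most expensive recovery configuration.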
Once the service catalogue is in place, solutions that meet those requirements can be investigated and purchased. Of course, a full schedule of DR testing is a vital part of any solution.
Though it takes longer, this approach will help prevent the knee-jerk reactions that so often follow disasters.