April 19, 2011 10:35 AM
Posted by: Ian Lock
backup data,
dmz,
firewall,
ian lock,
shared storage,
storage security
By Ian Lock, GlassHouse Technologies (UK), storage & backup service director
Recently I have been asked by several clients about the security of shared storage and backup environments, and in particular whether any element of their storage infrastructure should be shared between internal production and external DMZ servers.
The general consensus for many years for most of my clients has been a definite ‘no’ to this question; the only link between external and internal networks should be a firewall and nothing else. Such rules are normally written in stone and policed by the security team with draconian penalties for anyone who dares to disobey.
I have up to now agreed wholeheartedly with these rules; they’re there for a very good reason, right? They limit the risk of nasty things or people getting to your production data from the outside.
However, during recent conversations I began to wonder whether there is an argument for some carefully managed sharing of storage resources.
The question seems to have started to crop up a lot more frequently as storage arrays become more and more ‘unified’ and servers become more and more ‘virtualised’.
Companies have realised the benefits of consolidating and virtualising previously separate physical systems to drive down costs, so it goes against the grain to keep discrete storage arrays for production and DMZ.
Most centralised backup systems are, after all, allowed to protect servers in the DMZ, as long as the backup data passes through the firewall. And many clients allow virtual machines residing on the same physical hosts to be provisioned for both production and DMZ use.
As long as all storage management interfaces and software tools are kept carefully locked down inside a secure internal VLAN, what are the actual risks of presenting a LUN to DMZ and production hosts from the same array?
Perhaps the answer is to allow sharing of storage resources, but only with better end-to-end security, including tighter intrusion detection systems and maybe encryption of data at rest embedded into storage arrays. That way you get the best of both worlds.
April 5, 2011 9:34 AM
Posted by: Ian Lock
backup exec,
data deduplication,
ian lock,
storage management,
storage utilisation,
symantec netbackup
By Ian Lock, GlassHouse Technologies (UK), storage & backup service director
Two of my clients are having differing experiences with deduplicating backup appliances they have purchased in the last year. Neither is having issues with the technology, but rather with their storage management processes.
Client A purchased a pair of appliances just under a year ago, having gone through a well-defined sizing, design and proof-of-concept process to ensure not only that the solution would function as planned, but also that they had a good handle on likely deduplication ratios in their environment.
Client A’s environment is based on Symantec NetBackup and backs up a growing estate of virtualised Windows and Linux servers. The new appliance-based backup environment is performing excellently, with great stability, a tiny failure percentage and automatic replication of backups to the DR site. After a year in operation, Client A’s appliances are running at 50% capacity and, as they have visibility of new projects requiring backup capacity over the next two to three months, they are already planning the purchase of fresh capacity to ensure they don’t run out and crash the backup environment.
Client B purchased a similar, albeit smaller, appliance a year ago, based on sizing figures originally produced a year earlier, and put it straight into a production Backup Exec environment. The new backup system has also been a great success, with vastly improved success rates and drastically less time spent fixing backups.
However, Client B has migrated backup clients from legacy systems and added new clients with minimal planning or control. Annual data growth was not factored into their initial sizing calculations. As a result they now stand at 99% capacity utilisation with month-end backups approaching. Their only options are to make an emergency purchase, delete valid backups or stop backups altogether – not a good place to be.
Deleting valid backups may not even immediately help – the backup data stored in the appliance is deduplicated and so by definition is chopped up into small chunks, many of which will likely be shared between several different backup images. NetWorker blogger Preston de Guise makes the point very well:
http://nsrd.info/blog/2011/02/09/deduplication-and-space-management/
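The effect of shared chunks on reclaimable space can be illustrated with a minimal sketch. It assumes a simplified model in which each backup image is just a set of chunk IDs and a chunk is freed only when no surviving image references it. The image names, chunk IDs and sizes are hypothetical, and real appliances track references and reclaim space in their own ways.

```python
from collections import Counter


def reclaimable_bytes(images, chunk_sizes, to_delete):
    """Return the bytes freed by deleting the named backup images.

    images:      dict mapping image name -> set of chunk IDs it references
    chunk_sizes: dict mapping chunk ID -> stored size in bytes
    to_delete:   iterable of image names to expire
    """
    to_delete = set(to_delete)

    # Count how many surviving images still reference each chunk.
    surviving_refs = Counter()
    for name, chunks in images.items():
        if name not in to_delete:
            surviving_refs.update(chunks)

    # A chunk is freed only if a deleted image held it and no survivor does.
    deleted_chunks = set().union(*(images[name] for name in to_delete))
    return sum(chunk_sizes[c] for c in deleted_chunks if surviving_refs[c] == 0)


# Hypothetical data: three backup images sharing most of their chunks.
images = {
    "mon_full": {"a", "b", "c"},
    "tue_full": {"a", "b", "d"},
    "wed_full": {"a", "d", "e"},
}
chunk_sizes = {chunk_id: 100 for chunk_id in "abcde"}

# mon_full references 300 bytes of chunks, but only chunk "c" is unique to it.
print(reclaimable_bytes(images, chunk_sizes, ["mon_full"]))
```

In this toy model, expiring an image that references 300 bytes returns only 100 bytes, which is why deleting backups from a nearly full deduplication target can free far less space than expected.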
You’d rather be in Client A’s shoes, wouldn’t you? So keep a very careful eye on the capacity usage of your backup deduplication target, and make your plans early.
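As a rough illustration of planning early, the sketch below projects how many months remain before a deduplication target reaches a chosen utilisation threshold. It assumes stored data grows at a constant compound monthly rate, which is a simplification. The growth figure of 5% is hypothetical and not taken from either client, and the function name and parameters are my own.

```python
import math


def months_until_full(used_pct, monthly_growth_pct, threshold_pct=100.0):
    """Whole months until utilisation first reaches threshold_pct.

    Assumes stored data grows by monthly_growth_pct percent per month,
    compounding. Returns 0 if already at or above the threshold and
    math.inf if there is no growth.
    """
    if used_pct <= 0:
        raise ValueError("used_pct must be positive")
    if used_pct >= threshold_pct:
        return 0
    if monthly_growth_pct <= 0:
        return math.inf
    growth = 1 + monthly_growth_pct / 100
    return math.ceil(math.log(threshold_pct / used_pct) / math.log(growth))


# Hypothetical 5% monthly growth applied to the two utilisation levels above.
print(months_until_full(50, 5))                    # appliance at 50% capacity
print(months_until_full(50, 5, threshold_pct=80))  # time until an 80% alert level
print(months_until_full(99, 5))                    # appliance at 99% capacity
```

Comparing the result with the lead time for procurement gives a simple trigger for when to start the purchase, rather than waiting for the appliance to fill.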
April 5, 2011 9:26 AM
Posted by: David Boyd
backup,
David Boyd,
disaster recovery,
recovery point objective,
recovery time objective,
storage array
By David Boyd, GlassHouse Technologies (UK), principal consultant
Disasters occur. Water pipes burst, roofs leak, electricity supplies fail, network paths get dug up, and much worse. Fortunately, most IT managers will never get to see their disaster recovery plans put into practice for real. But some will, and all too often it is only at that point that the plans’ shortcomings are realised.
Once the dust has settled and normal service has resumed, IT managers will be under pressure to fix what went wrong with the original plan. Knee-jerk reactions and over-engineered solutions are too often implemented when a measured approach may be more efficient and much less costly.
I have a customer who is in exactly this situation. A water leak resulted in the failure of the entire storage environment. Following the disaster our customer rapidly acquired a new storage array and set about configuring it and allocating storage back to the clients.
Fortunately, the backup environment was unaffected and all services were resumed within a couple of weeks. It could have been much worse. If the disaster had happened at month end, or if their hardware suppliers had not been able to mobilise so quickly, the impact could have been catastrophic.
Valuable lessons can be learned from a disaster. As a result of this exercise my customer has a detailed knowledge of the relative importance of each application. They know exactly which components link together to provide a service, something that, in my experience, a lot of organisations don’t have a handle on. They also know exactly how long it takes to recover a service, including build, restore and configuration times. Despite being armed with that information, senior management have decreed that going forward a recovery time objective (RTO) of 24 hours be implemented for all services, including test and development.
There is no doubt that the outage caused severe pain to the organisation, but reacting by stipulating a single recovery tier across the organisation is going to cost a lot of money. Replacement hardware can rarely be sourced within such a timeframe, so an exact replica of the environment needs to be purchased, and it will only be used in the event of a disaster.
A better way would be to take the hard lessons learned and align applications to defined disaster recovery service offerings. If you have an understanding of your disaster recovery requirements then you can start building a service catalogue which reflects that. Create distinct service tiers, based upon the RTO and RPO (recovery point objective) and align a technology configuration to those tiers.
For example, the “platinum” tier might include application level clustering and synchronous data replication whereas your “bronze” tier would involve data recovery from tape after the procurement of replacement hardware. In this way, applications that have little impact on the organisation’s daily activities can be assigned to a lower tier which gives sufficient time to procure replacement hardware, negating the need to have everything duplicated.
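A tiered catalogue of this kind can be sketched as a small lookup that places each application in the least stringent tier that still meets its objectives. Everything numeric here is an assumption for illustration: the RTO and RPO limits are hypothetical, and the intermediate "silver" tier and its technology are invented to show more than two tiers. Only the "platinum" and "bronze" descriptions follow the example above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tier:
    name: str
    max_rto_hours: float  # longest recovery time this tier is designed to deliver
    max_rpo_hours: float  # largest data-loss window this tier is designed to deliver
    technology: str


# Ordered from most to least stringent. All hour values are hypothetical.
TIERS = [
    Tier("platinum", 1, 0, "application-level clustering, synchronous replication"),
    Tier("silver", 24, 4, "asynchronous replication to standby hardware (assumed tier)"),
    Tier("bronze", 336, 24, "restore from tape after procuring replacement hardware"),
]


def assign_tier(required_rto_hours, required_rpo_hours, tiers=TIERS):
    """Return the least stringent tier that meets both objectives."""
    for tier in reversed(tiers):
        if (tier.max_rto_hours <= required_rto_hours
                and tier.max_rpo_hours <= required_rpo_hours):
            return tier
    raise ValueError("no tier in the catalogue meets these objectives")


# Hypothetical applications with their required RTO and RPO in hours.
applications = {
    "order processing": (2, 0.5),
    "intranet wiki": (24, 4),
    "test and development": (720, 48),
}
for app, (rto, rpo) in applications.items():
    print(app, "->", assign_tier(rto, rpo).name)
```

Searching from the least stringent tier upwards means an application only lands in an expensive tier when its requirements genuinely demand it.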
Once the service catalogue is in place, solutions that meet those requirements can be investigated and purchased. Of course, a full schedule of DR testing is a vital part of any solution.
Although it takes longer, this approach will help to prevent the knee-jerk reactions that often follow disasters.