Disaster Recovery – doing it for real

What experience do you have of invoking your Disaster Recovery procedures for real? Like all good IT departments we have a procedure which is trialled each year. The true test, of course, only comes when you have to use it in anger.

Answer Wiki


Instead of just doing an isolated test on your second-site systems, consider actually failing real production processing over to the second site, letting it run there for a while, and then transferring processing back home.

Things to watch out for: transferring processing may require an outage; you may hit capacity issues due to divergent equipment or telecom links; and you will need well-planned "return home" procedures, which may also require an outage to execute.
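The watch-outs above amount to a go/no-go gate before cutover. Here is a minimal sketch of automated pre-failover checks; the site fields, threshold values, and function names are all hypothetical, not from any particular DR product:

```python
"""Sketch of pre-failover checks for a production failover drill.
All thresholds and field names are illustrative assumptions."""

from dataclasses import dataclass


@dataclass
class SiteStatus:
    replication_lag_s: float   # how far the DR copy trails production
    cpu_headroom_pct: float    # spare capacity at the DR site
    link_up: bool              # inter-site telecom link status


def ready_to_fail_over(dr: SiteStatus,
                       max_lag_s: float = 30.0,
                       min_headroom_pct: float = 25.0) -> list:
    """Return a list of blocking issues; an empty list means go."""
    issues = []
    if not dr.link_up:
        issues.append("inter-site link down")
    if dr.replication_lag_s > max_lag_s:
        issues.append(
            f"replication lag {dr.replication_lag_s}s exceeds {max_lag_s}s")
    if dr.cpu_headroom_pct < min_headroom_pct:
        issues.append(
            f"only {dr.cpu_headroom_pct}% CPU headroom at DR site")
    return issues


# A healthy DR site passes every gate:
print(ready_to_fail_over(SiteStatus(5.0, 50.0, True)))   # → [] (no blockers)
# A site that trails too far behind would block the drill:
print(ready_to_fail_over(SiteStatus(120.0, 40.0, True)))
```

The same checks apply in reverse for the return-home cutover, which is why scripting them once pays off twice.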

The advantages are huge, though. Like you say, how else will you know? Isolated testing can only prove viability to an extent. Can you ever get all users to participate in an isolated test so you can prove capacity viability? How else will you prove your return-home procedures work? A better question: have you even considered needing return-home procedures?

As much as it makes sense, my experience is that business units rarely want to accept the inherent risk of an actual production failover when there is no disaster. This level of maturity is usually found only in very large corporations in the financial industry that build hot, remotely clustered, highly available systems. The rest of us tend to fall short of this level of testing maturity and accept the risk of not doing real failover tests.

Discuss This Question: 2  Replies

  • Cotcher
    I have had the experience of participating in a real disaster recovery process due to the 1994 Northridge earthquake in California. Fortunately, we had established, and somewhat rehearsed, minimal but adequate disaster recovery procedures. Following the recovery, which involved moving the entire corporate headquarters due to structural damage to the HQ buildings, we thoroughly reviewed, revised, and updated those procedures and brought in a professional DR/Hazmat consulting group to close the loop. The key element in my mind was the rehearsal process, which is always so difficult to sustain due to daily fire-fighting.
  • BigBob
    Several years ago, when the train crash in Baltimore cut the main fiber line, we had to switch to an alternate server-based system; even though it offered only minimal access, it worked. We now have redundant systems in place, with a third system on separate fiber at a separate physical location that monitors both. This way no trials are necessary. The databases are kept in sync for our clients.
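The three-site setup described in the reply above, where an independent monitor watches two redundant systems and verifies the databases stay in sync, can be sketched in a few lines. The function names, the transaction-id gap check, and the threshold are all hypothetical illustrations, not the poster's actual implementation:

```python
"""Sketch of an independent third-site monitor for two redundant
systems. Names and thresholds are illustrative assumptions."""


def choose_active(primary_ok: bool, secondary_ok: bool, active: str) -> str:
    """Decide which system should serve traffic.

    Stay on the current active system unless it is down and the
    other one is healthy, so a flapping health check does not cause
    needless switchovers.
    """
    if active == "primary" and not primary_ok and secondary_ok:
        return "secondary"
    if active == "secondary" and not secondary_ok and primary_ok:
        return "primary"
    return active


def databases_in_sync(primary_txid: int, secondary_txid: int,
                      max_gap: int = 100) -> bool:
    """Flag the pair as out of sync if one trails the other by more
    than max_gap committed transactions."""
    return abs(primary_txid - secondary_txid) <= max_gap


# Both healthy: stay put.
print(choose_active(True, True, "primary"))    # → primary
# Primary down, secondary healthy: switch.
print(choose_active(False, True, "primary"))   # → secondary
print(databases_in_sync(10_500, 10_470))       # → True
```

The point of the design is that the monitor sits on separate fiber in a separate location, so a single facility or line failure cannot take out both a system and the thing watching it.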
