 How could automated failover ever be justified?
Start with the fact that the least reliable server has only a 2% chance of failing during its reasonable lifetime of 4 years. If I understand the argument for MS’s failover facility, that 2% probability can be reduced [strong]by an unknown percentage[/strong] if I am willing to run a standby backup server. But the backup server is going to cost me as much as the primary server: the same hardware, software, cooling, power, footprint, maintenance, and administration costs.

Apparently this total cost can be justified by the importance of what the primary server is doing, but we have no real way of computing the dollar cost of that 2% chance that the primary will fail (which also means there is a 98% chance the backup will never be used).

[strong]And more importantly[/strong], we recognize that the backup server can’t protect the primary server if the power in the data center fails; or if the wrong cable is removed by accident; or if the primary system is the victim of a denial-of-service attack; or if operations does something really dumb; or if MS decides that the software controlling the failover system needs an update or patch, which would require a reboot of both the primary and the backup at the same time, because the whole idea of the backup is that it is identical to the primary; or if any other software running on the primary or backup needs an update or patch that requires a reboot; or if some tech type in IT decides he hates his job or his boss and stops both systems to prove how much IT will miss him after he quits.

Jim4522

ASKED: April 7, 2010  10:33 AM
UPDATED: April 7, 2010  6:23 PM

Answer Wiki:
Typically, when you set up an Active/Passive cluster, the only additional software license cost for the passive node is the Windows OS. For example, if it is a SQL Server cluster, you only need to license one copy of SQL Server. When patching is done correctly, both nodes of the cluster should never be down at the same time: you patch each node separately and do not start on the next node until the first one is back online.

An additional point to factor in is the loss of productivity while everyone works to restore a single server to working condition, versus having the other server take over and losing no productivity while the first server is brought back into service. That is typically what ultimately decides whether it is worth it.
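The productivity trade-off above can be put into rough numbers. The following is only a minimal sketch, not a sizing method: the 2% failure probability comes from the question, while the restore times, the hourly cost of lost productivity, and the standby cost are made-up placeholder figures, and the function names are hypothetical.

```python
def expected_outage_cost(p_fail, hours_down, cost_per_hour):
    """Expected cost of an outage over the server's service life."""
    return p_fail * hours_down * cost_per_hour


def failover_savings(p_fail, hours_without, hours_with, cost_per_hour):
    """Expected productivity loss avoided by having a standby node."""
    return (expected_outage_cost(p_fail, hours_without, cost_per_hour)
            - expected_outage_cost(p_fail, hours_with, cost_per_hour))


# Assumed inputs: only p_fail is taken from the question.
p_fail = 0.02             # chance the primary fails in its 4-year life
hours_without = 24.0      # assumed time to restore a lone server
hours_with = 0.1          # assumed interruption while the standby takes over
cost_per_hour = 50_000.0  # assumed cost of lost productivity per hour
standby_cost = 15_000.0   # assumed total cost of the passive node

savings = failover_savings(p_fail, hours_without, hours_with, cost_per_hour)
print(f"expected savings: {savings:,.0f}, standby cost: {standby_cost:,.0f}")
print("standby justified" if savings > standby_cost else "standby not justified")
```

With these placeholder figures the standby pays for itself, but the outcome is driven almost entirely by the assumed hourly cost of downtime, so a low-impact server could easily come out the other way.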
Last Wiki Answer Submitted:  April 7, 2010  5:58 pm  by  Denny Cherry   64,550 pts.


Discuss This Question:

Technochic, I understand the value of getting the application back up quickly instead of waiting for the failed server to be fixed, but doesn’t that happen both when I fail over to an inactive standby server and when I fail over to an active server in a cluster?

It seems to me that a standby backup server accomplishes nothing unless the primary server fails, and my primary server has only a 2% chance of failing in its 4-year service life. If I use this type of backup for a large number of servers, then I am paying for a lot of standby servers that may never do anything in 4 years. Jim 4522
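The 2% figure applies to a single server; across a fleet, the chance that at least one primary fails is larger. A rough sketch of that arithmetic, assuming failures are independent and every server has the same 2% lifetime failure probability (both are simplifying assumptions, and the fleet sizes are arbitrary):

```python
def p_at_least_one_failure(n_servers, p_fail=0.02):
    """Probability that at least one of n independent servers fails."""
    return 1.0 - (1.0 - p_fail) ** n_servers


def expected_failures(n_servers, p_fail=0.02):
    """Expected number of servers that fail over the service life."""
    return n_servers * p_fail


# Arbitrary fleet sizes, for illustration only.
for n in (1, 10, 50, 100):
    print(n, round(p_at_least_one_failure(n), 3), expected_failures(n))
```

Under these assumptions, a fleet of 50 servers would expect about one failure in four years, which is consistent with the point that most one-to-one standbys would sit idle.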
