 Failover questions for servers
Is the failover of a server in a server cluster immediate?

Is there reason to believe that a failover in a virtualized cluster is faster than a failover in a non-virtualized cluster?

What role does provisioning play in the speed of a failover?

Is there a limit to the size of a server cluster?

Within a server cluster, is it always assumed that there is at least one redundant standby server, or can all the servers in a cluster be active at the same time?



ASKED: April 1, 2010  2:26 PM
UPDATED: April 2, 2010  8:24 PM

Answer Wiki:
Please provide more details: what is your OS? What kind of cluster did you build? Is it active/passive or active/active? How many nodes? How many shared resources? Typically, failovers of Microsoft Server 2000/2003 clusters happen rather rapidly, but it really depends on what you are clustering; our SQL clusters tend to take longer than file system clusters, for instance. Thanks!
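For anyone unsure about the active/passive versus active/active distinction asked about above, here is a rough Python sketch. It is only a toy model under stated assumptions, not how Microsoft clustering or any real cluster product works: the `Node` class and `fail_over` function are hypothetical names invented for illustration, and moving a workload is treated as instantaneous with no in-flight state to recover.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    name: str
    active: bool                 # False means the node is an idle standby
    healthy: bool = True
    workloads: list = field(default_factory=list)


def fail_over(nodes, failed_name):
    """Move the failed node's workloads to surviving nodes (toy model)."""
    failed = next(n for n in nodes if n.name == failed_name)
    failed.healthy = False
    orphaned, failed.workloads = failed.workloads, []

    survivors = [n for n in nodes if n.healthy]
    if not survivors:
        raise RuntimeError("no healthy node left: service is down")

    # Prefer an idle standby (active/passive); otherwise spread the work
    # round-robin across the remaining active nodes (active/active).
    standbys = [n for n in survivors if not n.active]
    if standbys:
        target = standbys[0]
        target.active = True
        target.workloads.extend(orphaned)
    else:
        for i, work in enumerate(orphaned):
            survivors[i % len(survivors)].workloads.append(work)
    return survivors


# Active/passive pair: the standby takes over everything.
pair = [Node("A", active=True, workloads=["sql"]), Node("B", active=False)]
fail_over(pair, "A")
print(pair[1].workloads)   # ['sql']

# Active/active trio: the work is shared among the survivors.
trio = [
    Node("A", active=True, workloads=["web1", "web2"]),
    Node("B", active=True, workloads=["files"]),
    Node("C", active=True, workloads=["mail"]),
]
fail_over(trio, "A")
print([n.workloads for n in trio])   # [[], ['files', 'web1'], ['mail', 'web2']]
```

Note that after the pair fails over, only one healthy node is left, so the service runs without redundancy until the failed node is repaired.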
Last Wiki Answer Submitted:  April 1, 2010  4:17 pm  by  Technochic   56,975 pts.
All Answer Wiki Contributors:  Technochic   56,975 pts.


Discuss This Question:


 

Technochic, I am writing a paper on the subject of “server availability versus corporate computing system stability”. It contrasts the IT-centric view that a server failure survived is the equivalent of a failure that did not happen with the user-centric view that a failure that did happen is a failure that did disrupt the stability of the corporate system. The paper grew out of an awareness that as IT’s ability to survive failures gets better, IT’s fear of failure diminishes, and that leads to comments such as “it makes little difference to IT if we experience 1,000 server failures a year or 100, we can handle them all” or “To IT, the cost of being able to overcome server failures remains the same whether we experience 1,000 failures in a year or 100”.

As the former head of IT for what was then the fourth largest bank in the US, I find it disturbing that IT’s ability to survive failures has diminished IT’s interest in reducing the occurrence of failures. My point is valid whether or not IT can fail over every server failure instantly, but I suspect that the view that failovers today are a snap may be an overstatement of the real situation. My bottom line is that while ensuring that every failure that does occur can be survived is important, it is just as important for IT management to remember that they also have a responsibility to minimize the failures that do occur. Jim White

 820 pts.

 

Well, certainly limiting failures would be the end goal. A failover cluster gives you a TEMPORARY buffer, because if there is a serious issue with the server that failed, then you are at a single point of failure until that is resolved. The cluster pair gives you breathing room to address the trouble with the failing server without disrupting service to users in the meantime. This is the way we handle these, anyway. We are certainly motivated to resolve the issue that caused the failover in the first place.

 56,975 pts.

 

Technochic, I am assuming that when a failover takes place, the user on the network who was using the failing server at the time it failed is impacted temporarily. Or does the receiving server know how to reprocess whatever action the failing server was doing at the instant it failed? That seems unlikely to me. I can understand how a failover gives the user who was previously on the failed server a new, fully provisioned server to work with, but does that mean that, regardless of where the failed server was in processing the user’s work, the receiving server can complete that task?

To believe that the standby server could pick up the processing at any point in the work cycle is to believe that the standby server was in fact mirroring the processing of the failed server. Is that the case?

I was assuming that a standby server in a cluster of several servers could fill in for any failed server in the cluster, and therefore was not mirroring the processing of any one server in the cluster before one of the servers failed. Jim White

 820 pts.

 

Technochic, if a virtualized server in a cluster of several virtualized servers fails, does the work on the failed server only go to a standby server, or can it be added to the workload of other active servers in the cluster?

Can virtualized servers and non-virtualized servers share the same cluster? Jim White

 820 pts.