 Choosing between failover options
It seems that of the two choices, failover to an active server within a cluster or failover to an inactive standby server, the more economical choice would be the former. With the former, the servers in the cluster are all doing productive work while they also act as backup for the other servers in the cluster. With the latter, you are accepting that the inactive server produces no value other than acting as a backup. Since some IT organizations do choose the latter, I assume it is for one of two reasons: they know the inactive server has the available capacity to accept the workload of the failing server, and I suspect the failover may be quicker, since the controlling software doesn’t have to search for an active server that can take the additional workload. Are those the reasons why an IT organization would choose the less economical option of failover to an inactive standby? Jim4522

ASKED: April 8, 2010  11:40 AM
UPDATED: April 9, 2010  9:23 AM

Answer Wiki:
1-A SQL Server cluster is assigned a virtual server name and an IP address, which applications use to connect to SQL Server. No change is required on the application side, because the failover node acquires the same virtual server name and IP address if the active node goes down. A Standby server, by contrast, has a different server name and IP address, which means either an application change or a DNS change is required when the Standby server is promoted to the primary role.

2-SQL Server clustering provides high availability by protecting against a node failure. It is important to understand that a storage failure will still disrupt the application, because all the nodes in the cluster use shared storage, which also holds the database files of the SQL Server database. A Standby server/database, which is normally installed on a separate, independent SQL Server, protects not only against an operating system or hardware failure but also against a storage failure, because it sits on separate storage.

3-For a planned upgrade of the operating system or of SQL Server, clustering has the advantage, as it is relatively easy to fail over from one node to any other node in the failover cluster configuration. This minimizes system downtime and so provides high server availability. On the other hand, if the Standby server/database is promoted to primary, switching the roles back is a relatively complex task. It involves executing some stored procedures and making sure that the transaction logs of the database are not truncated. Truncation breaks the log shipping sequence, and the Standby process then has to start all over again, which requires a complete backup and restore of the database.

4-A SQL Server cluster has higher hardware and software requirements than Standby. A cluster requires Windows NT Enterprise Edition, Windows 2000 Advanced Server or Windows 2003 Enterprise Edition, and SQL Server Enterprise Edition (for SQL 2000; SQL Server 2005 and later support clustering under Standard Edition), along with Microsoft certified hardware (for clustering under Windows 2003 and below; clustering on Windows 2008 only requires that the cluster pass the validation process). Standby can be configured using Log Shipping (provided in Enterprise Edition for SQL 2000, or in Standard Edition for SQL 2005 and up), or by using custom scripts or third-party vendor software. A Standby server doesn't have any special hardware requirements.

5-Setting up a SQL Server cluster is a complex process that requires expertise to configure and maintain. Before SQL Server is set up, the Windows cluster needs to be configured, which requires shared storage, and setting that up is itself a complex task. Standby has no such requirements. (Not really, clustering isn't that hard to put together.)

6-A SQL Server cluster requires a high-speed LAN. The nodes in a cluster need to send and receive what is called a heartbeat signal, among other communications. Each node uses this signal to determine whether the other node is still available. If a node is not available, the other node takes over. Standby, on the other hand, can work either on a local area network or over a WAN. Of course, depending on the size of the database, it can be a slow process. (Clustering will work fine on a 10/100 LAN.)

7-Cost-wise, a SQL Server cluster is an expensive solution compared to Standby, as it only supports hardware listed on the Microsoft Hardware Compatibility List and requires Enterprise Edition of Windows and SQL Server. (Only when clustering under Windows 2003. Windows 2008 only needs to pass the validation wizard.)

8-In conclusion, both SQL clustering and Standby have their own advantages, and which solution to implement has to be decided based on requirements, resources and budget. Ideally, if the resources and budget are available, both options can be implemented: clustering provides convenience for planned operating system and SQL Server upgrades, and Standby provides protection against a primary server/database crash. So it can't be answered in two words ;) and you still have to decide.
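
To make the heartbeat idea in point 6 a bit more concrete, here is a minimal sketch in Python. It is not how Windows clustering is actually implemented; the class name `HeartbeatMonitor`, the node names and the 5-second timeout are all assumptions made up for illustration. It only shows the general pattern: each node records when it last heard from the others, and a node that stays silent longer than the timeout is treated as unavailable so that another node can take over.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class HeartbeatMonitor:
    """Hypothetical helper that tracks the last heartbeat seen per node."""

    timeout_seconds: float  # assumed threshold, not a Microsoft default
    last_seen: Dict[str, float] = field(default_factory=dict)

    def record_heartbeat(self, node: str, now: float) -> None:
        # Remember the time at which this node last reported in.
        self.last_seen[node] = now

    def is_alive(self, node: str, now: float) -> bool:
        # A node counts as alive only if it has reported within the timeout.
        seen = self.last_seen.get(node)
        return seen is not None and (now - seen) <= self.timeout_seconds

    def pick_owner(self, preferred_order: List[str], now: float) -> Optional[str]:
        # Return the first node in preference order that is still alive,
        # or None if every node has gone silent.
        for node in preferred_order:
            if self.is_alive(node, now):
                return node
        return None


# Illustrative usage with made-up node names and timestamps (in seconds).
monitor = HeartbeatMonitor(timeout_seconds=5.0)
monitor.record_heartbeat("node_a", now=0.0)
monitor.record_heartbeat("node_b", now=0.0)
owner_before = monitor.pick_owner(["node_a", "node_b"], now=1.0)

# node_a stops sending heartbeats; node_b keeps reporting.
monitor.record_heartbeat("node_b", now=6.0)
owner_after = monitor.pick_owner(["node_a", "node_b"], now=7.0)
```

The timestamps are passed in explicitly rather than read from the system clock, which keeps the sketch easy to follow and to test. A real cluster service does far more than this (quorum, resource ownership, shared storage arbitration).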
Last Wiki Answer Submitted:  April 9, 2010  7:23 am  by  ITArts   160 pts.
All Answer Wiki Contributors:  ITArts   160 pts.


Discuss This Question:


 

What service are you trying to cluster?

 64,520 pts.

 

Mrdenny, in response to your question “What service are you trying to cluster?”: I have recently completed a tour of a number of large IT organizations, primarily in the financial industry. In each I found management completely committed to mastering the ability to survive any failure that could occur on their huge server farms, to the exclusion of showing any interest in reducing the number of failures that occur on those servers. For example, none of those organizations seemed interested in acquiring more reliable servers, because they knew that they could survive any failure that could occur on a less reliable server. Their theory seemed to be that a failure survived is the equivalent of a failure prevented. As a former CIO of one of the largest banks in the US, I found that theory disturbing. So I decided I wanted to know more about IT’s ability to survive failures, which brought me to the subject of failovers and whether a failover is really as good a solution to failure as not having the failure occur in the first place.

Each of the CIOs I talked to seemed to discount the importance of server reliability for two separate reasons. One, they thought that any failure that could occur on a server could be overcome by failing over to another server. Two, they thought that hardware failure was only a minor part of the failures that servers could experience, so improving the reliability of the server would not significantly reduce the total failures on servers.

Here is the dilemma I struggle with. If it is true that hardware failures are a small part of all failures that can affect servers, then how does failover solve the problem, since failover only seems to resolve hardware failures? Failover doesn’t seem to address power problems, excessive heat problems, firmware problems, application problems, operator problems, operating system problems, RAID storage problems, memory leak problems, security problems and so on. If an operator inadvertently turned a server off, it would seem that failover would substitute another server to pick up the work of the turned-off server, but who stops the operator from turning off the server?

 820 pts.

 

Mrdenny, I can understand why failover can increase availability by quickly replacing an unavailable server, but does that mean that a server failure survived is the equivalent of a server failure that could have been prevented? Jim4522

 820 pts.
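
One way to frame the "survived versus prevented" question is with the standard steady-state availability formula, availability = MTBF / (MTBF + MTTR). The sketch below is only an illustration: the function names and every number in it (1,000 and 10,000 hours between failures, 4 hours and 0.1 hours to recover) are made-up assumptions, not figures from this thread. It shows that failover works by shrinking recovery time, while a more reliable server works by stretching the time between failures, and that only the second reduces how many failures happen.

```python
HOURS_PER_YEAR = 8760


def availability(mtbf_hours: float, mttr_hours: float) -> float:
    """Steady-state availability: uptime as a fraction of total time."""
    return mtbf_hours / (mtbf_hours + mttr_hours)


def failures_per_year(mtbf_hours: float) -> float:
    """Expected number of failures in a year for a given MTBF."""
    return HOURS_PER_YEAR / mtbf_hours


# All figures below are hypothetical, chosen only to show the trade-off.
baseline = availability(mtbf_hours=1000, mttr_hours=4)          # no failover
with_failover = availability(mtbf_hours=1000, mttr_hours=0.1)   # faster recovery
more_reliable = availability(mtbf_hours=10000, mttr_hours=4)    # fewer failures

failures_with_failover = failures_per_year(1000)    # unchanged by failover
failures_more_reliable = failures_per_year(10000)   # reduced by reliability

print(f"baseline availability:      {baseline:.5f}")
print(f"with failover:              {with_failover:.5f}")
print(f"with more reliable server:  {more_reliable:.5f}")
print(f"failures/year (failover):   {failures_with_failover:.2f}")
print(f"failures/year (reliable):   {failures_more_reliable:.2f}")
```

Under these assumed numbers both approaches raise availability, but the failover case still experiences every failure, with whatever in-flight work, investigation and risk each one brings. The formula also says nothing about failures that failover cannot cover, such as a shared storage fault.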