When dealing with a high load replication environment network latency and network traffic can become as issue. There are a few settings you can play with, within the subscription agents themselves such as the –PacketSize, -UseInprocLoader, and –SubscriptionStreams. While working with these settings may or may not help (personally I’ve had very limited success getting them to improve the data flow rate) there are some other techniques you can use.
The best technique to you is to include your stored procedures as articles within the publication. This will allow SQL Server to instead of replicating the entire record from one server to another, it will simply run the stored procedure on the subscriber with the same input settings as it was run with on the publisher. If you have very complex stored procedures, you’ll want to evaluate the increase load on the subscribers CPU against the reduction of network traffic between the distributor and subscriber.
This technique is critical to use to reduce the network traffic if you have a multi-tier replication topology. This is where your subscribers are also publishers to subscribers of their own. This is because of the way that the replication writes data to the subscriber. Replication uses stored procedures on the subscriber to update the values of the record. Assuming a table which looks like this:
CREATE TABLE Employee
(EmployeeId INT IDENTITY(1,1),
The stored procedure which Microsoft uses for replication will look like this.
CREATE PROCEDURE MS_Employee_Update
SET EmployeeId = CASE WHEN @Columns & 1 = 1 THEN @EmployeId ELSE EmployeeId END,
LastName = CASE WHEN @Columns & 2 = 2 THEN @LastName ELSE LastName END,
FirstName = CASE WHEN @Columns & 4 = 4 THEN @FirstName ELSE FirstName END,
DepartmentId = CASE WHEN @Columns & 8 = 8 THEN @DepartmentId ELSE DepartmentId END
WHERE EmployeeId = @KeyField
While this technique works great for a two tier replication topology, we suddenly have a problem when we try to replicate from the second tier to the third tier. On the second tier all the columns have been updated, no matter which columns have actually been updated. This will cause an undo amount of replication traffic between the second and third tier servers as every column must now be replicated instead of just the values which have changed.
When you replicate the stored procedures as well as the tables when the stored procedure is fired to update the data on the publisher, that same stored procedure will be fired on the subscriber thereby reducing the amount of data which needs to be sent from the second tier to the third tier as again only the stored procedure execution is replicated. On very wide tables this reduction of data can reduce your network traffic load up to 70% (or higher depending on the data being changed and the width of the table).
You can verify that SQL Server is indeed using the stored procedures by using SQL Profiler to profile the distribution process on the subscriber. If you see your stored procedure names being fired then it’s using this technique. If you see procedures named by Microsoft, then it is not.
This technique will only work, if your application has no inline SQL. If you use a combination of stored procedures and inline SQL within the application, the data which is changed by stored procedure execution will be replicated using this technique; while the data which is changed by the inline code will be replicated using the traditional technique. If you have dynamic SQL within your stored procedures the SQL Server will not be able to use the replicated procedure to update the data on the subscriber. The dynamic SQL will prevent the SQL Server from recognizing that the stored procedure is what caused the data change, and the data will be replicated using the normal technique.
If you have a very high load on your database, but no specifically massive tables you may want to look into a fully load balanced solution for your database. This solution falls somewhere between your normal replication solution, and a federation solution. It’s not truly a federation as no one table is spread across several servers; every server in the federation holds every record of every table.
When laid out on paper this type of replication looks very similar to the Pyramid Federation technique we talked about earlier. This type of setup is ideal for an OLTP environment. We use transactional replication to move the data from one server to another as quickly as possible (you’ll want a pretty fast distributor to handle this). Because all writes are done to a single server (to prevent any potential identity column issues) this solution requires the most code change at the application layer as all commands which write data to the database must go to one connection string which talks to the publisher, which all the reads go to the load balancer which reads data from the federation of subscribers.
Because the publisher and distributor are both single points of failure in this setup, it’s recommended that they both be clustered so that they can survive a hardware failure. There is no need to cluster the subscribers as they are redundant by the fact that there are several of them behind a load balancer.
This concludes my mini-series on database federation. I hope that you have found it useful. As always questions or comments are welcome in the comments section below. (There’s no alerts when a response it posted, it’s something that I’ve already brought up with the ITKE staff, so check back for a response.)
Another type of database federation is what I call the full replication federation. This is where you place all the dimension tables (sticking with our data warehouse example from last time) on all servers of the federation. In addition to having the dimension tables on all the servers in the federation, we also allow all the users to connect to all the servers in the federation. This effectively creates an Active/Active solution as users should be connecting to the SQL Servers through a load balancer. As the dimensions are going to be read only as far as the users are concerned it doesn’t matter which server they connect to.
I call this the full replication federation as we setup replication on all tables which aren’t our large table which has been federated.
As we are connecting to all the servers, we can’t have our view and table named the same thing. In this case we want to have our actual table and view with different names. I prefer to simply use a different schema to hide the table where I want it. This changes our view to look more like this (using a three server federation).
CREATE VIEW dbo.FACT_Sales AS
I like to put the local database name in the view script, so that the same script can be easily deployed to each server. You can at your discretion remove the local server and database name.
You can now query the Data.FACT_Sales table on all three servers by simply querying the view on the local table.
You may end up with some of the same “interesting” optimizer query plans as when using the Pyramid federation technique, and the same solutions which we discussed in the “The Pyramid Federation ” post will still apply.
I’d like to say thanks to the San Diego SQL Server Users Group for inviting me to speak last night.
I had a great time speaking to the group, and just like last time the questions were all excellent.
The slide deck for last night session on Federated Databases is now available. I believe that it will also be made available on the San Diego SQL Server Users Group website.
For those that missed the session, it is one of the sessions which I’m giving at the SoCal Code Camp on October 25 and 26 at USC in Los Angeles. Based on the time the presentation took last night, I’ll be expanding it a bit to better fill the two hours I’ve allocated for it at the Code Camp. If you are going to attend the Code Camp, be sure to mark the interested check box after you register so that the Code Camp staff knows how large of a room to put all the session in.
There are several techniques which can be used to federate your database. The first of which, which we will be talking about is the Pyramid Federation (I have no idea if that what it’s actually called, but that what I’ve named it). In a pyramid federation we have a single server which holds the bulk of the tables. Then a set of servers sits beneath this server holding the large table which has been spread over the federation. Normally no data is replicated between all these servers, however data can be replicated if this will improve query performance. That’s a decision which you’ll have to make depending on your system design and platform load.
This type of system is great, if you have just a few tables which you need to federate because of their size, and length of time to return queries issued against them.
The basic layout of the system is that we have a single front end server which holds all our other tables. Using a data warehouse as an example, we keep all out dimensions on this front end server. While our large fact tables and then spread of the 3 servers which make up our backend system. While a user could connect to any of the four servers in the system, the only server which all the data will be available from is the front end server which holds the dimensions. The servers don’t need to know that the table is spread out across three physical servers, as they will query a few as they normally would a single table on the system.
In our example our head server will be SQL_Main, and our three back end servers will be SQL0, SQL1 and SQL2. The table which we have spread over the federation is called FACT_Sales, and we have designed it to hold many, many years worth of sales data totaling in the several billions of rows. Each of the SQL0-SQL2 servers will hold 1/3 of the data for the table. We use the MOD (%) function to decide which SQL Server the data is stored on. (We’ll cover this later, I promise.)
On our SQL_Main server we have a view called FACT_Sales. This view will be setup something like this.
CREATE VIEW FACT_Sales
As you can see from the view definition, the view the fairly simple, we simply query the three remote servers for all the data, matching whatever parameters we pass to the view when we call it. When we create the FACT_Sales tables on the SQL0-SQL2 servers an additional column should be created. As we are using the SalesId value (which is populated by our sales system, not the data warehouse) to figure out which server the row should be stored on, we place a SalesMod column on the table. We will also place a constraint on this column so that the table on SQL0 can only have a SalesMod value of 0, and the table on SQL1 can only have a SalesMod value of 1, and the table on SQL2 can only have a SalesMod value of 2. Loading the data can be done in two ways.
1. The first option is to simply bulk load all the data into the FACT_Sales view, and let the SQL Servers figure out where everything needs to go. This technique will work just fine for smaller sets of data. Just make sure to include a column as part of the select from the sales system(s) which has the formula of SalesId%2, this will give you the value of the SalesMod column to split the data between servers.
2. The second option is to split the data into three select statements through our ETL process and load each of the backend servers separately. If you have a larger amount of data to process this may be faster as there is one less server processing the data, and therefore one less network hop to work with. In addition when using the first technique all data must be written through the linked servers to the backend database servers, and linked servers are not the most efficient way to move a large amount of data.
When using this technique to federate your database, you must be very careful with your queries. You may find that if your dimensions are large, and you are using the dimensions to filter your data, you can end up with some extremely inefficient queries. If this happens you may wish to replicate some of the dimensions from the SQL_Main server to the three back end servers, and reference these replicated dimensions in your query. This will make your query much more complex, but if some correctly if can help the SQL optimizer make much more effective decisions. An example query could be using the DIM_DateTime to filter your records.
JOIN DIM_DateTime on FACT_SalesData.DateTimeId = DIM_DateTime.DateTimeId
AND DIM_DateTime.Year = 2006
This could, under some circumstances, cause the SQL Optimizer to make some “interesting” decisions. A more effective query plan could result from a query something like this. Adjusting the indexes of the FACT tables will usually resolve this issue, however in some cases it may not. SQL Profiler will be your best friend when attempting to resolve these issues, as it will allow you to see exactly what commands the SQL Server you are connected to is sending to the remote server.
LEFT OUTER JOIN SQL0.MyDataWarehouse.dbo.DIM_DateTime d0 ON FACT_SalesData.DateTimeId = d0.DateTimeId
AND d0.Year = 2006
AND FACT_SalesData.SalesMod = 0
LEFT OUTER JOIN SQL1.MyDataWarehouse.dbo.DIM_DateTime d1 ON FACT_SalesData.DateTimeId = d1.DateTimeId
AND d1.Year = 2006
AND FACT_SalesData.SalesMod = 1
LEFT OUTER JOIN SQL2.MyDataWarehouse.dbo.DIM_DateTime d2 ON FACT_SalesData.DateTimeId = d2.DateTimeId
AND d2.Year = 2006
AND FACT_Salesdata.SalesMod = 2
By joining to all three servers DateTime dimension, and specifying that it should join to the local servers FACT_SalesData values only SQL Server should restrict the queries to the local server, and return the subset of data that we are looking for. It may however be necessary to manually break up the queries against each server within their own UNION ALL blocks.
FROM SQL0.MyDataWarehouse.dbo.FACT_SalesData FACT_SalesData
LEFT OUTER JOIN SQL0.MyDataWarehouse.dbo.DIM_DateTime DIM_DateTime ON FACT_SalesData.DateTimeId = DIM_DateTime.DateTimeId
AND d0.Year = 2006
FROM SQL1.MyDataWarehouse.dbo.FACT_SalesData FACT_SalesData
LEFT OUTER JOIN SQL1.MyDataWarehouse.dbo.DIM_DateTime DIM_DateTime ON FACT_SalesData.DateTimeId = DIM_DateTime.DateTimeId
AND d0.Year = 2006
FROM SQL2.MyDataWarehouse.dbo.FACT_SalesData FACT_SalesData
LEFT OUTER JOIN SQL2.MyDataWarehouse.dbo.DIM_DateTime DIM_DateTime ON FACT_SalesData.DateTimeId = DIM_DateTime.DateTimeId
AND d0.Year = 2006
Be sure to use the UNION ALL clause, and not the UNION clause so that the head SQL Server doesn’t try and do a distinct on these values. They are all going to be distinct between each server, and the MOD will be different on each one.
Depending on each situation you’ll need to make some decisions on which query technique gives you the best performance based on your specific data layout and dimension size. Different queries in your environment may have different query requirements.
When working with a federated database platform it is especially important to have an experienced query writer writing the bulk of the queries against the database platform, to reduce as much as possible the poor execution plans written against the database engine.
Look for my next post on database federation where we look into another technique for federating your database.
When your database has grown beyond the performance capabilities of a single SQL Server, there are still ways get increase the system performance. This requires using a technique called a Federating the database, this is also known as scaling the database out. When you increase a servers capacity by increasing the CPU count within the SQL Server it is called scaling up the system. When you increase a servers capacity by adding additional servers to the system it is called scaling out the system. By scaling out the system we add additional entire servers to the database creating a database federation. There are a couple of ways which you can create the database federation. The technique that you use will depend on your own system requirements.
A database federation is not a high availability solution. The correct solution to use for a high availability solution would be Microsoft Cluster Service (MSCS) or Database Mirroring (SQL 2005 and up).
There are some potential down sides to federating your database which you need to be aware of in order to make an informed decision.
1. If any server in the federation is taken offline, the entire database system will become unavailable. This is because the way the federation works requires online access to all nodes of the federation. As the database which is being federated is probably an important asset to the company, this risk can be mitigated by using clustering in combination with database federation to provide a high availability solution to build your database federation on top of.
2. Licensing for a database federation is extremely expensive. SQL Server Enterprise Edition must be used, as database federation requires the use of distributed queries, which is an Enterprise Edition only feature. Another reason for Enterprise Edition would be the number of CPUs supported. As the system is apparently CPU bound (which is one of the key reasons to use a federated database) you will want to use SQL Servers which have as many CPUs as possible in them. This would lead you to select servers along the line of the HP DL700 series of servers, of the SUN Fire 4600 series of servers. Use of these massive servers will decrease the number of servers in your federation, thereby increasing the ease of setup of the federation.
3. The design of a database federation is not a simple task. It requires an intimate knowledge of the not only the database, but the entire application platform which works with the database backend. In addition you need to have a solid grasp of not only the current system requirements, but of the far reaching expandability requirements of the database as well. This knowledge is key as changing the design of your database federation is an extremely complex task, which if not done correctly can easily lead to ours of down time, and poor performance while data is moved from one node of the federation to another.
While these are some pretty important things to think about, federating your database has some major upsides as well.
1. By federating your database, you will increase the amount of data that can be loaded into cache, as that data is loaded as each server loads its own data into its own cache. This allows you to go well beyond the 64 Gigs of memory that Windows 2003 or Windows 2008 Enterprise Edition support. With enough servers in the federation this will allow you to go beyond the 2 TB limit of Windows 2008 Data Center Edition.
2. In addition to the additional data cache you have access to, you also get access to more CPUs than you would be able to fit in a single server, unless you where to purchase a very high end system such as an EMC Superdome, or one of the IBM iSeries servers.
3. Given that the data is laid across multiple servers this then increase the number of disk controllers, or HBAs that you have access to, which can increase the available throughput to the disk. It also increases the number of PCI busses which you have access to, thereby preventing any sort of contention as the data crosses through the HBAs or RAID Controllers and through the PCI bus on its way to the CPUs and RAM.
Now that we’ve gone over some of the basics of the federated databases, read through my next few posts as I talk about the various techniques which can be used to federate a database, and we go through the design processes to use each one within your database environment.
I’ve seen an issue with Service Broker that others may be seeing. Messages are sent into the service broker, and are processed as normal, and they are removed from the message queue. However you still see the conversation in the sys.conversation_endpoints DMV in a CONVERSING state instead of a CLOSED state. There are no records in the sys.transmission_queue which is the very strange part. There are also no errors when checking with SQL Server Profiler.
Apparently this is a known issue which they are working on. The strange thing is that when it happens on my system, it only happens on a single queue in my database.
Currently the only workaround is to do an END CONVERSATION WITH CLEANUP on the conversations. I’ve written this script which clears out the conversations. I’ve made it so that it only removes the messages which are for the problem conversation which don’t currently exist in the queue (this queue is not auto processed, there is a service which queries the queue every 30 seconds so there can be a backlog of valid messages in the queue which I don’t want to delete).
declare @i int
set @i = 1
while @i <> 10000
declare @conversation_handle uniqueidentifier
declare cur CURSOR for
SELECT TOP (1000) conversation_handle
WHERE NOT EXISTS (SELECT *
FROM [tcp://AWT/Sonar/Q_ObjectDelete] a
WHERE a.conversation_handle = sys.conversation_endpoints.conversation_handle)
AND sys.conversation_endpoints.far_service = 'tcp://AWT/Sonar/Svc_ObjectDelete'
AND sys.conversation_endpoints.state <> 'CD'
fetch next from cur into @conversation_handle
while @@fetch_status = 0
end conversation @conversation_handle with cleanup
fetch next from cur into @conversation_handle
set @i = @i + 1
I run this every hour to clean up the bogus records in the sys.conversation_endpoints DMV.
Without cleaning up the sys.conversation_endpoints DMV the tempdb will slowly start to fill up and throw out of space messages while sp_spaceused shows that the tempdb is empty in the same that id did in the other post I did a while back.
Part two of my two part series on hardware tuning of your SQL Server has just been released on SearchSQLServer.com. This part is titled “Tuning SQL Server performance via memory and CPU processing“.
This post is specifically about setting up DTC on Windows 2008. If you are looking for the post about setting up DTC on Windows 2003 you’ll want to go to the post “How to configure DTC on Windows 2003“.
The basic idea behind DTC setup in Windows 2008 is very similar to Windows 2003. DTC needs to be installed and then configured. To install DTC open the Control Panel, then Programs and Features. Click on the link to the right which says . Eventually the Server Manager will open and finish querying the system for the list of features and roles which are setup. Click on Add Feature button and select Application Server from the Feature list (you may need to click next to get past the welcome to the wizard screen. On the screen which tells you about the Application Server click next. On the next page you can select the services which you be installed as part of this role. Select the Incoming and/or Outgoing Remote Transactions depending on which one you need. If SOAP will be used you may need the WS-Atomic Transactions installed. If so select that as well. (In this example I’ll check them all.)
Then click next. If you have selected the WS-Atomic Transactions you’ll be presented with a page to select an SSL cert. You can either select one, of create a self signed cert, or request one later from a CA. I selected a self signed cert as I didn’t have one already installed. Click next, then install. If you didn’t install WS-Atomic Transactions the next screen will simply have the summary and install button.
It will take a while as Windows is going to install .NET 3.0 as part of this install. Now is a good time for coffee or a smoke.
After installation is complete click close.
If you click the plus sign next to Roles in the right hand menu you can navigate down to Roles > Application Server > Component Services > Distributed Transaction Coordinator > Local DTC.
Right click on Local DTC and select properties, then select the security tab. This tab looks very similar to the Windows 2003 one.
Check which ever boxes you need to in order to get the DTC setup to match the other servers in your environment. Be sure to enable “Allow Remote Clients” or client machines won’t be able to access the DTC on this machine. If you want to be able to remotely enable DTC from another machine you’ll need to check the “Allow Remote Administration”. It’s been my experience that once DTC is setup and working remove administration probably isn’t that important so I’d say leave that unchecked and simply RDP into the server if it needs to be worked on. Your screen will look something like this.
When you click OK DTC will prompt you to restart it. Click yes (unless you want to schedule the restart for another time) and DTC will restart. You can see the DTC events in the log by selecting the “Application Server” option from the menu on the left. You are going to want the most recent event to look something like this.
MSDTC started with the following settings:
Security Configuration (OFF = 0 and ON = 1):
Allow Remote Administrator = 0,
Network Clients = 1,
Trasaction Manager Communication:
Allow Inbound Transactions = 1,
Allow Outbound Transactions = 1,
Transaction Internet Protocol (TIP) = 0,
Enable XA Transactions = 1,
MSDTC Communications Security = No Authentication Required,
Account = NT AUTHORITY\NetworkService,
Firewall Exclusion Detected = 0
Transaction Bridge Installed = 1
Filtering Duplicate Events = 1
(Yes, I’m aware of the spelling issues in the error message. That’s a straight copy and paste from Windows 2008’s event log.)When configuring DTC on a cluster you only need to configure one node. This is because DTC is a cluster aware service so when you install DTC after setup clustering (or you setup clustering after installing DTC) the DTC service will already be setup as a clustered resource within the first cluster resource group created. When you configure DTC for network access on a cluster the settings are written to the Quorum drive as well as the system registry which allows both nodes to share the settings. If you have a cluster and you have to go through a firewall with DTC and have followed KB Article 250367 (I’m sure there is a Windows 2008 version of this KB article somewhere, I just haven’t run across it yet) you will need to have more than 20 ports available to DTC. This is because when you configure the DCOM protocols to use specific ports you are configuring all of RPC to use those specific ports. This means that the cluster administrator needs to use these ports as does the Component Services window which monitors for distributed transactions. When dealing with a cluster it is recommended that you have at least 100 ports open between the machines within the transaction. If you need to setup MS DTC to talk to another DTC coordinator then you will probably need to use the No Authentication Required setting unless they support the other options. Check with the vendor of the other coordinator to find out.
(These are the settings which you need no matter which version or edition of SQL Server you have installed.)
I’ve been asked to come back to the San Diego SQL Server Users Group on September 18, 2008. This time around I’ll be speaking about Federated Databases, and some various techniques which you can use to federate your systems.
I’m still finishing up the slide deck and demos. I’ll try and get them posted in advance. If I can’t I’ll post them shortly after.