Here’s a handy query which I’ve written which I’ve used to see how much of each table is loaded into the SQL Server Buffer Cache. It shows you how much data is in memory, how much space is used on disk, and the percentage of data in memory compared to the amount of data on disk.
SELECT sys.tables.name TableName,
sum(a.page_id)*8 AS MemorySpaceKB,
SUM(sys.allocation_units.data_pages)*8 AS StorageSpaceKB,
CASE WHEN SUM(sys.allocation_units.data_pages) <> 0 THEN SUM(a.page_id)/CAST(SUM(sys.allocation_units.data_pages) AS NUMERIC(18,2)) END AS 'Percentage Of Object In Memory'
FROM (SELECT database_id, allocation_unit_id, COUNT(page_id) page_id FROM sys.dm_os_buffer_descriptors GROUP BY database_id, allocation_unit_id) a
JOIN sys.allocation_units ON a.allocation_unit_id = sys.allocation_units.allocation_unit_id
JOIN sys.partitions ON (sys.allocation_units.type IN (1,3)
AND sys.allocation_units.container_id = sys.partitions.hobt_id)
OR (sys.allocation_units.type = 2 AND sys.allocation_units.container_id = sys.partitions.partition_id)
JOIN sys.tables ON sys.partitions.object_id = sys.tables.object_id
AND sys.tables.is_ms_shipped = 0
WHERE a.database_id = DB_ID()
GROUP BY sys.tables.name
When you run this code, some of the smaller objects will show more pages in memory than on disk. It appears that this is normal. My guess is that SQL Server is preallocating data pages so that if data is added to these pages there is space allocated to them already in order to save time when adding data to the table.
I’ve removed the system objects from this query, as you can’t do anything to optimize them.
Microsoft and PASS have announced that you can take a free Microsoft Certification exam. Visit the PASS Community Summit 2008 Microsoft Learning Certification Center Registration Site for more information.
I have just registered for one, so thier appear to still be free registrations available.
This program is only available for people who are going to PASS, and you have to take the exam at the PASS summit (which requires attentance to the summit).
See you at PASS.
Take a look at the execution plan for your query. You’ll notice that the query is doing an Index Scan (or a table scan), not an Index Seek which is why the SELECT COUNT(*) takes so long. The reason for this is that the COUNT(*) function needs to look at every record in the table.
As a workaround you can use the technique that Microsoft uses to quickly display the number of rights when you right click on the table and select properties.
select sum (spart.rows)
from sys.partitions spart
where spart.object_id = object_id('YourTable')
and spart.index_id < 2
You should find that this returns very quickly no matter how many tables you have.
If you are using SQL 2000 still you can use the sysindexes table to get the number.
where id = object_id('YourTable')
This number may be slightly off depending on how often SQL updates the sysindexes table, but it’s usually corrent (or at least close enough).
(I was informed that my prior information on this post was not quite accurate, so I have revised the post accordingly. In a nutshell when doing a SELECT count(*) FROM Table even if the row contains all NULLs the record is still counted.)
Well today is the one year birthday of my blog. It’s been a very fun and exciting year. Here’s hoping that the next year will be just as fun.
The company I work for just finished our office move into our newer slightly larger office. I’ve managed to snag myself a Window seat this time around. I just wanted to share the view with everyone. This is the view out my desk window.
Some times its great to live in Southern California.
Orange County now has its own SQL Server User Group. It is being run as a part of the Orange County .NET Users Group. The first speaker was my friend Lynn Langit last month on August 14th. I’ve been asked to be the next speaker on October 16th. The meeting starts at 6pm and runs until 9pm. The meeting is being held at New Horizons in Anaheim at 1900 S. State College Blvd near the corner of Katella Ave.
I’ll be talking about Clustering SQL Server. It is going to be similar to the session about Clustering SQL Server which I’ve given at the prior SoCal Code Camps. If you would like a preview of the slide deck it is available for download. If I change the slide deck any I’ll update the version available for download.
If you are not able to make it to the session feel free to post your questions below and I’ll answer them as quickly as I can.
With SQL Injection Attacks being all the rage these days, I’ve been asked a couple of times for T/SQL code to clean up the database.
So I threw this code together to clean up the data. This code will clean all the character and uni-code columns in all the user defined tables in the system. You’ll need to be dbo or sysadmin to run this without error. If you have TEXT or NTEXT columns it will through an error for those columns. Cleaning TEXT and NTEXT columns is a little more complex as you can’t use the REPLACE function on a TEXT or NTEXT datatype.
DECLARE @sql NVARCHAR(4000) DECLARE @InsertedValue NVARCHAR(1000) SET @InsertedValue = 'The Script tags which were inserted' DECLARE cur CURSOR FOR select 'update [' + sysusers.name + '].[' + sysobjects.name + '] set [' + syscolumns.name + '] = replace([' + syscolumns.name + '], ''' + @InsertedValue + ''', '''')' from syscolumns join sysobjects on syscolumns.id = sysobjects.id and sysobjects.xtype = 'U' join sysusers on sysobjects.uid = sysusers.uid where syscolumns.xtype in (35, 98, 99, 167, 175, 231, 239, 241, 231) OPEN cur FETCH NEXT FROM cur INTO @sql WHILE @@FETCH_STATUS = 0 BEGIN exec (@sql) FETCH NEXT FROM cur INTO @sql END CLOSE cur DEALLOCATE cur
Hopefully you find this useful. If you need code for TEXT or NTEXT columns just post a comment and I’ll throw something together.
This code will work on SQL 2000 and up (it’ll probably work on SQL 7 as well, but I don’t have a SQL 7 machine to test against).
This morning I released another tip on SearchSQLServer.com called “SQL Server consolidation: Why it’s an optimization technique“. It’s a quick overview on consolidating SQL Server Instances onto a single server along with some of the gotchas.
When dealing with a high load replication environment network latency and network traffic can become as issue. There are a few settings you can play with, within the subscription agents themselves such as the –PacketSize, -UseInprocLoader, and –SubscriptionStreams. While working with these settings may or may not help (personally I’ve had very limited success getting them to improve the data flow rate) there are some other techniques you can use.
The best technique to you is to include your stored procedures as articles within the publication. This will allow SQL Server to instead of replicating the entire record from one server to another, it will simply run the stored procedure on the subscriber with the same input settings as it was run with on the publisher. If you have very complex stored procedures, you’ll want to evaluate the increase load on the subscribers CPU against the reduction of network traffic between the distributor and subscriber.
This technique is critical to use to reduce the network traffic if you have a multi-tier replication topology. This is where your subscribers are also publishers to subscribers of their own. This is because of the way that the replication writes data to the subscriber. Replication uses stored procedures on the subscriber to update the values of the record. Assuming a table which looks like this:
CREATE TABLE Employee
(EmployeeId INT IDENTITY(1,1),
The stored procedure which Microsoft uses for replication will look like this.
CREATE PROCEDURE MS_Employee_Update
SET EmployeeId = CASE WHEN @Columns & 1 = 1 THEN @EmployeId ELSE EmployeeId END,
LastName = CASE WHEN @Columns & 2 = 2 THEN @LastName ELSE LastName END,
FirstName = CASE WHEN @Columns & 4 = 4 THEN @FirstName ELSE FirstName END,
DepartmentId = CASE WHEN @Columns & 8 = 8 THEN @DepartmentId ELSE DepartmentId END
WHERE EmployeeId = @KeyField
While this technique works great for a two tier replication topology, we suddenly have a problem when we try to replicate from the second tier to the third tier. On the second tier all the columns have been updated, no matter which columns have actually been updated. This will cause an undo amount of replication traffic between the second and third tier servers as every column must now be replicated instead of just the values which have changed.
When you replicate the stored procedures as well as the tables when the stored procedure is fired to update the data on the publisher, that same stored procedure will be fired on the subscriber thereby reducing the amount of data which needs to be sent from the second tier to the third tier as again only the stored procedure execution is replicated. On very wide tables this reduction of data can reduce your network traffic load up to 70% (or higher depending on the data being changed and the width of the table).
You can verify that SQL Server is indeed using the stored procedures by using SQL Profiler to profile the distribution process on the subscriber. If you see your stored procedure names being fired then it’s using this technique. If you see procedures named by Microsoft, then it is not.
This technique will only work, if your application has no inline SQL. If you use a combination of stored procedures and inline SQL within the application, the data which is changed by stored procedure execution will be replicated using this technique; while the data which is changed by the inline code will be replicated using the traditional technique. If you have dynamic SQL within your stored procedures the SQL Server will not be able to use the replicated procedure to update the data on the subscriber. The dynamic SQL will prevent the SQL Server from recognizing that the stored procedure is what caused the data change, and the data will be replicated using the normal technique.