Query Tuning archives - SQL Server with Mr. Denny

SQL Server with Mr. Denny:

Query tuning

Nov 17 2008   8:00AM GMT

I’ll be speaking at the Inland Empire .Net User Group (reminder)



Posted by: mrdenny
In Person Events, Resource Governor, Query tuning

The nice folks at the Inland Empire .NET User Group have invited me to come and speak to them.  This is a great speaking opportunity for me as they are about 20 minutes from my house.

I won’t be speaking there until December 9, 2008 (it’s amazing just how far in advance some of this stuff gets scheduled).

I’ll be giving two presentations at the meeting.  The first will be the ever popular Query Tuning, and the second will be a talk on the SQL Server 2008 Resource Governor.  The address and directions to the meeting can be found on the IE .NET User Group web site.  If you are going to attend there meetings they have an RSVP link on the site.

Denny

Oct 13 2008   11:01AM GMT

Why is my SELECT COUNT(*) running so slow?



Posted by: mrdenny
SQL, Query tuning

Take a look at the execution plan for your query. You’ll notice that the query is doing an Index Scan (or a table scan), not an Index Seek which is why the SELECT COUNT(*) takes so long. The reason for this is that the COUNT(*) function needs to look at every record in the table.

As a workaround you can use the technique that Microsoft uses to quickly display the number of rights when you right click on the table and select properties.

select sum (spart.rows)
from sys.partitions spart
where spart.object_id = object_id(’YourTable’)
and spart.index_id < 2

You should find that this returns very quickly no matter how many tables you have.

If you are using SQL 2000 still you can use the sysindexes table to get the number.
select max(ROWS)
from sysindexes
where id = object_id(’YourTable’)

This number may be slightly off depending on how often SQL updates the sysindexes table, but it’s usually corrent (or at least close enough).

Denny

(I was informed that my prior information on this post was not quite accurate, so I have revised the post accordingly.  In a nutshell when doing a SELECT count(*) FROM Table even if the row contains all NULLs the record is still counted.)


Jul 21 2008   11:00AM GMT

A better way to index text data



Posted by: mrdenny
SELECT statement, T/SQL, Uni-code, Query tuning, Tables, Index Performance

Indexing text data (varchar, nvarchar, char, etc) is a good way to make it faster to find the data you are looking for.  However these indexes can end up being very hard on the disks behind the index, as well as the memory of the server.  This is because of the large amount of data being put in the index.

As an example, let’s say that we have a table like this.

CREATE TABLE Employee
(EmployeeID INT,
FirstName VARCHAR(50),
LastName VARCHAR(50),
EmailAddress VARCHAR(255))

Now assume that you want to be able to search by the EmailAddress field.  We will then want to index the EmailAddress field with a non-clustered index.  If we work for a company like AMD, then our email addresses will be pretty short (f.lastname@amd.com).  However if we work for a company like I work for then the email addresses are a bit longer (flastname@awarenesstechnologies.com).  Now when we index this column we will be putting the entire email address into the index, taking up a lot of space within the index; especially compared to a numeric value such as an integer.  This becomes doubly true if you are using a uni-code data type as each character requires two bytes of storage instead of the usual one.

This also becomes a problem if you are working on a system with URLs in the field to be indexes.  Depending on the length of the URL, the values may be longer than is allowed in an index which could then give you sorting problems on the indexes.

There are a couple of variations on this technique which I’ve seen.  The one I’ve used the most is to use the CHECKSUM function as part of a calculated column, and then index the calculated column.  This way you simply get the CHECKSUM of the value you want to find, and search the calculated column.  As we are now have an index made up of integers the index can fit a lot more data on each physical data page reducing the IO cost of the index seek as well as saving space on the disk.

So doing this turns our table into something like this.

CREATE TABLE Employee
(EmployeeID INT,
FirstName VARCHAR(50),
LastName VARCHAR(50),
EmailAddress VARCHAR(255),
EmailAddressCheckSum AS CHECKSUM(EmailAddress))

Now I wouldn’t recommend using this technique for each table you create.   I usually only recommend a technique like this when the value to be indexes won’t fit within the bounds of the index, or the table will be very large and searched often so the memory saved is worth the extra CPU time of having to hash the values before doing the lookup.

Now there are a couple of gotchas with this technique.  If you are check summing domain names, some characters don’t check sum correctly.  Also check summing a Unicode version of a string will give you a different result than the non-unicode version of the same string.

You can see that with these three SELECT statements.

SELECT CHECKSUM(’google.com’), CHECKSUM(’g-oogle.com’)
SELECT CHECKSUM(’google.com’), CHECKSUM(N’google.com’)
SELECT CHECKSUM(N’google.com’), CHECKSUM(N’g-oogle.com’)

As you can see the first one you get two different values as you would expect ( 1560309903 and 1560342303 respectively).  With the second query you get two very different values between the Unicode and character strings (1560309903 and -1136321484 respectively).  Based on the first query you would expect to get two different values for the third query, but you don’t.  With the Unicode strings the - appears to not count as part of the CHECKSUM giving you the same CHECKSUM value for both strings (-1136321484).

Another version of this technique which Kevin Kline talked about recently uses the HASHBYTES function of SQL Server 2005 to get the hash of a column and use that.  In his blog he’s talking about using it for partitioning a table, but that same technique can be used here as well.

CREATE TABLE Employee
(EmployeeID INT,
FirstName VARCHAR(50),
LastName VARCHAR(50),
EmailAddress VARCHAR(255),
EmailAddressCheckSum AS HASHBYTES(’SHA1′, EmailAddress)

This will however give you a longer string, therefor taking up more space within the index. However if working with long Unicode strings this may be a better option for you to use.

Denny


Jul 14 2008   11:00AM GMT

Using batching to do large operations without filling the transaction log



Posted by: mrdenny
SQL, DELETE statement, Query tuning, Tables

Deleting large amounts of data from a table is usually an important task, but if you don’t have a maintenance window to work within then you can end up causing locking and blocking against the other processes which are accessing the database.  Not to mention you will cause a very large amount of data into the transaction log no matter what your transaction logging level is set to.

Say you have a table with a date column and you need to delete a million plus records.  Doing this in a single transaction will put all million transactions into your transaction log, plus cause any other processes which are trying to access the table to be blocked.

 However if we batch the transaction into many smaller transactions our transaction log will not fill up as we can backup the log using our normal log backup methods throughout the process, or if we use SIMPLE recovery on our database then transactions will be removed from the log automatically.

In SQL 2000 and below you have to set the ROWCOUNT session variable to a number above 0, which would cause SQL to delete the first n records that it comes across.  In SQL 2005 we can use the TOP parameter as part of our DELETE command having the same effect, but without having to reset the session variable.

In SQL 2000 or below you can use a syntax like this one.

DECLARE @c BIT, @d DATETIME
SET @d = dateadd(yy, -1, getdate())
SET @c = 0
SET ROWCOUNT = 1000
WHILE @c = 0
BEGIN
DELETE FROM Table
WHERE CreateDate

If you are using SQL 2005 you can use this very similar syntax.

DECLARE @c BIT, @d datetime
SET @d = dateadd(yy, -1, getdate())
SET @c = 0
WHILE @c = 0
BEGIN
DELETE TOP (1000) FROM Table
WHERE CreateDate

Both pieces of code are very similar. Declare a variable which tells the loop when to exit. Then start deleting the data. If no records are deleted, then set the variable to 1 causing us to exit the loop. Now this will usually take a little bit longer to complete than a single delete statement, but the system will continue to be responsive during the process.

The number of records which you are deleting should be adjusted based on the width of your records, the load on the database at the time of deletion, and the speed of your hard drives which hold the data files and transaction logs. I usually start at 1000 records and see how the system responds. For tables which a just a few numbers I’ll put it up as high as 50k or 100k records. For very wide tables I’ll drop it down to 100 or 500 records if the system can’t handle 1000 records.

Denny

Update:
Sorry this post didn’t look very good at first. The blog site seams to have eaten it, and I didn’t notice until just now.


May 5 2008   9:00AM GMT

I’ll be speaking at the Inland Empire .Net User Group



Posted by: mrdenny
In Person Events, Resource Governor, Query tuning

The nice folks at the Inland Empire .NET User Group have invited me to come and speak to them.  This is a great speaking opportunity for me as they are about 20 minutes from my house.

I won’t be speaking there until December 9, 2008 (it’s amazing just how far in advance some of this stuff gets scheduled).

I’ll be giving two presentations at the meeting.  The first will be the ever popular Query Tuning, and the second will be a talk on the SQL Server 2008 Resource Governor.  The address and directions to the meeting can be found on the IE .NET User Group web site.  If you are going to attend there meetings they have an RSVP link on the site.

I’ll post about this meeting again closer to the meeting.

Denny


May 1 2008   9:00AM GMT

Checking the cached execution plans



Posted by: mrdenny
Cache, SQL Server 2005, Execution Plan, Query tuning, sys.dm_exec_plan_attributes, sys.dm_exec_query_plan, sys.dm_exec_requests, sys.dm_exec_cached_plans

As we all know SQL Server, will for good or bad, cache execution plans.  Up until now it’s been very tough to see the cached execution plans.  You pretty much had to grab the query or procedure (with the parameters) and run it with Query Analyzer and have it return the execution plan.  The only other option was to use SQL Profiler to capture the query along with the execution plan which it used.

 Starting in SQL Server 2005 you now have the ability to query the system catalog and see what query plans are stored in the cache, as well as the queries which trigger that plan to be used, as well as how many times that cached plan has been used.

The first DMV to look at is sys.dm_exec_cached_plans.  This DMV gives you the size of the plan, the handle of the plan (which is binary), and the number of times the plan has been used.

The next DMV to look at is sys.dm_exec_requests.  This DMV gives you the currently running processes, wait information, all the session settings for the processes, etc.  It also includes the handle for all the plans which are currently being run.  If you join this sys.dm_exec_requests to sys.dm_exec_cached_plans you can see exactly which processes are using which plans.  Because this shows you the spid (the session_id column) and the plan_handle you can see what users are running which commands.

The next object is a system table function called sys.dm_exec_query_plan.  This function accepts a single input parameter of a plan handle.  There are several columns in the output, the most important of which is the query_plan.  This is the actual xml plan which is stored in the SQL Server.

The last object we’ll be looking at is the sys.dm_exec_plan_attributes table function.  This function also accepts a single input parameter of a plan handle.  This function returns the various session settings which were in place when the plan was created.  This is important information to have when working with query plans, as changing a single ANSI connection setting will force SQL to create a new execution plan.

Don’t forget how to view the XML execution plan in the UI.

Denny


Apr 7 2008   11:00AM GMT

SQL Server 2008 changes the way that CONVERT/CAST works



Posted by: mrdenny
SQL, Query tuning, SQL Server 2008

Normally when running a query against a table and using a CAST or CONVERT function against a datetime field any index is made useless.  In SQL Server 2008 this problem is fixed.  Microsoft has come up with a way for SELECT statements which use CAST or CONVERT against a column of the datetime datatype to continue to use the index.  Now keep in mind that this only works for the datetime datatypes and not other data types.

I believe that this feature showed up in CTP 5 (November).

Denny


Mar 27 2008   8:21PM GMT

Joins vs. Exists vs. IN: Not all filters are created the same.



Posted by: mrdenny
SQL, Query tuning

Not all filter commands are created equal.  Different filtering operations should be used at different times to get the best performance our of your SQL Server.

While the JOIN, EXISTS and IN filters can give you the same results the way that SQL Server gets to the data is very different and can lead to poor system performance.  Also when doing a select vs. a delete these same operators will give different execution plans.

I’ll refer you do this file which will provide you with some sample code which can be run against the AdventureWorksDW sample database.  Run each query with the execution plan being displayed.  You’ll see that the IN and EXISTS both produce the same plan, while the JOIN produces a better plan when it comes to selecting data.  However when it comes to deleting the data the EXISTS and IN produce a better plan than the JOIN command does.  (Don’t worry, these delete scripts won’t actually remove any data from the table.  The data these scripts try to delete doesn’t actually exist.  We are looking for execution plans here, not actual deletes).

Denny


Mar 21 2008   6:11AM GMT

I had a great time speaking at the San Diego SQL User Group



Posted by: mrdenny
Service Broker, In Person Events, Query tuning

I’d like to thank the San Diego SQL Server User Group for having me come and speak to them tonight.  I had a great time presenting both my SQL Server Query Tuning and SQL Server Service Broker presentations.  You can grab the slide deck and sample code from those two links.

 I was happy to fill in on short notice for them when there scheduled speak cancelled on them.  Hopefully the members liked the presentations as much as I liked giving them.  Hopefully the San Diego SQL Server User Group will invite me back in the future.

Denny


Mar 18 2008   10:24AM GMT

Slide Decks and Sample Code for San Diego SQL Users Group Meeting



Posted by: mrdenny
Service Broker, In Person Events, Query tuning

Below you will find links to download file files with the MS PowerPoint slide decks and sample files which I’ll be using at this weeks presentation at the San Diego SQL Users Group Meeting.

There is a lot of sample code in the Query Tuning file.  We’ll look through as much of it as we can depending on the amount of time that we have.

SQL Server Query Tuning

SQL Server Service Broker

I’ve included the Service Broker slide deck even though I’m not sure that we will get to it.  If there isn’t time to get to the slide deck during the meeting I’ll be happy to answer questions on it after the meeting (or posted here of course).

See you Thursday night.

Denny