SQL Server with Mr. Denny

Nov 1 2010   11:00AM GMT

What a difference a temp table makes over a table variable



Posted by: Denny Cherry
Tags:
SQL Server
Statistics
Table Variables
Temp Tables

I was working on performance tuning the data deletion process that I use to remove data from our database.  We delete massive amounts of data on a daily basis, so our data deletion process is pretty important to us.  As we’ve grown our customer base the amount of data to be deleted every day has also grown, and the amount of time to delete that data has gone up as well.

We it has started to take stupid amounts of time to delete the data, so I started digging into the data deletion procedures.

The first thing I looked at was the actual delete statements.  These seams ok, the actual deletes were happening very quickly (we process deletes in batches of 1000 records per batch to minimize locking of data).  So next I looked at the part of the code where we select the records to be deleted.  Looking at the execution plan, everything looked ok.

But this little chunk of code took about 50 minutes to run.  Pretty bad when only returning 1000 numbers back from the database.

SELECT TOP (@BatchSize) a.PolicyIncidentId
FROM PolicyIncident a WITH (NOLOCK)
JOIN #ComputersToProcess ComputersToProcess ON a.ComputerId = ComputersToProcess.ComputerId
WHERE CaptureTimestamp < ComputersToProcess.StartDeleteAt

The first thing that I did was put a primary key on the @ComputersToProcess table variable.  That turned the table scan into a Clustered Index Scan, but didn’t do anything for performance.

The next thing I did was switch the table variable to a temp table (without a primary key).  This really didn’t do anything to speed up the process as there is still no statistics on the data.  However this time the execution plan actually shows you that there’s no statistic on the temp table.

Now, I didn’t want to put at non-clustered index on the table keeping the table as a heap, and a clustered index that wasn’t a primary key wasn’t going be any more effective than a primary key, so I put a primary key on the table.  While the query cost percentage went up from 2% to 7% the actual run time went down from 50 minutes to just 1 second.

Now I didn’t make any other code changes to the procedures, just changing from the table variable to the temp table, and adding a primary key and this one little three line query went from an hour to a second.  Its amazing how much such a small change can make things run smoother.

Now obviously this isn’t going to fix every problem.  But in my case I’m putting a little over 190k rows into the table variable (now temp table) and this is just to much for the table variable to take.  Keep in mind that with table variables the SQL Server has statistics, but it assumes only a single row per temp table, no matter how much data is actually in the table variable.

Denny

1  Comment on this Post

 
There was an error processing your information. Please try again later.
Thanks. We'll let you know when a new response is added.
Send me notifications when other members comment.

REGISTER or login:

Forgot Password?
By submitting you agree to receive email from TechTarget and its partners. If you reside outside of the United States, you consent to having your personal data transferred to and processed in the United States. Privacy
  • PaulWhiteNZ
    Hi Denny, Is that really a NOLOCK hint on that query? ;c) Did you capture an 'actual' execution plan from the run that took 1 second? I would have expected to see a hash or merge join rather than nested loops - and thicker connecting lines representing the higher row counts. Your last sentence says "...with table variables the SQL Server has statistics..." - perhaps that is a typo? Statistics are not created for table variables. Although statistical information is not available for table variables, if you use an OPTION (RECOMPILE) query hint, the run-time [I]cardinality[/I] of the table variable is available to the optimizer. Using this hint would have allowed the optimizer to 'see' the 190,000+ rows in the table variable, resulting in a much better plan. One other benefit to using OPTION (RECOMPILE) here is that the optimizer could see the run-time value of @BatchSize. Paul
    0 pointsBadges:
    report

Forgot Password

No problem! Submit your e-mail address below. We'll send you an e-mail containing your password.

Your password has been sent to: