SQL Server with Mr. Denny:

CTE

May 4 2009   11:00AM GMT

How can I remove duplicate records in my tables?



Posted by: mrdenny
T/SQL, CTE, DELETE statement, SQL Server 2008, SQL Server 2005

All too often we end up with duplicate rows in a table.  The best way to keep duplicate rows out of the database is to not let them in, but assume that they are already there.  This bit of sample code shows how to delete those duplicate rows quickly and easily in a single statement.  No temp tables are required (I only use a temp table here to hold the example data).  This code is for SQL Server 2005 and up, as it uses features (common table expressions and ROW_NUMBER()) which were introduced in SQL Server 2005.  SQL Server 2000 would require a totally different technique.

CREATE TABLE #DuplicateRows /*Create a new table*/
(Col1 INT,
Col2 INT,
Col3 INT)

INSERT INTO #DuplicateRows /*Load up duplicate rows*/
SELECT 1,1,1
UNION ALL
SELECT 1,1,1
UNION ALL
SELECT 1,1,1
UNION ALL
SELECT 2,2,2
UNION ALL
SELECT 2,2,2
UNION ALL
SELECT 2,2,2

SELECT *
FROM #DuplicateRows; /*Check that the data is actually hosed*/

WITH Cleaning AS (SELECT ROW_NUMBER() OVER(ORDER BY Col1, Col2, Col3) AS RowNum, /*Number every row*/
Col1,
Col2,
Col3
FROM #DuplicateRows)

DELETE FROM Cleaning /*Delete the rows which are duplicates*/
WHERE RowNum NOT IN (SELECT RowNum FROM  (SELECT Col1, Col2, Col3, MIN(RowNum) AS RowNum
FROM Cleaning a
GROUP BY Col1, Col2, Col3) b) /*Keep only the lowest numbered row of each group*/

SELECT * /*Check the table to see that it is clean*/
FROM #DuplicateRows

DROP TABLE #DuplicateRows /*Clean up the table*/
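
A variation on the same idea, offered as a sketch rather than a replacement for the code above: if ROW_NUMBER() is partitioned by the columns which define a duplicate, the numbering restarts at 1 for each group, so anything numbered above 1 can be deleted directly.  The names #DuplicateRowsDemo, Numbered and RowNum are made up for this example.

```sql
CREATE TABLE #DuplicateRowsDemo /*Hypothetical table for this example*/
(Col1 INT,
Col2 INT,
Col3 INT)

INSERT INTO #DuplicateRowsDemo /*Load up duplicate rows*/
SELECT 1,1,1
UNION ALL
SELECT 1,1,1
UNION ALL
SELECT 2,2,2
UNION ALL
SELECT 2,2,2;

WITH Numbered AS (SELECT ROW_NUMBER() OVER(PARTITION BY Col1, Col2, Col3
ORDER BY Col1) AS RowNum
FROM #DuplicateRowsDemo)
DELETE FROM Numbered /*RowNum restarts at 1 for each distinct set of values*/
WHERE RowNum > 1

SELECT * /*One row per distinct set of values should remain*/
FROM #DuplicateRowsDemo

DROP TABLE #DuplicateRowsDemo /*Clean up the table*/
```

The PARTITION BY list needs to contain every column which makes two rows "the same" for your purposes.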

Hopefully you find this code useful.

Denny

Mar 17 2008   11:00AM GMT

Back To Basics: Using Common Table Expressions



Posted by: mrdenny
SQL, SQL Server 2005, T/SQL, CTE, Back To Basics, SQL Server 2008, Common Table Expressions

CTEs (Common Table Expressions) are one of the very cool features introduced in SQL Server 2005.  In their simplest, most common form, think of them as a temporary, single-use view whose scope is limited to the one command which directly follows them.  The syntax of a CTE is very basic.

WITH CTE_Name (ColumnName, ColumnName) AS
(SELECT *
FROM Table)
SELECT *
FROM CTE_Name

The list of column names in the CTE definition is optional.  If every column returned by the query inside the CTE already has a unique name, this portion is not needed.  Here is an example from the AdventureWorks database.

WITH EmployeeData AS
(
SELECT e.[EmployeeID]
,c.[Title]
,c.[FirstName]
,c.[MiddleName]
,c.[LastName]
,c.[Suffix]
,e.[Title] AS [JobTitle]
,c.[Phone]
,c.[EmailAddress]
,c.[EmailPromotion]
,a.[AddressLine1]
,a.[AddressLine2]
,a.[City]
,sp.[Name] AS [StateProvinceName]
,a.[PostalCode]
,cr.[Name] AS [CountryRegionName]
,c.[AdditionalContactInfo]
FROM [HumanResources].[Employee] e
INNER JOIN [Person].[Contact] c
ON c.[ContactID] = e.[ContactID]
INNER JOIN [HumanResources].[EmployeeAddress] ea
ON e.[EmployeeID] = ea.[EmployeeID]
INNER JOIN [Person].[Address] a
ON ea.[AddressID] = a.[AddressID]
INNER JOIN [Person].[StateProvince] sp
ON sp.[StateProvinceID] = a.[StateProvinceID]
INNER JOIN [Person].[CountryRegion] cr
ON cr.[CountryRegionCode] = sp.[CountryRegionCode]
)
SELECT *
FROM EmployeeData
WHERE CountryRegionName = 'United States'
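
To show where the optional column list earns its keep, here is a small sketch which does not need AdventureWorks.  The #Orders table and the CustomerTotals CTE are hypothetical names invented for the example.  The first CTE contains a SUM() with no alias, so the column list has to name it; the second gives the column an alias and can leave the list off.

```sql
CREATE TABLE #Orders /*Hypothetical sample data*/
(CustomerID INT,
Amount MONEY)

INSERT INTO #Orders
SELECT 1, 10.00
UNION ALL
SELECT 1, 15.00
UNION ALL
SELECT 2, 7.50;

/*SUM(Amount) has no name of its own, so the column list supplies one*/
WITH CustomerTotals (CustomerID, TotalAmount) AS
(SELECT CustomerID, SUM(Amount)
FROM #Orders
GROUP BY CustomerID)
SELECT CustomerID, TotalAmount
FROM CustomerTotals;

/*Same result without the column list, because the alias names the column*/
WITH CustomerTotals AS
(SELECT CustomerID, SUM(Amount) AS TotalAmount
FROM #Orders
GROUP BY CustomerID)
SELECT CustomerID, TotalAmount
FROM CustomerTotals

DROP TABLE #Orders
```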

When done correctly, a CTE can reference itself, which lets you join each child row to its parent row, level by level, through a hierarchy.  This is called a recursive common table expression, and it is done with a UNION ALL between two queries within the CTE, like so.

WITH DirectReports(ManagerID, EmployeeID, EmployeeLevel) AS (
SELECT ManagerID, EmployeeID, 0 AS EmployeeLevel
FROM HumanResources.Employee
WHERE ManagerID IS NULL

UNION ALL

SELECT e.ManagerID, e.EmployeeID, EmployeeLevel + 1
FROM HumanResources.Employee e
INNER JOIN DirectReports d
ON e.ManagerID = d.EmployeeID)

SELECT ManagerID, EmployeeID, EmployeeLevel
FROM DirectReports ;
GO

The first query in the UNION ALL (the anchor member) returns the top level employees who have no manager.  The second query (the recursive member) joins back to the CTE itself to return the employees who report to the rows already found, including how many levels down the chain each record is.

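Extreme care must be used when using recursive common table expressions, as a mistake in the join or bad data in the table (for example, an employee who ends up as their own manager somewhere up the chain) can send SQL Server into a loop as it tries to recurse through a tree which never ends.  By default SQL Server stops the statement with an error after 100 levels of recursion; that limit can be changed, or removed entirely, with the MAXRECURSION query hint.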
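
As a safety net, the MAXRECURSION query hint can put an explicit cap on how deep a recursive CTE is allowed to go.  The sketch below uses a small hypothetical #Staff table rather than AdventureWorks, and the limit of 10 is an arbitrary number picked for the example.

```sql
CREATE TABLE #Staff /*Hypothetical three level hierarchy*/
(EmployeeID INT,
ManagerID INT)

INSERT INTO #Staff
SELECT 1, NULL
UNION ALL
SELECT 2, 1
UNION ALL
SELECT 3, 2;

WITH Chain (EmployeeID, ManagerID, EmployeeLevel) AS (
SELECT EmployeeID, ManagerID, 0
FROM #Staff
WHERE ManagerID IS NULL

UNION ALL

SELECT s.EmployeeID, s.ManagerID, c.EmployeeLevel + 1
FROM #Staff s
INNER JOIN Chain c
ON s.ManagerID = c.EmployeeID)

SELECT EmployeeID, ManagerID, EmployeeLevel
FROM Chain
OPTION (MAXRECURSION 10) /*Stop with an error if more than 10 levels of recursion are needed*/

DROP TABLE #Staff
```

Setting the hint to 0 removes the limit altogether, which is exactly the situation where bad data can keep the query running forever.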

Denny


Dec 10 2007   8:00AM GMT

Temp Tables, Table Variables, and CTEs



Posted by: mrdenny
SQL, T/SQL, Temp Tables, Table Variables, CTE

There are some major differences between temp tables, table variables and common table expressions (CTEs).  Some of the big differences are:

Temp Tables vs. Table Variables

  1. SQL Server does not place locks on table variables when the table variables are used.
  2. Temp tables allow for multiple indexes to be created.
  3. Table variables can only have indexes which are created as part of the declaration, such as the one behind the Primary Key; no indexes can be added afterwards.
  4. Temp tables can be created locally (#TableName) or globally (##TableName).
  5. Table variables are destroyed as the batch is completed.
  6. Temp tables can be used throughout multiple batches.
  7. Temp tables can be used to hold the output of a stored procedure (table variables will get this functionality in SQL Server 2008).
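
A few of the points above can be seen side by side in a short script.  This is only a sketch; the #Customers and @Customers names and the index name are invented for the example, and GO is the batch separator used by the SQL Server client tools.

```sql
/*Temp table: indexes can be added after the table is created*/
CREATE TABLE #Customers
(CustomerID INT,
LastName VARCHAR(50))

CREATE INDEX IX_Customers_LastName ON #Customers (LastName)

/*Table variable: the primary key has to be part of the declaration*/
DECLARE @Customers TABLE
(CustomerID INT PRIMARY KEY,
LastName VARCHAR(50))

INSERT INTO #Customers SELECT 1, 'Smith'
INSERT INTO @Customers SELECT 1, 'Smith'

SELECT * FROM #Customers
SELECT * FROM @Customers
GO

/*New batch: the temp table is still here, the table variable is gone*/
SELECT * FROM #Customers

DROP TABLE #Customers
```

Referencing @Customers after the GO would fail, because the variable was declared in the previous batch.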

Table variables and Temp Tables vs. CTEs

  1. CTEs can only be used by the single command which immediately follows their definition.
  2. CTEs can be recursive within a single command (be careful because they can cause an infinite loop).
  3. Table variables and Temp Tables can be used throughout the batch.
  4. The command before the CTE must end with a semi-colon (;).
  5. As Temp tables and table variables are tables you can insert, update and delete the data within the table.
  6. CTEs cannot have any indexes created on them; any indexes must be created on the source tables.
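
The scope and semicolon rules for CTEs are easy to trip over, so here is a minimal sketch of both.  The @MinValue variable and the Numbers CTE are hypothetical names made up for the example.

```sql
DECLARE @MinValue INT
SET @MinValue = 2; /*The command before WITH has to end with a semicolon*/

WITH Numbers AS
(SELECT 1 AS Number
UNION ALL
SELECT 2
UNION ALL
SELECT 3)
SELECT Number /*The CTE is visible to this one command only*/
FROM Numbers
WHERE Number >= @MinValue

/*A second command placed here could no longer see Numbers.
Running SELECT * FROM Numbers at this point would fail with an invalid object name error.*/
```

If the same intermediate result is needed by several commands, that is the point where a temp table or table variable is the better fit.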

If you can think of anything that I’ve missed, feel free to post it in the comments.

 Denny