DataManagement archives - SQL Server with Mr. Denny

SQL Server with Mr. Denny:

DataManagement

Sep 9 2009   1:55AM GMT

Whitepaper on why database maintenance is important



Posted by: mrdenny
SQL Server, Index Performance, Statistics, Performance Problems, SQL Server 2005, SQL Server 2008, SQL Server 2000, White Paper

Today I’m publishing a white paper which I wrote for our Customer Service department to give to our customers. It explains why database maintenance is so important in keeping your database happy and healthy. Continued »

Jun 9 2008   11:00AM GMT

SQL 2005 Internal Query Error 8623 still unfixed



Posted by: mrdenny
Data integrity, Data Normalization, Internal Query Processor, Error 8623, KB 940371, INSERT statement

While working on a new part of our production system  we begun getting the following error message back from the SQL Server.

Server: Msg 8623, Level 16, State 1, Line 1
Internal Query Processor Error: The query processor could not produce a query plan. Contact your primary support provider for more information.

Now I’ve seen this error before, but it was supposed to have been fixed in Cumulative Update 3 for SQL Server 2005 SP 2 but apparently it hasn’t been.  I’m currently running build 3239 which is Cumulative Update 7.  (I’m running the x64 edition.)

The only difference between my query and what’s documented in the KB article is that I was doing a basic insert statement.

 INSERT INTO Table3
 (Table1Id, Table2Id)
 VALUES
 (@Table1Id, @Table2Id)

There are obviously no complex queries being run here.  However I did have relational integrity enabled through foreign keys.  I ended up having to remove the foreign key constraints from this table for the insert statement to work correctly.

Hopefully someone finds this info useful.

You can follow along with the Microsoft development teams progress on Microsoft Connect.  It’s feedback number 332815.

Denny


May 12 2008   9:00AM GMT

Data Normalization, with storage being so cheap what’s the point?



Posted by: mrdenny
SQL, Data integrity, Data Normalization

Back in the old days one of the big reasons that people enforced such strict rules on data normalization was because it would greatly increase the amount of information you could fit on the hard drive.  Back when everything was measured in kilobytes or megabytes storing redundant information simply wasn’t an option.  You had to be able to cram as much information into that little database.

These days space isn’t the issue.  Storage is very cheep these days, and if you have the money you can get absolutely massive amounts of storage.  1 TB hard drives are easily found, and when you start talking about RAIDing drives together the storage limits start to become just insane.  100 TB of storage just isn’t what it used to be.

The problem now becomes that with so much data for the database platform to search through we need to make sure that the indexes which SQL Server is searching as as small as possible so that we can get those indexes read from the disk and into memory as quickly as possible.

Say we have an Employee table which looks like this.

CREATE TABLE Employee
(EmployeeId INT,
FirstName NVARCHAR(50),
LastName NVARCHAR(50),
Department NVARCHAR(100),
Manager_First_Name NVARCHAR(50),
Manager_Last_Name NVARCHAR(50),
Location NVARCHAR(50),
Active VARCHAR(5))

Now this table is clearly not setup as a normalized table.  We are assuming that this table has best case indexing.  If we have 1000 employees records and we are going to search the Department column the index must be read from disk.  Now assuming that the department names are nice long bureaucratic names which average 30 characters each we have about 30000 bytes (more if you take the fill factor into account) which needs to be read from the disk.  Now if we had a relational version of the table.

CREATE TABLE Employee
(EmployeeId INT,
FirstName NVARCHAR(50),
LastName NVARCHAR(50),
DepartmentID INT,
ManagerEmployeeID INT,
LocationId INT,
Active BIT)

When we search the Employee table we now are only loading an integer value from the disk into memory. This means that we only have to load 8000 bytes of data from the disk into memory. Now knowing that SQL Server reads data from the disk in 8k blocks even with a 50% fill factor the index only takes 2 blocks on the disk and can then be completed in two read operations to the disk. Now our original table with no fill factor will require 30 data pages (20.297 to be specific), about 60 if we use the same fill factor.

With tables of this size this isn’t much. But as your database grows the number of additional data pages which would have to be read for a simple search of the table grows exponentially.

Denny


Apr 17 2008   8:00AM GMT

SQL 2008 one click database encryption gives a false sense of security



Posted by: mrdenny
Compliance, Encryption, DataManagement, SQL Server 2008

While I think that the one click database encryption that Microsoft has built into SQL Server 2008 is a good idea, but I’m not sure how useful it actually is.  It’s touted as giving you data encryption of the entire database without any code change.

What this actually means is that if someone stops the SQL Server and copies the data files to another server and attaches the files to that SQL Server the files won’t be of any use.  This however isn’t the common way that data is stolen from a SQL Server.  A much more common method is someone logs into the database using a perfectly legit account and exports the table or tables which contain the useful information to Excel, Access, csv, etc.  This method of database encryption does nothing to protect against this as if you query the table directly you are displayed the data in clear text.

All this database encryption does (as it appears to me) is that it protects the data at rest (ie. when the SQL Server service is stopped) and does nothing else to protect the data.  If all you are looking for is a compliance check off saying that you data is encrypted for SOX, HIPPA, the California Online Privacy Protection Act of 2003, etc then this will probably do the trick.  If you want actual data encryption where the data is not viewable when you query the table unless you know the key to unlock the data this isn’t it.

If you do want that quick and easy encryption for a compliance check off be sure to check your data storage requirements and IO requirements.  Encrypted data is typically larger than unencrypted data and you may need to increase your allocated storage or IO requirements in order to have the encryption completed.  Also remember that the higher level of encryption you use the more CPU time is required to decrypt the data.

Feel free to share your opinions below.  I’m interested to here what others have to say on the topic.

Denny


Mar 27 2008   6:54PM GMT

New Article: SQL Server tempdb best practices increase performance



Posted by: mrdenny
DataManagement, Article, System Configuration

I’ve recently published a new tip on SearchSQLServer.com called “SQL Server tempdb best practices increase performance“.

Denny


Jan 11 2008   6:40PM GMT

New Article: Determining SQL Server database storage requirements



Posted by: mrdenny
SQL, DataManagement, SQL Server 2005, Config, Article, SQL Server 2008

I’ve published a new tip over on SearchSQLServer.com entitled “Determining SQL Server database storage requirements“.

Denny


Jan 3 2008   8:00AM GMT

Using a foreach loop to process an XML document.



Posted by: mrdenny
SQL, XML, DataManagement, SQL Server 2005, SSIS, SQL Server 2008

Receintly I was working on a project where I needed to use a foreach loop with an SSIS project, but couldn’t for the life of me get it to properly process the XML document which I was giving it.  Well with some major work and digging I was able to get it working correctly, but it took me for ever to get all the little setting correct so I figured that I’d throw the info up here for anyone else who is looking for it.

Some background on what the process is that I’m working on.  Basically I’ve got a table with catagorized data in it.  I need to export all the data from the table info one file per catagory (don’t ask, I didn’t design it, I’ve just got to automate it; and it actually makes sence in the grant schem of things).  Well I figured that the easiest way to do this was to use a foreach loop and give it an XML document with the list of catagories to process.  (This was better than looping through and getting the next value from the database.)

 So needless to say, I get started on my little process.

The query which I’m using within an Execute SQL Task is below.  The Execute SQL Task puts the XML data into an SSIS variable called v_CategoryList with a data type of string.

SELECT WebsiteBlockCategoryId as id
FROM dbo.WebsiteBlockCategory cat with (nolock)
where WebsiteBlockCategoryId  <> 0
for XML AUTO

The XML document looks like this. (SSIS seams to be wrapping it within <ROOT></ROOT> tags for me which is why I’m not doing it my self.)  My XML document is actually must longer than this, but you get the idea.  It’s a very basic XML document.

<ROOT>
 <cat id=”18″ />
 <cat id=”19″ />
 <cat id=”20″ />
 <cat id=”21″ />
 <cat id=”22″ />
 <cat id=”23″ />
 <cat id=”24″ />
 <cat id=”25″ />
 <cat id=”26″ />
 <cat id=”27″ />
 <cat id=”28″ />
</ROOT>

As you can see on the screenshot of the forloop properties I’ve set the source to be my variable and the EnumerationType to Element Collection.  Since I know where the data is within the XML document I use DirectInput for both the outer and Inner XPath strings.  For the Outer XPath string I’m using “//cat”.  Because I’m not putting in the ROOT level name it doesn’t matter what gets put in there as long as there is a parent level.    For the Inner XPath string I’ve got the Element name with an @ sign in front of it “@id”. 

Collection Screen

Over on the Variables page of the UI I’ve got my v_CategoryId variable mapped to Index 0 of my document.  If you need to output more than one element from your XML document set your Inner XPath to “*”.  This “should” allow you to bring back all the elements and refer to them by index number starting with 0.  I havn’t actually tried this, as I’ve always only needed a single element hense the “should”.

Variable Mappings Screen

And to think, that little thing took me a couple of days to get working right.  I can only imagine trying to do this in SQL 2000 and DTS.

Denny


Dec 13 2007   8:00AM GMT

So what’s the difference between null and a blank field



Posted by: mrdenny
SQL, T/SQL, Data Types, NULL, Data integrity

There is always a discussion going on somewhere over having NULL values in a database.  Some people (such as myself) are fine with NULL values.  Some people don’t like having NULL values in there database for some reason.  I’ve heard all sorts of reasons; “The screw up performance”, “They are hard to deal with”, “I like using a default value better”, “Default values are easier to work with”.

The performance one is just bunk.  NULL values are indexed just like any other value.

The rest of the reasons to not use NULL values ignore the fact that a NULL value is a perfectly legit value for some fields.  Not all fields should allow a NULL value to be placed in them.  If you are building a shopping cart and have a line items table, the Quantity and Price fields probably shouldn’t allow nulls as you should know the price and quantity when creating the invoice.  However in the same shopping card application the users phone number may need to allow NULL values.

Something which is key to remember about NULL values is that NULL doesn’t mean nothing.  It means unknown.  There may be a value, there may not, we simply do not know what that value is.  Putting in some default value will then put a value on the field when there shouldn’t be.  A great example which I read recently involved a hypothetical database at a blood bank.  A person comes in needing blood.  They are entered into the system but the recipients blood type isn’t known.  If you put in a default value of a blank value when you go to query the supply table, if there are pints of blood which have not been typed they will also have a blank value (as that is your standard default value).  You’ll get a match when you run the query and possibly be giving the recipient blood which is of the wrong type.  Now if you were using NULL values in the database the query would return no matching values as NULL <> NULL (under the default settings of Microsoft SQL Server).  Granted we should be able to trust that the person give the recipient the blood would check the typing, but would you rather trust a person or the computer.  People are overworked and usually underpaid, and nurses and doctors make simple mistakes just like the rest of us.  Personally I’d rather know that the system wouldn’t find a match rather than have to worry that I’d be getting the wrong type of blood.

If you really want to display that default value when the data is returned that’s what the ISNULL system function is for.  This way you don’t have to write long case statements around each field which allows NULL.

NULL values have there place, just like data does.  NULL values don’t mean that we don’t care about the data, it means that we simply don’t know the value of the data as of yet.  It shows that we care about the system and the people using it.  How annoying is it to look at a form which is full of default values and having to read through all of them instead of simply looking at the ones which are blank.


Nov 15 2007   7:23PM GMT

New tip published: Update SQL Server table statistics for performance kick



Posted by: mrdenny
SQL, T/SQL, Statistics, Article

I’ve published a new tip over at SearchSQLServer.com entitled Update SQL Server table statistics for performance kick.

 Denny


Oct 25 2007   6:03PM GMT

New tip published about table statistics



Posted by: mrdenny
SQL, T/SQL, Statistics, Article

My next tip on SQL Server Statistics has been published on SearchSQLServer.com entitled Update SQL Server table statistics for performance kick.