SQL Server with Mr. Denny


January 11, 2008  6:40 PM

New Article: Determining SQL Server database storage requirements



Posted by: Denny Cherry
Article, Config, DataManagement, SQL, SQL Server 2005, SQL Server 2008

I’ve published a new tip over on SearchSQLServer.com entitled “Determining SQL Server database storage requirements“.

Denny

January 10, 2008  8:00 AM

SQL CLR: The What, When, Why and How.



Posted by: Denny Cherry
CLR, Config, SQL

There are two camps when it comes to SQL CLR: the DBA camp, which says don't use it because it'll kill your SQL Server, and the Developer camp, which says it will save you loads of time and that you should use it for everything.  These two camps also mirror the two different messages which Microsoft is giving out.  I highly recommend that DBAs sit in on a Microsoft developer track session and that developers sit in on a DBA track session some time.  You'll see a wide difference in how CLR is presented in each.

SQL CLR, while powerful, cannot save the world as we know it from T/SQL.  SQL CLR has its uses, and there are times when it is the wrong option.  It most definitely cannot do everything.  First of all, you'll notice that I call it SQL CLR, not just CLR.  That's because only a subset of the .NET CLR is supported by SQL Server; this subset is called the SQL CLR.  The official list of supported libraries can be found here.  If it isn't on that list, it's not officially supported by Microsoft.  What do I mean by supported?  In basic terms, if you are using a .NET CLR library which isn't on that list and you start having performance issues, Microsoft can require that you remove the CLR code from the SQL Server before troubleshooting the issue.  The official support policy from Microsoft can be found in MSKB article 922672.
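
One housekeeping note before writing any CLR code: CLR integration is disabled by default in SQL Server 2005, so it has to be switched on at the instance level before any assembly will run.  A minimal sketch:

-- CLR integration is off by default; enable it for the instance.
EXEC sp_configure 'clr enabled', 1
RECONFIGURE
GO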

Now, with all that said, what exactly can SQL CLR functions and procedures be used for?

SQL CLR code is extremely powerful, and for some kinds of work it can be much faster than T/SQL code.  Some examples are advanced math, string manipulation, string searching, pattern matching, etc.  Places where SQL CLR will be slower than native T/SQL are general data manipulation and data searching.  When writing SQL CLR functions and procedures, remember that if the .NET code has to go back to the database for any reason, that data access will be slower than doing the data access natively within T/SQL.  If nothing else, you have to account for the additional time to connect to the SQL Server, check credentials and move into the correct database, all of which takes time.  When updating records from .NET code you have to process the records one at a time.  While .NET is great at processing records row by row, SQL isn't.  SQL will be very slow when it comes to actually processing those updates, as it is optimized for set-based processing, not row-by-row processing.
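
To make the set-based versus row-by-row point concrete, here's a small T/SQL sketch (the Orders table is hypothetical).  The first statement updates every matching row in one pass; the cursor touches them one at a time, which is the pattern that per-record .NET updates force on the engine.

-- Set-based: one statement, one pass; what SQL Server is optimized for.
UPDATE dbo.Orders SET Status = 'Shipped' WHERE ShipDate IS NOT NULL

-- Row-by-row: the same work done the way per-record updates arrive.
DECLARE @OrderId INT
DECLARE order_cur CURSOR FOR
    SELECT OrderId FROM dbo.Orders WHERE ShipDate IS NOT NULL
OPEN order_cur
FETCH NEXT FROM order_cur INTO @OrderId
WHILE @@FETCH_STATUS = 0
BEGIN
    UPDATE dbo.Orders SET Status = 'Shipped' WHERE OrderId = @OrderId
    FETCH NEXT FROM order_cur INTO @OrderId
END
CLOSE order_cur
DEALLOCATE order_cur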

Now that I know what I can do with it, when should I?

When deciding to write a .NET SQL CLR function or procedure, you first need to ask: is the SQL Server the right place to do this?  Often the front end, or the middle tier (if you have an n-tier application), is a better place to do the work.  For example, formatting phone numbers via a .NET function should be done on the front end, not in the database.  Encrypting a credit card number should be done in the middleware or the SQL Server (if you have a middleware layer, do it there); you probably don't want to install the encryption method and certificate that you use to encrypt data on the end user's computer.  The general rule that I like to live by is to put as much .NET code outside of the SQL Server as possible, either in the middleware or in the front end.  That said, there are also times when the SQL Server is the right place for the code.  Say you are doing a data warehouse load and you need to parse the text of an NVARCHAR(MAX) field looking for specific key words or phrases, and if those key words or phrases exist you need to add rows somewhere else.  T/SQL may not be the best option for this.  A full text search will be inefficient, as you will have to search the entire table every time.  A regular T/SQL LIKE will be very slow, as indexes can't be used, and searching through large pieces of text takes SQL a lot of time.  But a SQL CLR function can do this with very little CPU time using what are called regular expressions.  Regular expressions are a basic .NET feature which allows you to quickly and easily search a block of text.  (Since I'm not a .NET programmer I'm not going to dive into using regular expressions, however here is an MSDN article on the topic.)
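
As a sketch of what the database side of that might look like, assuming you've built a .NET assembly whose static method wraps System.Text.RegularExpressions.Regex.IsMatch (the assembly, class, function and table names below are all hypothetical):

-- Register the hypothetical assembly containing the regex wrapper.
CREATE ASSEMBLY RegexLib FROM 'C:\Assemblies\RegexLib.dll'
WITH PERMISSION_SET = SAFE
GO
-- Expose the static .NET method as a scalar function.
CREATE FUNCTION dbo.fn_RegexMatch (@Input NVARCHAR(MAX), @Pattern NVARCHAR(500))
RETURNS BIT
AS EXTERNAL NAME RegexLib.[RegexLib.Functions].RegexMatch
GO
-- Use it during the warehouse load to find rows containing a key phrase.
SELECT DocumentId
FROM dbo.StagingDocuments
WHERE dbo.fn_RegexMatch(DocumentText, N'overnight shipping') = 1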

Other excellent uses of SQL CLR include procedures or functions which involve advanced mathematics that SQL Server does not include.  These could easily be put into a SQL CLR procedure or function and executed within the .NET environment, with the result then being used within the SQL Server.  While normally I would recommend putting this in the client tier or middle tier, if the function or procedure is needed for in-row processing of a query then the SQL Server may be the right place for the CLR code.

While the SQL CLR isn't the end-all solution that some people were looking for, and want it so desperately to be, when used correctly it can be an extremely powerful tool.  But it must be used carefully, as when used incorrectly it can hamper the performance of your SQL Server.

Denny


January 7, 2008  8:00 AM

XML in the database. What’s the big deal any way?



Posted by: Denny Cherry
Data Types, T/SQL, XML

So, we've had the ability to use XML within SQL Server databases since SQL 2000 came out.  What's the big deal, you ask?  How can I use it to make my life easier, you also ask?

When I first started looking into this whole XML thing I was thinking the same thing.  How is this blob of data going to make my life easier?  Well, when used correctly it most certainly can.  In SQL Server (up through SQL 2005 at least) we don't have a way to pass more than one record into a stored procedure.  If you have an order entry system and the customer wants to order 3 items, you have to run three commands on the SQL Server, doing three inserts one at a time.  If you instead pass those three items from the UI to the database as an XML document, you can process all three items in a single command, thereby improving performance.

DECLARE @XML XML
SET @XML = '<ROOT>
<Customer ID="12">
<Order ID="7498">
<Item ID="415" Qty="1" />
<Item ID="87547" Qty="2" />
<Item ID="4125" Qty="1" />
</Order>
</Customer>
</ROOT>'
DECLARE @hDoc INT
EXEC sp_xml_preparedocument @hDoc OUTPUT, @XML
SELECT *
FROM OPENXML (@hDoc, '//Customer/Order/Item')
WITH (CustomerId INT '../../@ID',
OrderId INT '../@ID',
ItemId INT '@ID',
Qty INT '@Qty')
EXEC sp_xml_removedocument @hDoc
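
Running that batch returns one row per Item element:

CustomerId  OrderId  ItemId  Qty
----------- -------- ------- ----
12          7498     415     1
12          7498     87547   2
12          7498     4125    1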

As you can see from the output, we have a table which we can process in bulk.  Instead of just running a sample piece of code, if we build this into a stored procedure we can do something like the following.

Here is our LineItems table.

CREATE TABLE LineItems
(OrderId INT,
ItemId INT,
Qty INT,
ItemPrice NUMERIC(6,2))

And here is our procedure which adds the XML data to the LineItems table.  In a production system, business logic would need to be added to check stock on hand and to return estimated ship dates to the client.

CREATE PROCEDURE usp_AddLineItems
@Items XML
AS
DECLARE @hDoc INT

EXEC sp_xml_preparedocument @hDoc OUTPUT, @Items

INSERT INTO LineItems
SELECT OrderId, ItemId, Qty, ItemPrice
FROM OPENXML (@hDoc, '//Customer/Order/Item')
WITH (CustomerId INT '../../@ID',
OrderId INT '../@ID',
ItemId INT '@ID',
Qty INT '@Qty',
ItemPrice NUMERIC(6,2) '@ItemPrice')
EXEC sp_xml_removedocument @hDoc
GO

Since we aren't going to build a web-based order system for this little demo, we can run the procedure manually and verify the output.

DECLARE @XML XML
SET @XML = '<ROOT>
<Customer ID="12">
<Order ID="7498">
<Item ID="415" Qty="1" ItemPrice="12.95"/>
<Item ID="87547" Qty="2" ItemPrice="16.25"/>
<Item ID="4125" Qty="1" ItemPrice="8.25"/>
</Order>
</Customer>
</ROOT>'
EXEC usp_AddLineItems @XML
GO
SELECT *
FROM LineItems
GO
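
The SELECT returns the three line items we passed in:

OrderId  ItemId  Qty  ItemPrice
-------- ------- ---- ----------
7498     415     1    12.95
7498     87547   2    16.25
7498     4125    1    8.25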

As we can see, this is much more efficient than running three insert statements, as those would require at least three writes to the disk, plus however many reads are needed to do the business logic.  By using the XML we can do it all with a single write to the disk and a single set of reads for the business logic.  In this example we have reduced our write activity by roughly two-thirds.  If we were processing 10 line items we would have reduced it by 90%.  And since disk performance is usually the bottleneck of any database, reducing the disk I/O any way we can is a good thing.

Denny


January 3, 2008  8:00 AM

Using a foreach loop to process an XML document.



Posted by: Denny Cherry
DataManagement, SQL, SQL Server 2005, SQL Server 2008, SSIS, XML

Recently I was working on a project where I needed to use a foreach loop within an SSIS package, but couldn't for the life of me get it to properly process the XML document which I was giving it.  With some major work and digging I was able to get it working correctly, but it took me forever to get all the little settings correct, so I figured I'd throw the info up here for anyone else who is looking for it.

Some background on the process I'm working on.  Basically, I've got a table with categorized data in it.  I need to export all the data from the table into one file per category (don't ask, I didn't design it, I've just got to automate it; and it actually makes sense in the grand scheme of things).  I figured that the easiest way to do this was to use a foreach loop and give it an XML document with the list of categories to process.  (This was better than looping through and getting the next value from the database.)

So, needless to say, I got started on my little process.

The query which I’m using within an Execute SQL Task is below.  The Execute SQL Task puts the XML data into an SSIS variable called v_CategoryList with a data type of string.

SELECT WebsiteBlockCategoryId AS id
FROM dbo.WebsiteBlockCategory cat WITH (NOLOCK)
WHERE WebsiteBlockCategoryId <> 0
FOR XML AUTO

The XML document looks like this.  (SSIS seems to be wrapping it within <ROOT></ROOT> tags for me, which is why I'm not doing it myself.)  My actual XML document is much longer than this, but you get the idea.  It's a very basic XML document.

<ROOT>
 <cat id="18" />
 <cat id="19" />
 <cat id="20" />
 <cat id="21" />
 <cat id="22" />
 <cat id="23" />
 <cat id="24" />
 <cat id="25" />
 <cat id="26" />
 <cat id="27" />
 <cat id="28" />
</ROOT>

As you can see in the screenshot of the foreach loop properties, I've set the source to be my variable and the EnumerationType to ElementCollection.  Since I know where the data is within the XML document, I use DirectInput for both the Outer and Inner XPath strings.  For the Outer XPath string I'm using "//cat".  Because I'm not putting in the ROOT level name, it doesn't matter what the root is called as long as there is a parent level.  For the Inner XPath string I've got the element name with an @ sign in front of it: "@id".

Collection Screen

Over on the Variable Mappings page of the UI I've got my v_CategoryId variable mapped to Index 0 of my document.  If you need to output more than one element from your XML document, set your Inner XPath to "*".  This "should" allow you to bring back all the elements and refer to them by index number starting with 0.  I haven't actually tried this, as I've only ever needed a single element, hence the "should".

Variable Mappings Screen

And to think, that little thing took me a couple of days to get working right.  I can only imagine trying to do this in SQL 2000 and DTS.

Denny


December 31, 2007  8:00 AM

Have a Happy New Year



Posted by: Denny Cherry
Family

This will be the last blog entry for the year (this being New Year's Eve and all).

I hope that everyone has a happy new year and that no one has to work tonight.

Don't forget to spend tonight with family and friends, not with your computer.

Denny


December 27, 2007  8:00 AM

Avoiding SQL Injection Attacks



Posted by: Denny Cherry
Attack Prevention, Security, SQL, T/SQL

The most common way for people to insert invalid data into or cause damage to your database is through what is called a SQL Injection attack.  This is when malicious code is placed within the responses which are expected, isn't caught, and is instead executed.  The level of access to the SQL Server that your application account has determines how successful the injection attack will be.

There are ways to protect yourself.  The first, and most important, is that the account which the application uses to connect to the database should have the least set of rights necessary to do its job.  Don't make the account dbo or sysadmin just so that everything works correctly without you having to modify rights when you add objects.  The second requires your front end developers (the ASP, .NET, etc. folks) to understand the potential problem so that they code the application in such a manner that the risk is reduced or removed.
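
A minimal sketch of what least privilege looks like on the database side (the login, user and procedure names here are hypothetical): the application account gets EXECUTE on the procedures it calls and nothing else.

-- Create a dedicated application login and database user.
CREATE LOGIN AppLogin WITH PASSWORD = 'UseAStrongPasswordHere1!'
GO
USE AdventureWorks
GO
CREATE USER AppUser FOR LOGIN AppLogin
GO
-- Grant only what the application actually needs.
GRANT EXECUTE ON dbo.usp_Something TO AppUser
GO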

(Please keep in mind that I’m not an application coder so my application code shown below may not be perfect.)

If they use in-line T/SQL commands such as this:

dim v_ado, v_sql, v_conn
v_conn = "Provider=sqloledb;Data Source=sqlserver;Initial Catalog=AdventureWorks;User ID=UserName;Password=password;"
v_sql = "exec usp_Something @var1='" & request.form("Var1") & "', @var2='" & request.form("Var2") & "'"
set v_ado = CreateObject("ADODB.RecordSet")
on error resume next
v_ado.open v_sql, v_conn, 3, 3
...

Then you are open to an injection attack.  This is because I can simply put "'; SELECT * FROM sys.tables --" in one of the form fields which I am submitting back to you, and my SELECT statement will be executed against your SQL Server.  How your rights are set up then determines how successful my command is, and the configuration of your UI determines what data, if any, is returned back to me through the UI.  If I find that I have rights to do stuff and I can get the UI to return data to me, I can have the SQL Server return your customer data, or if I'm feeling like causing you some headaches I can drop all your tables with the correctly crafted code.
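
To make that concrete, with that value in the Var1 form field the command string handed to the SQL Server would look something like this (the trailing -- comments out the remainder of the original command):

exec usp_Something @var1=''; SELECT * FROM sys.tables --', @var2='whatever'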

The best way to avoid these attacks is to use parameterized code on the front end.  This allows you to pass in your variables without leaving the site open to attack, as a raw T/SQL command isn't passed directly to the SQL Server.  This is done using the technique shown here.

Dim v_ConnString As String
v_ConnString = "Server=servername;Database=AdventureWorks;uid=username;pwd=password;"
Dim v_Connection As New SqlConnection(v_ConnString)

' Call the stored procedure with typed parameters instead of string building.
Dim v_Command As New SqlCommand("usp_Something", v_Connection)
v_Command.CommandType = CommandType.StoredProcedure
v_Command.Parameters.AddWithValue("@var1", Request.Form("Var1"))
v_Command.Parameters.AddWithValue("@var2", Request.Form("Var2"))
v_Connection.Open()
v_Command.ExecuteNonQuery()
v_Connection.Close()

If your developers insist on using the first method, all is not lost.  There is still a method which can be used to help protect the database.  Every value which comes from the end user has to be validated.  Any single quote should be replaced with two single quotes; that way anything the attacker tries to push through is turned back into a string value.  Also, all inputs, regardless of data type, should be surrounded by single quotes.  If this isn't done, then when passing in a number value the attacker can easily run their own code without ever getting caught by the single quote check, as in this version of the code:

dim v_ado, v_sql, v_conn
v_conn = "Provider=sqloledb;Data Source=sqlserver;Initial Catalog=AdventureWorks;User ID=UserName;Password=password;"
v_sql = "exec usp_Something @var1=" & request.form("Var1") & ", @var2='" & request.form("Var2") & "'"
set v_ado = CreateObject("ADODB.RecordSet")
on error resume next
v_ado.open v_sql, v_conn, 3, 3
...

If I then set the value of the Var1 form element to "6; select * from sys.tables" I will be able to query all the tables, and my code won't ever be caught by the single quote check.
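
For reference, here is a minimal T/SQL illustration of what the single quote doubling described above accomplishes; the same Replace logic belongs in the front end code before the command string is ever built:

DECLARE @UserInput NVARCHAR(100)
SET @UserInput = N'''; SELECT * FROM sys.tables --'
-- Doubling each single quote turns the payload back into an inert string literal.
SELECT REPLACE(@UserInput, N'''', N'''''') AS EscapedValue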

Unfortunately, once the data has been submitted to the database there isn't much that can be done within SQL to decide whether there is malicious code within the string, mostly because your stored procedure probably won't ever see the malicious code.  The attacker will terminate the stored procedure call and run their command after it (using the technique I showed above), so any checking you did within the procedure won't do anything.

These techniques are not foolproof.  The best protection against an injection attack is the parameterized code method shown above.  No matter what the attacker places in the form fields, it won't be executed, as the engine (ASP, .NET, JSP, PHP, etc.) will simply take the value and add it as a parameter to the procedure, no matter how the attacker tries to escape out to run their own code.  Protecting the database from an injection attack is a shared responsibility between the DBA and the front end developer.  Neither person nor team should try to put the entire responsibility on the other.  DBAs need to make sure that object rights are as minimal as possible, and front end developers need to make sure that malicious commands never get sent to the database in the first place.

Have you seen any other techniques which can be used to protect the database?  Share them in the comments below.

Denny


December 19, 2007  8:00 AM

Microsoft SoCal Code Camp Coming Up



Posted by: Denny Cherry
Development, In Person Events, SQL, SQL Server 2008

The Microsoft Code Camp is coming up shortly.

I’ve submitted three sessions (so far) to the organizers of the code camp.  They’ve told me that odds are I’ll be able to do all three sessions.  I’ve tagged all my sessions with SQL Server to make them easier to find.

For those in Southern California (or anyone who's looking for an excuse to come to lovely Fullerton Community College in January), check out the site http://www.socalcodecamp.com/.

For those looking for an excuse to come from out of the area: while the Midwest and back east are buried in snow and cold in January, Southern California is usually somewhere in the 60s or 70s.

See you there.

Denny


December 17, 2007  8:00 AM

Microsoft Announces SQL 2008 Launch Event



Posted by: Denny Cherry
In Person Events, SQL Server 2008

Microsoft has announced the schedule for the SQL 2008 launch event.

 You can register here.

The official launch is being held in Los Angeles this time around.  I'll be there; hopefully you will be there as well.

 Denny


December 15, 2007  5:38 AM

New Article: SQL Server encryption vs. hashing for data security



Posted by: Denny Cherry
Article, Data Types, Development, SQL, SQL Server 2005, SQL Server 2008, T/SQL

I’ve published a new tip over on SearchSQLServer.com entitled “SQL Server encryption vs. hashing for data security“.

(It was actually published a couple of days ago; I just didn't get the notice until today.)

Denny


December 13, 2007  8:00 AM

So what’s the difference between null and a blank field



Posted by: Denny Cherry
Data integrity, Data Types, NULL, SQL, T/SQL

There is always a discussion going on somewhere about having NULL values in a database.  Some people (such as myself) are fine with NULL values.  Some people don't like having NULL values in their database for some reason.  I've heard all sorts of reasons: "They screw up performance", "They are hard to deal with", "I like using a default value better", "Default values are easier to work with".

The performance one is just bunk.  NULL values are indexed just like any other value.

The rest of the reasons to not use NULL values ignore the fact that a NULL value is a perfectly legitimate value for some fields.  Not all fields should allow a NULL value.  If you are building a shopping cart and have a line items table, the Quantity and Price fields probably shouldn't allow NULLs, as you should know the price and quantity when creating the invoice.  However, in the same shopping cart application the user's phone number may need to allow NULL values.

Something which is key to remember about NULL values is that NULL doesn't mean nothing; it means unknown.  There may be a value, there may not; we simply do not know what that value is.  Putting in some default value puts a value on the field when there shouldn't be one.  A great example which I read recently involved a hypothetical database at a blood bank.  A person comes in needing blood.  They are entered into the system, but the recipient's blood type isn't known.  If you put in a default value of a blank string, then when you go to query the supply table, any pints of blood which have not been typed will also have a blank value (as that is your standard default value).  You'll get a match when you run the query, and possibly give the recipient blood of the wrong type.  If you were instead using NULL values in the database, the query would return no matching values, as NULL never equals NULL (under the default settings of Microsoft SQL Server).  Granted, we should be able to trust that the person giving the recipient the blood would check the typing, but would you rather trust a person or the computer?  People are overworked and usually underpaid, and nurses and doctors make simple mistakes just like the rest of us.  Personally, I'd rather know that the system wouldn't find a match than have to worry that I'd be getting the wrong type of blood.
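
A minimal sketch of that blood bank scenario, using hypothetical table variables:

DECLARE @Supply TABLE (PintId INT, BloodType CHAR(3) NULL)
INSERT INTO @Supply (PintId, BloodType) VALUES (1, 'A+')
INSERT INTO @Supply (PintId, BloodType) VALUES (2, NULL) -- untyped pint

DECLARE @Recipient TABLE (RecipientId INT, BloodType CHAR(3) NULL)
INSERT INTO @Recipient (RecipientId, BloodType) VALUES (42, NULL) -- untyped recipient

-- With the default ANSI_NULLS behavior a NULL never equals another NULL,
-- so the untyped pint is not matched to the untyped recipient.
SELECT s.PintId
FROM @Supply s
JOIN @Recipient r ON s.BloodType = r.BloodType
WHERE r.RecipientId = 42
-- Returns no rows.  Had both columns defaulted to '' instead of NULL,
-- the join would have matched and returned the untyped pint.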

If you really want to display a default value when the data is returned, that's what the ISNULL system function is for.  This way you don't have to write long CASE statements around each field which allows NULL.
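
For example (the Customers table here is hypothetical):

SELECT CustomerName, ISNULL(PhoneNumber, 'Not Provided') AS PhoneNumber
FROM Customers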

NULL values have their place, just like the data does.  A NULL value doesn't mean that we don't care about the data; it means that we simply don't know the value yet.  It shows that we care about the system and the people using it.  How annoying is it to look at a form full of default values and have to read through all of them, instead of simply looking at the ones which are blank?

