SQL Server with Mr. Denny


January 7, 2008  8:00 AM

XML in the database. What's the big deal anyway?

Denny Cherry

So, we’ve had the ability to use XML within SQL Server databases since SQL 2000 came out.  What’s the big deal, you ask?  How can it make your life easier?

When I first started looking into this whole XML thing I was thinking the same thing.  How is this blob of data going to make my life easier?  Well, when used correctly it most certainly can.  In SQL Server (up through SQL 2005 at least) we don’t have a way to pass more than one record into a stored procedure.  If you have an order entry system and the customer wants to order three items, you have to run three commands against the SQL Server, doing the three inserts one at a time.  If you instead passed those three items from the UI to the database as an XML document, you could process all three items in a single command, thereby improving performance.

DECLARE @XML XML
SET @XML = '<ROOT><Customer ID="12">
<Order ID="7498">
<Item ID="415" Qty="1" />
<Item ID="87547" Qty="2" />
<Item ID="4125" Qty="1" />
</Order>
</Customer>
</ROOT>'
DECLARE @hDoc INT
exec sp_xml_preparedocument @hDoc OUTPUT, @XML
SELECT *
FROM OPENXML (@hDoc, '//Customer/Order/Item')
WITH (CustomerId INT '../../@ID',
OrderId INT '../@ID',
ItemId INT '@ID',
Qty INT '@Qty')
exec sp_xml_removedocument @hDoc
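
Running that batch returns a rowset built from the attributes of the sample document, which looks like this:

CustomerId  OrderId  ItemId  Qty
----------  -------  ------  ---
12          7498     415     1
12          7498     87547   2
12          7498     4125    1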

As you can see by the output, we have a table which we can process in bulk.  Rather than just running a sample piece of code, we can build this into a stored procedure like the one below.

Here is our LineItems table.

CREATE TABLE LineItems
(OrderId INT,
ItemId INT,
Qty INT,
ItemPrice NUMERIC(6,2))

And here is our procedure which adds the XML data to the LineItems table.  In a production system, business logic would need to be added to verify stock on hand and to return estimated ship dates to the client.
CREATE PROCEDURE usp_AddLineItems
@Items XML
AS
DECLARE @hDoc INT

exec sp_xml_preparedocument @hDoc OUTPUT, @Items


INSERT INTO LineItems
SELECT OrderId, ItemId, Qty, ItemPrice
FROM OPENXML (@hDoc, '//Customer/Order/Item')
WITH (CustomerId INT '../../@ID',
OrderId INT '../@ID',
ItemId INT '@ID',
Qty INT '@Qty',
ItemPrice NUMERIC(6,2) '@ItemPrice')
exec sp_xml_removedocument @hDoc
GO

Since we don’t want to build a web based order system for this little demo, we can just run the procedure directly and verify the output.

DECLARE @XML XML
SET @XML = '<ROOT><Customer ID="12">
<Order ID="7498">
<Item ID="415" Qty="1" ItemPrice="12.95"/>
<Item ID="87547" Qty="2" ItemPrice="16.25"/>
<Item ID="4125" Qty="1" ItemPrice="8.25"/>
</Order>
</Customer>
</ROOT>'
EXEC usp_AddLineItems @XML
GO
SELECT *
FROM LineItems
GO
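
If everything worked, the SELECT should return the three line items from the XML document:

OrderId  ItemId  Qty  ItemPrice
-------  ------  ---  ---------
7498     415     1    12.95
7498     87547   2    16.25
7498     4125    1    8.25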

As we can see, this is much more efficient than having to run three insert statements, as those would require at least three writes to the disk, plus however many reads are needed to do the business logic. By using the XML we can do it all with a single write to the disk and a single set of reads for the business logic. In this example we have gone from three writes to one, roughly a 67% reduction in disk activity. If we were processing 10 line items we would have reduced our disk activity by 90%. And since disk performance is usually the bottleneck of any database, reducing the disk I/O any way we can is a good thing.
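
As a side note, on SQL Server 2005 the same rowset can be produced without the sp_xml_preparedocument/sp_xml_removedocument calls by using the XML data type's nodes() and value() methods.  Here is a minimal sketch against the same sample document and the LineItems table from above:

DECLARE @XML XML
SET @XML = '<ROOT><Customer ID="12">
<Order ID="7498">
<Item ID="415" Qty="1" ItemPrice="12.95"/>
<Item ID="87547" Qty="2" ItemPrice="16.25"/>
<Item ID="4125" Qty="1" ItemPrice="8.25"/>
</Order>
</Customer>
</ROOT>'
INSERT INTO LineItems (OrderId, ItemId, Qty, ItemPrice)
SELECT Item.value('(../@ID)[1]', 'INT'),         -- OrderId from the parent Order element
Item.value('@ID', 'INT'),                        -- ItemId
Item.value('@Qty', 'INT'),                       -- Qty
Item.value('@ItemPrice', 'NUMERIC(6,2)')         -- ItemPrice
FROM @XML.nodes('/ROOT/Customer/Order/Item') AS T(Item)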

Denny

January 3, 2008  8:00 AM

Using a foreach loop to process an XML document.

Denny Cherry

Recently I was working on a project where I needed to use a foreach loop within an SSIS package, but couldn’t for the life of me get it to properly process the XML document which I was giving it.  With some major work and digging I was able to get it working correctly, but it took me forever to get all the little settings correct, so I figured I’d throw the info up here for anyone else who is looking for it.

Some background on the process I’m working on.  Basically I’ve got a table with categorized data in it.  I need to export all the data from the table into one file per category (don’t ask, I didn’t design it, I’ve just got to automate it; and it actually makes sense in the grand scheme of things).  I figured that the easiest way to do this was to use a foreach loop and give it an XML document with the list of categories to process.  (This was better than looping through and getting the next value from the database.)

So needless to say, I got started on my little process.

The query which I’m using within an Execute SQL Task is below.  The Execute SQL Task puts the XML data into an SSIS variable called v_CategoryList with a data type of string.

SELECT WebsiteBlockCategoryId AS id
FROM dbo.WebsiteBlockCategory cat WITH (NOLOCK)
WHERE WebsiteBlockCategoryId <> 0
FOR XML AUTO

The XML document looks like this.  (SSIS seems to be wrapping it within <ROOT></ROOT> tags for me, which is why I’m not doing it myself.)  My actual XML document is much longer than this, but you get the idea.  It’s a very basic XML document.

<ROOT>
 <cat id="18" />
 <cat id="19" />
 <cat id="20" />
 <cat id="21" />
 <cat id="22" />
 <cat id="23" />
 <cat id="24" />
 <cat id="25" />
 <cat id="26" />
 <cat id="27" />
 <cat id="28" />
</ROOT>

As you can see in the screenshot of the foreach loop properties, I’ve set the source to my variable and the EnumerationType to Element Collection.  Since I know where the data is within the XML document, I use DirectInput for both the Outer and Inner XPath strings.  For the Outer XPath string I’m using "//cat".  Because I’m not putting in the root element name, it doesn’t matter what the parent level is called as long as there is one.  For the Inner XPath string I’ve got the attribute name with an @ sign in front of it: "@id".

[Screenshot: Collection screen]

Over on the Variables page of the UI I’ve got my v_CategoryId variable mapped to Index 0 of my document.  If you need to output more than one element from your XML document, set your Inner XPath to "*".  This "should" allow you to bring back all the elements and refer to them by index number starting with 0.  I haven’t actually tried this, as I’ve only ever needed a single element, hence the "should".

[Screenshot: Variable Mappings screen]

And to think, that little thing took me a couple of days to get working right.  I can only imagine trying to do this in SQL 2000 and DTS.

Denny


December 31, 2007  8:00 AM

Have a Happy New Year

Denny Cherry

This will be the last blog entry for the year (this being New Year’s Eve and all). 

I hope everyone has a happy new year and that no one has to work tonight.

Don’t forget to spend tonight with family and friends, not working at your computer.

Denny


December 27, 2007  8:00 AM

Avoiding SQL Injection Attacks

Denny Cherry

The most common way for people to insert invalid data into or cause damage to your database is through what is called a SQL injection attack.  This happens when malicious code is placed within the responses which are expected from users, and instead of being caught, it is executed.  How successful the injection attack is will depend on what level of access to the SQL Server the application account has.

There are ways to protect yourself.  The first and most important is that the account which the application uses to connect to the database should have the least set of rights necessary to do its job.  Don’t make the account dbo or sysadmin just so that everything works correctly without you having to modify the rights when you add objects.  The second requires your front end developers (the ASP, .NET, etc. folks) to understand the potential problem so that they code the application in such a manner that the risk is reduced or removed.

(Please keep in mind that I’m not an application coder so my application code shown below may not be perfect.)

If they use in-line T/SQL commands such as this:

dim v_ado, v_sql, v_conn
v_conn = "Provider=sqloledb;Data Source=sqlserver;Initial Catalog=AdventureWorks;User ID=UserName;Password=password;"
v_sql = "exec usp_Something @var1='" & request.form("Var1") & "', @var2='" & request.form("Var2") & "'"
set v_ado = CreateObject("ADODB.RecordSet")
on error resume next
v_ado.open v_sql, v_conn, 3, 3
...

Then you are open to an injection attack.  This is because I can simply put “‘; SELECT * FROM sys.tables” into one of the form fields which I am submitting back to you, and my SELECT statement will be executed against your SQL Server.  How your rights are set up determines how successful my command is, and the configuration of your UI determines what data, if any, is returned back to me through the UI.  If I find that I have rights to do stuff and I can get the UI to return data to me, I can have the SQL Server return your customer data, or if I’m feeling like causing you some headaches, I can drop all your tables with correctly crafted code.
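
To make the mechanics concrete, here is roughly what the concatenated command looks like by the time it reaches the SQL Server (attackers typically tack a comment marker onto the end of the injected text so the leftover fragment of your command is ignored):

exec usp_Something @var1=''; SELECT * FROM sys.tables --', @var2=''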

The best way to avoid being open to these attacks is to use parameterized code on the front end.  This allows you to pass in your variables without leaving the site open to attack, as a T/SQL command isn’t passed directly to the SQL Server.  This is done using the technique shown here.

Imports System.Data
Imports System.Data.SqlClient

Dim v_ConnString As String
v_ConnString = "Server=servername;Database=AdventureWorks;uid=username;pwd=password;"
Dim v_Connection As New SqlConnection(v_ConnString)

Dim v_Command As New SqlCommand("usp_Something", v_Connection)
v_Command.CommandType = CommandType.StoredProcedure
' the values are attached as parameters, never concatenated into the command text
v_Command.Parameters.AddWithValue("@var1", Request.Form("Var1"))
v_Command.Parameters.AddWithValue("@var2", Request.Form("Var2"))
v_Connection.Open()
v_Command.ExecuteNonQuery()
v_Connection.Close()

If your developers insist on using the first method, all is not lost.  There is still a method which can be used to help protect the database.  Each value which is returned from the end user has to be validated.  Any single quote should be replaced with two single quotes.  This way anything which the malicious code executor tries to push through is turned back into a string value.  Also, all inputs, regardless of data type, should be surrounded by single quotes.  If this isn’t done, then when passing in a number value the attacker can easily run their own code without being caught by the single quote check.

dim v_ado, v_sql, v_conn
v_conn = "Provider=sqloledb;Data Source=sqlserver;Initial Catalog=AdventureWorks;User ID=UserName;Password=password;"
' note that @var1 is not wrapped in single quotes
v_sql = "exec usp_Something @var1=" & request.form("Var1") & ", @var2='" & request.form("Var2") & "'"
set v_ado = CreateObject("ADODB.RecordSet")

on error resume next
v_ado.open v_sql, v_conn, 3, 3
...

If I then set the value of the Var1 form element to “6; select * from sys.tables”, I will be able to query all the tables and my code won’t ever be caught by the single quote check.
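
Again, roughly what arrives at the server in that case, assuming the attacker comments out the trailing fragment the same way:

exec usp_Something @var1=6; select * from sys.tables --, @var2='whatever'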

Unfortunately, once the data has been submitted to the database there isn’t much which can be done within SQL to decide if there is malicious code within the string, mostly because your stored procedure probably won’t ever see the malicious code.  The attacker will terminate the stored procedure call and run their command after it (using the technique I showed above), so any checking that you did within the procedure won’t do anything.

These techniques are not foolproof.  The best protection against an injection attack is the parameterized code method shown above.  No matter what the attacker places in the form fields, it won’t be executed, as the engine (ASP, .NET, JSP, PHP, etc.) will simply take the value and add it as a parameter to the procedure, no matter how the attacker tries to escape out to run their own code.  Protecting the database from an injection attack is a shared responsibility between the DBA and the front end developer.  Neither person nor team should try to put the entire responsibility on the other.  DBAs need to make sure that object rights are as minimal as possible, and front end developers need to make sure that malicious commands never get sent to the database in the first place.

Have you seen any other techniques which can be used to protect the database?  Share them in the comments below.

Denny


December 19, 2007  8:00 AM

Microsoft SoCal Code Camp Coming Up

Denny Cherry

The Microsoft SoCal Code Camp is coming up shortly.

I’ve submitted three sessions (so far) to the organizers of the code camp.  They’ve told me that odds are I’ll be able to do all three sessions.  I’ve tagged all my sessions with SQL Server to make them easier to find.

For those in Southern California (or anyone who’s looking for an excuse to come to lovely Fullerton Community College in January), check out the site http://www.socalcodecamp.com/.

For those looking for an excuse to come from out of the area: while the Midwest and the east coast are buried in snow and cold in January, Southern California is usually somewhere in the 60s or 70s.

See you there.

Denny


December 17, 2007  8:00 AM

Microsoft Announces SQL 2008 Launch Event

Denny Cherry

Microsoft has announced the schedule for the SQL 2008 launch event.

You can register here.

The official launch is being held in Los Angeles this time around.  I’ll be there; hopefully you will be there as well.

Denny


December 15, 2007  5:38 AM

New Article: SQL Server encryption vs. hashing for data security

Denny Cherry

I’ve published a new tip over on SearchSQLServer.com entitled “SQL Server encryption vs. hashing for data security”.

(It was published a couple of days ago, I just didn’t get notice until today.)

Denny


December 13, 2007  8:00 AM

So what’s the difference between null and a blank field

Denny Cherry

There is always a discussion going on somewhere over having NULL values in a database.  Some people (such as myself) are fine with NULL values.  Some people don’t like having NULL values in their database for some reason.  I’ve heard all sorts of reasons: “They screw up performance”, “They are hard to deal with”, “I like using a default value better”, “Default values are easier to work with”.

The performance one is just bunk.  NULL values are indexed just like any other value.

The rest of the reasons not to use NULL values ignore the fact that a NULL value is a perfectly legit value for some fields.  Not all fields should allow a NULL value.  If you are building a shopping cart and have a line items table, the Quantity and Price fields probably shouldn’t allow NULLs, as you should know the price and quantity when creating the invoice.  However, in the same shopping cart application the user’s phone number may need to allow NULL values.

Something which is key to remember about NULL values is that NULL doesn’t mean nothing.  It means unknown.  There may be a value, there may not; we simply do not know what that value is.  Putting in some default value puts a value on the field when there shouldn’t be one.

A great example which I read recently involved a hypothetical database at a blood bank.  A person comes in needing blood.  They are entered into the system, but the recipient’s blood type isn’t known.  If you use a default value of a blank string, then when you go to query the supply table, any pints of blood which have not been typed will also have a blank value (as that is your standard default value).  You’ll get a match when you run the query and possibly give the recipient blood of the wrong type.  If you were using NULL values in the database instead, the query would return no matching values, as NULL <> NULL (under the default settings of Microsoft SQL Server).  Granted, we should be able to trust that the person giving the recipient the blood would check the typing, but would you rather trust a person or the computer?  People are overworked and usually underpaid, and nurses and doctors make simple mistakes just like the rest of us.  Personally I’d rather know that the system wouldn’t find a match than have to worry that I’d be getting the wrong type of blood.
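
Here is a minimal sketch of the difference, using made-up table and column names:

-- Hypothetical blood bank tables
CREATE TABLE Recipients (RecipientId INT, BloodType VARCHAR(3))
CREATE TABLE Supply (PintId INT, BloodType VARCHAR(3))

-- With blank defaults an untyped recipient "matches" every untyped pint
INSERT INTO Recipients VALUES (1, '')  -- blood type never entered
INSERT INTO Supply VALUES (100, '')    -- pint never typed

SELECT s.PintId
FROM Supply s
JOIN Recipients r ON s.BloodType = r.BloodType
WHERE r.RecipientId = 1  -- returns pint 100, which was never typed

-- With NULLs instead of blank defaults, NULL = NULL evaluates to unknown,
-- so the same join returns no rows and no untyped blood can be matched.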

If you really want to display a default value when the data is returned, that’s what the ISNULL system function is for.  This way you don’t have to write long CASE expressions around each field which allows NULL.
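
For example, against a hypothetical Customers table:

SELECT CustomerName, ISNULL(PhoneNumber, 'not provided') AS PhoneNumber
FROM Customers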

NULL values have their place, just like data does.  A NULL value doesn’t mean that we don’t care about the data; it means that we simply don’t know the value of the data yet.  It shows that we care about the system and the people using it.  How annoying is it to look at a form which is full of default values and have to read through all of them, instead of simply looking at the ones which are blank?


December 10, 2007  8:00 AM

Temp Tables, Table Variables, and CTEs

Denny Cherry

There are some major differences between temp tables, table variables and common table expressions (CTEs).  Some of the big differences are:

Temp Tables vs. Table Variables

  1. SQL Server does not place locks on table variables when the table variables are used.
  2. Temp tables allow for multiple indexes to be created
  3. Table variables only allow a single index, the primary key, to be created, and only when the table variable is declared.  The exception is an index which can be created inline, for example a unique constraint, as shown in the comments (and in the sketch after this list).  However, the optimizer will always assume the table variable holds one row, no matter how much data is actually in it.
  4. Temp tables can be created locally (#TableName) or globally (##TableName)
  5. Table variables are destroyed as the batch is completed.
  6. Temp tables can be used throughout multiple batches.
  7. Temp tables can be used to hold the output of a stored procedure via INSERT ... EXEC (table variables will get this functionality in SQL Server 2008).
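
Here is a quick sketch of the indexing difference from items 2 and 3:

-- Temp tables can have indexes added after creation
CREATE TABLE #Orders (OrderId INT PRIMARY KEY, CustomerId INT)
CREATE INDEX IX_Orders_CustomerId ON #Orders (CustomerId)

-- Table variables only get the indexes you can declare inline
DECLARE @Orders TABLE (OrderId INT PRIMARY KEY, CustomerId INT UNIQUE)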

Table variables and Temp Tables vs. CTEs

  1. CTEs can only be used by the single statement which immediately follows their definition.
  2. CTEs can be recursive within a single command (be careful, because they can cause an infinite loop).
  3. Table variables and temp tables can be used throughout the batch.
  4. The command before the CTE must end with a semi-colon (;), as shown in the sketch after this list.
  5. As temp tables and table variables are tables, you can insert, update and delete the data within them.
  6. CTEs cannot have any indexes created on them; the indexes must exist on the source tables.
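
And a quick sketch of the CTE rules above (note the semi-colon terminating the statement before the WITH, and the WHERE clause keeping the recursion from looping forever):

DECLARE @MaxNumber INT
SET @MaxNumber = 10;  -- the statement before the CTE must end with a semi-colon
WITH Numbers (n) AS
(SELECT 1
UNION ALL
SELECT n + 1 FROM Numbers WHERE n < @MaxNumber)
SELECT n FROM Numbers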

If you can think of anything that I’ve missed, feel free to post them in the comments.

Denny


December 5, 2007  8:00 AM

A well timed code freeze is critical to keeping your sanity.

Denny Cherry Denny Cherry Profile: Denny Cherry

Doing a code freeze can be crucial to keeping your sanity at this time of year.  Often the business unit likes to push out last minute changes before the year end.  This often means pushing last minute code (which, as everyone knows, isn’t always perfect), which can lead to unpleasant little phone calls from the business or the NOC in the middle of holiday festivities.

Do yourself and your family a favor: push for a code freeze until after the new year.  The developers will thank you, the business will hate you (but really, what else is new).  I like about a three week code freeze (emergency bug fixes are exempt, but they have to actually be bug fixes, and they have to be an emergency) starting about December 15 and ending about January 3 (depending on when the weekends fall).

It can be tough to get a code freeze implemented at first.  But after the first year with the code freeze, everyone will want to do it again the next year.

Denny

