So I’ve managed to trick the kind folks of the SQL PASS program committee once again this year. I’ve gotten two sessions accepted for the summit.
The first is a “regular session”, the normal 75 minute community session, in which I’ll be presenting “Where should I be encrypting my data”. I’ll be talking about all the various ways to encrypt data within the SQL Server database. These techniques will for the most part work on any version of SQL Server from SQL 6.5 all the way through SQL Server “Denali” (which we’ll hopefully know the name of before the summit).
The second session is one of the new half day sessions, which I’ll be presenting with the lovely, brilliant and highly talented Stacia Misner. It’s called “So how does the BI workload impact the database engine?”. During this session we’ll be looking at a variety of things including how ETL extracts and loads actually impact the databases they touch, and why running queries from SSAS and a data warehouse is faster than running them from the OLTP application. This session won’t be so much a BI session on how to do BI tasks, but on how those BI tasks function under the hood of the core SQL Server engine.
See you at the summit.
One of the features which has been announced to be coming in SQL Server “Denali” is the “Contained Database” feature. The part of Contained Databases which I’m looking forward to the most is the ability to create a user within the database without having to first create a login for the user. This will make database consolidation and migration projects so much simpler in the future, as you won’t have to first create the login on the destination instance with the same SID, or risk having to resync the logins and users using the sp_change_users_login system stored procedure.
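As a rough sketch of what that looks like, here is the syntax as it ended up shipping in SQL Server 2012 (pre-release builds may differ). The database name, user name and password are made up for illustration.

```sql
-- Assumes SQL Server 2012 ("Denali") or later.
-- ConsolidatedApp and AppUser are hypothetical names.

-- Allow contained database authentication on the instance.
EXEC sp_configure 'contained database authentication', 1;
RECONFIGURE;
GO

-- Mark the database as partially contained.
ALTER DATABASE ConsolidatedApp SET CONTAINMENT = PARTIAL;
GO

USE ConsolidatedApp;
GO

-- Create a user that authenticates at the database; no server login is needed.
CREATE USER AppUser WITH PASSWORD = 'Use-A-Str0ng-Passw0rd-Here!';
GO
```

Because the password lives inside the database, the user moves with the database when it is restored on another instance.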
However, there is a little catch with having a lot of contained databases using contained authentication on a single server, especially if the auto close flag is enabled like it often is on hosting companies’ servers. If you are using contained databases, and a user attempts to log into a contained database with the wrong password, the database must be opened, the password checked, and then the database closed again. If this was to begin happening to a large number of contained databases, the SQL Server could end up crashing itself as it’s trying to open and close all these databases. The reason that I see this happening on hosting company servers more than anywhere else is that hosting companies put lots, and I mean lots, of databases on a single SQL Server instance. If that server is exposed to the Internet (which they often are so their customers can log into the server via Management Studio) then this becomes an even bigger problem.
Basically what I’m trying to say here is that if you have a lot of databases on the server, and you use the auto-close flag to keep databases that aren’t being used from taking any memory, you’ll need to change this practice before you start deploying contained databases on SQL Server “Denali” when it releases.
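If you want to see where you stand, something like this should find the databases that still have auto close turned on. Treat it as a sketch: the containment_desc column assumes SQL Server 2012 (“Denali”) or later, and HostedCustomerDb is a made up database name.

```sql
-- List databases that still have auto close turned on.
-- The containment_desc column assumes SQL Server 2012 ("Denali") or later.
SELECT name, containment_desc
FROM sys.databases
WHERE is_auto_close_on = 1;

-- Turn auto close off for one of them (HostedCustomerDb is a hypothetical name).
ALTER DATABASE HostedCustomerDb SET AUTO_CLOSE OFF;
```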
I know that Microsoft’s Tech Ed 2011 ended a couple of weeks ago, and that my recap is way late, but better late than never. It’s been a crazy few weeks with EMC World 2011, Tech Ed 2011 the next week, then a two day train trip to Seattle which was supposed to have WiFi but didn’t, then SQL Cruise, and as I’m writing this I’m in New York for a week working.
Anyway, back to Tech Ed 2011, which was a blast. This was my first year (hopefully of many) speaking at Microsoft’s Tech Ed conference. If you haven’t heard of Tech Ed before, it is Microsoft’s premier IT Pro conference. While there are some developer tools covered, the conference is mostly about the sysadmin side of the house, which is ok seeing as how the dev side of the house has MIX and PDC to name just a couple. Needless to say I was honored that Microsoft asked me to present a session on SQL Server “Denali”, specifically on the new Manageability features of “Denali”. The only problem was that there weren’t a whole lot of earth shattering manageability changes being rolled out in “Denali” (or if there are we don’t know about them yet). I think I made the session work ok though, going over what is coming out as well as some other new features that, while not related to manageability, are very cool.
This was my first time being a booth babe (I looked damn hot in that SQL Server mini-skirt, you should have been there to see it), which was a requirement of getting a session. I’ve got to say, it was pretty cool talking to all the attendees and working through their problems. I know that we got a few people’s problems solved right there in the middle of the show floor. There were some interesting questions, some questions that I was surprised were asked, and there were some damn hard questions. Fortunately there were a ton of really smart people always working the booth that were able to field the questions. In true SQL Server community fashion no one had a problem telling the person asking the question, “I’ve got no idea, let’s get the guy that does. He happens to be right over here just a few feet away.”
Outside of the exhibit hall and convention center I had some great conversations about SQL Server, community, work, etc. with some really great people. I had lots of fun hanging out and talking with Ed Hickey from the SQL Server product team, and I had a blast talking to Adrian Bethune also from the SQL Server product team (he’s the new Data Tier Applications PM so we’ll be having some fun with him later). It’s always a blast to see everyone at these large events. Atlanta was a really great place for SQL Server to have Tech Ed because there are just so many people in the Atlanta area that are involved in the SQL Server community and happen to be friends of mine like Geoff Hiten, Audrey Hammonds, Julie Smith, and Aaron Nelson. I know that I’m missing some locals from this list; everyone was a blast to see as always.
As for my session, I thought it went pretty well, especially when someone came down and found me at the SQL Server booth later that day to tell me that my session was the best one that he had seen all week so far. So to the random person, hopefully you read this, and Thank You, because people doing that is why speakers are up there doing these presentations. As long as just one person gets something useful from the session I consider my job done. I’ve looked at my scores a little, and I landed right in the middle of the pack with a 4 something out of 5. All in all not too shabby. Hopefully Microsoft will see it the same way and invite me back next year to Tech Ed 2012 in Orlando, FL.
If you came to my session at Tech Ed, thanks. If not, I believe that the sessions are all posted on the Tech Ed website for all to see.
See you at the next conference,
All too often developers need to force some locks on a table so that they can be sure that the records aren’t going to change between the time that they first look at the records and when the transaction is completed. The most common method that I’ve seen to do this involves running a select statement against the table at the top of the transaction, with the UPDLOCK or XLOCK hint, which forces the database engine to take higher locks than it normally would against the table. While this does have the desired end result of locking the table, it causes a lot of unneeded IO to get generated, and takes a lot more time than is needed.
For example, let’s assume that we want to lock the Sales.Individual table in the AdventureWorks database so that we can do some processing on it without allowing anyone else to access the table. If we were to issue a SELECT COUNT(*) FROM Sales.Individual WITH (TABLOCK, XLOCK) inside a transaction we lock the table as requested; however, it generates 3106 physical reads against the database (8 physical reads plus 3098 read-ahead reads) as we can see below in the output from the Messages tab.
SET STATISTICS IO ON
DBCC DROPCLEANBUFFERS -- start from a cold buffer cache (test systems only)
BEGIN TRAN
SELECT COUNT(*) FROM Sales.Individual WITH (TABLOCK, XLOCK)
DBCC execution completed. If DBCC printed error messages, contact your system administrator.
(1 row(s) affected)
Table 'Individual'. Scan count 1, logical reads 3090, physical reads 8, read-ahead reads 3098, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
If we look at the sys.dm_tran_locks DMV we’ll now see that we have taken an exclusive lock against the table (don’t forget that you have to query the DMV within the transaction in order to see the lock). That is a lot of IO to generate in order to generate a single lock within the database engine. You can imagine what would happen if this was a much larger table, say a fact table within a data warehouse. A large multi-year fact table could end up generating millions of IO just to lock the table.
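A quick way to check for the lock is a query along these lines. This is just a sketch which assumes you run it in the same session and database, while the transaction is still open.

```sql
-- Run inside the same open transaction that took the lock.
SELECT resource_type,
       OBJECT_NAME(resource_associated_entity_id) AS locked_object,
       request_mode,      -- X means an exclusive lock
       request_status
FROM sys.dm_tran_locks
WHERE request_session_id = @@SPID       -- only this session's locks
  AND resource_database_id = DB_ID()    -- only the current database
  AND resource_type = 'OBJECT';         -- table level locks
```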
A better solution to this sort of problem would be the sp_getapplock system stored procedure. This procedure allows you to take table level locks without running queries against the table. It can lock tables which are gigabytes in size in just a second. When we run the command telling it to lock the Sales.Individual table, we get no IO being generated and yet we still see the object being locked. In this case we would run the below command to generate the needed lock.
exec sp_getapplock @Resource='Sales.Individual', @LockMode='Exclusive'
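The only difference we should see in the output of the sys.dm_tran_locks DMV is that the value in the resource_type column has changed from OBJECT to APPLICATION. Keep in mind that an application lock is a lock on a name rather than on the table itself, so it only blocks other sessions which also call sp_getapplock with the same resource name. As long as every process which changes the table takes the application lock first, we can do all the processing that we want to against the table without having to worry about another user coming in and changing the base data of the table.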
The sp_getapplock procedure must be run within an explicit transaction, and has several parameters so that you can control what it is doing.
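Put together, the basic pattern looks something like this. It is a minimal sketch with a placeholder where the real work would go.

```sql
BEGIN TRANSACTION;

-- Take the application lock; by default it is owned by the transaction.
EXEC sp_getapplock @Resource = 'Sales.Individual', @LockMode = 'Exclusive';

/* Business logic against Sales.Individual goes here */

-- Committing the transaction releases the lock.
COMMIT TRANSACTION;
```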
The first parameter is @Resource which we used above. This parameter is how you tell the stored procedure what object you wish to lock. It accepts the input as schema.object or just the object if the object is within your default schema. It is recommended that you use the two part name to ensure that you are always locking the correct object.
The next parameter is @LockMode which we also used above. This parameter allows you to tell the database engine what locking level you want to use. Your options are “Shared”, “Update”, “IntentShared”, “IntentExclusive”, and “Exclusive”. Any other value specified will throw an error.
The third parameter is @LockOwner. This parameter allows you to tell the stored procedure to take the lock for the duration of the transaction (the default) or the duration of the session. To explicitly specify that you want to take the lock for the duration of the transaction specify the value of “Transaction”. To specify that you want to take the lock for the duration of the session specify the value of “Session”. When the value of “Session” is used the procedure does not need to be called within a transaction. If a value of “Transaction” or no value is specified then the procedure does need to be called within an explicitly defined transaction.
The fourth parameter is @LockTimeout. This parameter allows you to tell the procedure how many milliseconds to wait for the lock before giving up, at which point the procedure returns a return code of -1. If you want the procedure to return immediately then specify a value of 0. The default value for this parameter is the same as the value returned by querying the @@LOCK_TIMEOUT system function.
The fifth and final parameter is @DbPrincipal. This parameter allows you to tell the procedure the name of the user, role or application role which has rights to the object. Honestly I haven’t really figured out what this parameter is used for. What I do know is that if you specify a user, role or application role which doesn’t have rights to the object the procedure call will fail. This parameter defaults to the public role. If you get an error when using the default value, create a role, grant the role rights to the object, and then specify the role within the parameter. No users need to be assigned to the role to make this work.
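Here is a sketch that uses the lock owner and timeout parameters together and checks the return code. The 5000 millisecond timeout is an arbitrary value picked for illustration. The procedure returns 0 or 1 when the lock is granted and a negative number when it isn’t.

```sql
BEGIN TRANSACTION;

DECLARE @result int;

EXEC @result = sp_getapplock
     @Resource    = 'Sales.Individual',
     @LockMode    = 'Exclusive',
     @LockOwner   = 'Transaction',   -- the default, shown for clarity
     @LockTimeout = 5000;            -- wait at most 5 seconds (arbitrary value)

-- 0 = granted immediately, 1 = granted after waiting; negative values mean failure.
IF @result >= 0
BEGIN
    /* Business logic goes here */
    COMMIT TRANSACTION;   -- committing releases the lock
END
ELSE
BEGIN
    ROLLBACK TRANSACTION;
    RAISERROR('Could not get the application lock (return code %d).', 16, 1, @result);
END
```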
Releasing the lock that you’ve just taken can be done in a couple of different ways. The first is the easiest: commit the transaction using COMMIT (ROLLBACK will also release the lock, but you’ll lose everything that you’ve done). This only applies to locks owned by the transaction; a lock taken with a lock owner of “Session” is held until it is explicitly released or the session ends. You can also use the sp_releaseapplock system stored procedure. The sp_releaseapplock procedure accepts three parameters which are @Resource, @LockOwner and @DbPrincipal. Simply set these values to the same values which you used when taking the lock and the lock will be released. The procedure sp_releaseapplock can only be used to release locks which were taken by using the sp_getapplock procedure. It can not be used to release traditional locks that the database engine has taken naturally, and it can only be used to release locks which were created by the current session.
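For a session owned lock the take and release pair would look roughly like this. It is a sketch; note that no explicit transaction is needed in this case.

```sql
DECLARE @result int;

-- Session owned lock, so no explicit transaction is required.
EXEC @result = sp_getapplock
     @Resource  = 'Sales.Individual',
     @LockMode  = 'Exclusive',
     @LockOwner = 'Session';

IF @result >= 0
BEGIN
    /* Business logic goes here */

    -- Release with the same @Resource and @LockOwner used to take the lock.
    EXEC sp_releaseapplock
         @Resource  = 'Sales.Individual',
         @LockOwner = 'Session';
END
```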
Hopefully some of this knowledge can help speed up your data processing times.
Besides the insanity? It’s fun. If it wasn’t, I wouldn’t do it. God knows it isn’t for the money.
Yeah the presentations can be a pain to come up with, and coming up with topics to present on is probably my least favorite part of doing all this. But sharing the information that I know, and being able to learn more through it thanks to my NDA is just awesome.
What really makes all this work worthwhile is when I get emails from people saying that the information that they learned from an article or session helped them with their job, or to fix a problem that they were having.
And that right there is why I do it.
When you are doing a database restore and want to move the physical database files from one disk to another, or from one folder to another, you need to know the logical file names. But if you can’t restore the database how do you get these logical file names? By using the RESTORE FILELISTONLY syntax of the restore command.
The syntax is very simple for this statement.
RESTORE FILELISTONLY FROM DISK='D:\Path\To\Your\Backup\File.bak'
The record set which will be returned will give you the logical names, as well as the physical names of the database files which you can then use within the RESTORE DATABASE command.
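As a sketch of the follow up step, the restore with relocated files would look something like this. The database name, logical file names and target paths are all made up; use the LogicalName values which RESTORE FILELISTONLY returned for your backup.

```sql
-- All names and paths below are hypothetical; take the logical names
-- from the LogicalName column returned by RESTORE FILELISTONLY.
RESTORE DATABASE YourDatabase
FROM DISK = 'D:\Path\To\Your\Backup\File.bak'
WITH MOVE 'YourDatabase_Data' TO 'E:\Data\YourDatabase.mdf',
     MOVE 'YourDatabase_Log'  TO 'F:\Logs\YourDatabase_log.ldf';
```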
If you have 12 disks to hold DB Data files and decide to use RAID10, would you create 1 RAID 10 group or 2 RAID 10 groups made of 6 disks each for best read/write performance?
I would probably make 2 RAID 10 RAID groups, one for the data files and one for the transaction log. Without knowing what percentage of data access will be read and what will be write I’m just guessing here. Depending on the load RAID 5 may work just fine for the data files.
Running on EMC CLARiiON CX4 – Windows reports disks are not aligned, SAN admin says that because of caching, the partition alignment from Windows does not matter (and SAN is set up per "best practices"). Is this true?
Caching has nothing to do with disk alignment. It sounds like your sysadmin should have gone to my SQL PASS pre-con. All the caching does is accept the writes from the host into DRAM instead of writing to the disk directly.
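If the database already exists somewhere, a query like this gives a rough read versus write split for the data files. It is only a sketch: the numbers are cumulative since the instance last started, so look at them after a representative period of load.

```sql
-- Read vs. write split per database, data files only.
-- Counters are cumulative since the SQL Server instance last started.
SELECT DB_NAME(vfs.database_id) AS database_name,
       SUM(vfs.num_of_reads)    AS reads,
       SUM(vfs.num_of_writes)   AS writes,
       CAST(100.0 * SUM(vfs.num_of_reads)
            / NULLIF(SUM(vfs.num_of_reads + vfs.num_of_writes), 0)
            AS decimal(5,2))    AS read_pct
FROM sys.dm_io_virtual_file_stats(NULL, NULL) AS vfs
JOIN sys.master_files AS mf
  ON mf.database_id = vfs.database_id
 AND mf.file_id = vfs.file_id
WHERE mf.type_desc = 'ROWS'     -- data files only, no transaction logs
GROUP BY vfs.database_id
ORDER BY database_name;
```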
Now if the LUN is aligned on the array by setting the offset on the array side (which isn’t recommended as it makes LUN migrations within the array more difficult) then you want to leave them misaligned in Windows. If however they are setup with a 0 offset within the array (which is the default) then they need to be aligned within Windows.
Yes, provided that your transaction logs have their own LUN from the other files as the write cache is enabled and disabled at the LUN level. By default read and write cache will be enabled for every LUN which is created on the array.
There aren’t too many cases where you would want to disable the write cache on a LUN, except for maybe a data warehouse LUN where no data is updated and only new rows are written. The reason for this is that these will be sequential writes, and the array will bypass the write cache when it detects that sequential writes are being done. Sequential writes can be done directly to disk about as quickly as they can be done to cache, because once the head gets into the correct place the writes are put down very quickly, as the head and the spindle don’t need to move very far between each write operation.
A question that comes up when building a new virtual SQL Server is how should the disks be laid out when using the default VMDKs (VMware) or VHDs (Hyper-V)? Should the disks be on a single LUN, or different LUNs, etc.
I’m sure that it will surprise no one when I say that it depends. On a virtual database server where the disk IO load is high you will want to separate the virtual disks out just like you would in the physical world. If the virtual database server has low or minimal IO then like in the physical world it may be ok to put the virtual disks on the same LUN.
It is important to look not just at the virtual machine’s disk load, but at the load of the other virtual machines which will be sharing the LUN(s), as well as what those other servers’ disks are doing. If you have the logs from one server on a LUN you don’t want to put the data files from another virtual SQL Server onto that LUN, as you’ll have disk performance issues to contend with. For virtual database servers which have very high IO requirements you will want to dedicate LUNs for each of the virtual disks, assuming that you don’t use iSCSI or Raw Device Mappings (VMware) / Pass Through Disks (Hyper-V), just like you would in the physical world.
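One way to get a feel for whether a virtual server’s files are suffering on a shared LUN is to look at the average IO stall per file from inside SQL Server. This is a rough sketch; the stall numbers are cumulative averages since the instance started, not a live reading.

```sql
-- Average IO stall (latency) per database file, worst writers first.
SELECT DB_NAME(vfs.database_id) AS database_name,
       mf.physical_name,
       mf.type_desc,                                            -- ROWS or LOG
       vfs.io_stall_read_ms  / NULLIF(vfs.num_of_reads, 0)  AS avg_read_ms,
       vfs.io_stall_write_ms / NULLIF(vfs.num_of_writes, 0) AS avg_write_ms
FROM sys.dm_io_virtual_file_stats(NULL, NULL) AS vfs
JOIN sys.master_files AS mf
  ON mf.database_id = vfs.database_id
 AND mf.file_id = vfs.file_id
ORDER BY avg_write_ms DESC;
```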
Hopefully this helps clear some stuff up.