All too often developers need to force locks on a table so that they can be sure the records aren't going to change between the time they first look at them and the time the transaction completes. The most common method I've seen involves running a SELECT statement at the top of the transaction against the table with the UPDLOCK or XLOCK hint, which forces the database engine to take higher-level locks than it normally would. While this does have the desired end result of locking the table, it causes a lot of unneeded IO to be generated, and takes a lot more time than is needed.
For example, let's assume that we want to lock the Sales.Individual table in the AdventureWorks database so that we can do some processing on it without allowing anyone else to access the table. If we issue a SELECT COUNT(*) FROM Sales.Individual WITH (XLOCK, TABLOCK) against the database we lock the table as requested, however it generates 3106 physical reads (including read-ahead reads) against the database, as we can see below in the output from the Messages tab.
SET STATISTICS IO ON
DBCC DROPCLEANBUFFERS -- assumed step: clears the buffer cache so the physical reads show up (and is what produces the DBCC message below)
BEGIN TRANSACTION
SELECT COUNT(*) FROM Sales.Individual WITH (XLOCK, TABLOCK)
DBCC execution completed. If DBCC printed error messages, contact your system administrator.
(1 row(s) affected)
Table ‘Individual’. Scan count 1, logical reads 3090, physical reads 8, read-ahead reads 3098, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
If we look at the sys.dm_tran_locks DMV we'll now see that we have taken an exclusive lock against the table (don't forget that you have to query the DMV within the transaction in order to see the lock). That is a lot of IO just to take a single lock within the database engine. You can imagine what would happen if this were a much larger table, say a fact table within a data warehouse. A large multi-year fact table could end up generating millions of IOs just to lock the table.
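If you want to see the lock for yourself, a quick check like the one below (run inside the transaction) will show it; the filter on @@SPID just limits the output to the current session.

SELECT resource_type, request_mode, request_status
FROM sys.dm_tran_locks
WHERE request_session_id = @@SPID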
A better solution to this sort of problem is the sp_getapplock system stored procedure. This procedure allows you to take table-level locks without running queries against the table, and it can lock tables which are gigabytes in size in under a second. When we run the command telling it to lock the Sales.Individual table, no IO is generated and yet we still see the object being locked. In this case we would run the below command to generate the needed lock.
BEGIN TRANSACTION
exec sp_getapplock @Resource='Sales.Individual', @LockMode='Exclusive'
The only difference we should see in the output of the sys.dm_tran_locks DMV is that the value in the resource_type column has changed from OBJECT to APPLICATION. Once the lock has been taken against the database we can do all the processing that we want to against the table without having to worry about another user coming in and changing the base data of the table.
The sp_getapplock procedure must be run within an explicit transaction, and has several parameters so that you can control what it is doing.
The first parameter is @Resource which we used above. This parameter is how you tell the stored procedure what object you wish to lock. It accepts the input as schema.object or just the object if the object is within your default schema. It is recommended that you use the two part name to ensure that you are always locking the correct object.
The next parameter is @LockMode, which we also used above. This parameter allows you to tell the database engine what lock level you want to take. Your options are Shared, Update, IntentShared, IntentExclusive, and Exclusive; any other value will throw an error.
The third parameter is @LockOwner. This parameter allows you to tell the stored procedure to take the lock for the duration of the transaction (the default, specified with a value of "Transaction") or the duration of the session (specified with a value of "Session"). When the value of "Session" is used the procedure does not need to be called within a transaction; when "Transaction" is used, or no value is specified, the procedure must be called within an explicitly defined transaction.
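As a quick sketch of a session-owned lock, the call looks like the following; no transaction is required, and the lock is held until it is explicitly released or the session disconnects.

exec sp_getapplock @Resource='Sales.Individual', @LockMode='Exclusive', @LockOwner='Session'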
The fourth parameter is @LockTimeout. This parameter allows you to tell the procedure how many milliseconds to wait before returning an error when attempting to take the lock. If you want the procedure to return immediately then specify a value of 0. The default value for this parameter is the same as the value returned by querying the @@LOCK_TIMEOUT system function.
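Since sp_getapplock reports the outcome through its return code (0 or 1 means the lock was granted, negative values mean it wasn't), a timeout is easiest to handle by capturing that return value. A minimal sketch:

BEGIN TRANSACTION
DECLARE @result int
-- @LockTimeout = 0 returns immediately instead of waiting for the lock
EXEC @result = sp_getapplock @Resource='Sales.Individual', @LockMode='Exclusive', @LockTimeout=0
IF @result < 0
BEGIN
    -- -1 = timed out, -2 = canceled, -3 = deadlock victim, -999 = parameter or other error
    ROLLBACK TRANSACTION
END
ELSE
BEGIN
    -- do the protected work here, then COMMIT to release the lock
    COMMIT TRANSACTION
END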
The fifth and final parameter is @DbPrincipal. This parameter allows you to tell the procedure the name of the user, role, or application role which has rights to the object. Honestly I haven't really figured out what this parameter is used for. What I do know is that if you specify a user, role, or application role which doesn't have rights to the object, the procedure call will fail. This parameter defaults to the public role; if you get an error when using the default value, create a role with no users in it, grant the role rights to the object, then specify the role within the parameter (a sketch of this is shown below). No users need to be assigned to the role to make this work.
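Based on the workaround just described, a sketch of the empty-role approach might look like this (LockingRole and the SELECT grant are hypothetical; grant whatever rights fit your object):

CREATE ROLE LockingRole
GRANT SELECT ON Sales.Individual TO LockingRole
-- no users need to be added to the role
BEGIN TRANSACTION
EXEC sp_getapplock @Resource='Sales.Individual', @LockMode='Exclusive', @DbPrincipal='LockingRole'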
Releasing the lock that you've just taken can be done in a couple of different ways. The first is the easiest: commit the transaction using COMMIT (ROLLBACK will also release the lock, but you'll lose everything that you've done). You can also use the sp_releaseapplock system stored procedure. The sp_releaseapplock procedure accepts three parameters: @Resource, @LockOwner, and @DbPrincipal. Simply set these to the same values which you used when taking the lock and the lock will be released. The sp_releaseapplock procedure can only be used to release locks which were taken using the sp_getapplock procedure; it cannot be used to release traditional locks that the database engine has taken naturally, and it can only be used to release locks which were created by the current session.
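For example, releasing the session-owned lock taken earlier would look something like this:

exec sp_releaseapplock @Resource='Sales.Individual', @LockOwner='Session'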
Hopefully some of this knowledge can help speed up your data processing times.
Besides the insanity? It’s fun. If it wasn’t, I wouldn’t do it. God knows it isn’t for the money.
Yeah the presentations can be a pain to come up with, and coming up with topics to present on is probably my least favorite part of doing all this. But sharing the information that I know, and being able to learn more through it thanks to my NDA is just awesome.
What really makes all this work worthwhile is when I get emails from people saying that the information that they learned from an article or session helped them with their job, or to fix a problem that they were having.
And that right there is why I do it.
When doing a database restore, if you want to move the physical database files from one disk to another, or from one folder to another, you need to know the logical file names. But if you can't restore the database, how do you get these logical file names? By using the RESTORE FILELISTONLY syntax of the RESTORE command.
The syntax is very simple for this statement.
RESTORE FILELISTONLY FROM DISK='D:\Path\To\Your\Backup\File.bak'
The record set which will be returned will give you the logical names, as well as the physical names of the database files which you can then use within the RESTORE DATABASE command.
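For example, once you have the logical names, a RESTORE DATABASE with the MOVE option might look like this (the database name, logical names, and target paths here are hypothetical):

RESTORE DATABASE AdventureWorks
FROM DISK='D:\Path\To\Your\Backup\File.bak'
WITH MOVE 'AdventureWorks_Data' TO 'E:\SQLData\AdventureWorks.mdf',
     MOVE 'AdventureWorks_Log' TO 'F:\SQLLogs\AdventureWorks_log.ldf'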
If you have 12 disks to hold DB Data files and decide to use RAID10, would you create 1 RAID 10 group or 2 RAID 10 groups made of 6 disks each for best read/write performance?
I would probably make two RAID 10 groups: one for the data files, and one for the transaction log. Without knowing what percentage of the data access will be reads and what will be writes I'm just guessing here. Depending on the load, RAID 5 may work just fine for the data files.
Running on an EMC CLARiiON CX4 – Windows reports the disks are not aligned, but the SAN admin says that because of caching, the partition alignment from Windows does not matter (and the SAN is set up per "best practices"). Is this true?
Caching has nothing to do with disk alignment. It sounds like your SAN admin should have gone to my SQL PASS pre-con. All the caching does is accept the writes from the host into DRAM instead of writing them to the disk directly.
Now if the LUN is aligned on the array by setting the offset on the array side (which isn’t recommended as it makes LUN migrations within the array more difficult) then you want to leave them misaligned in Windows. If however they are setup with a 0 offset within the array (which is the default) then they need to be aligned within Windows.
Yes, provided that your transaction logs have their own LUN separate from the other files, as the write cache is enabled and disabled at the LUN level. By default, read and write cache will be enabled for every LUN which is created on the array.
There aren't too many cases where you would want to disable the write cache on a LUN, except for maybe a data warehouse LUN where no data is updated and only new rows are written. The reason is that these will be sequential writes, and the array will bypass the write cache when it detects that sequential writes are being done. Sequential writes can be done directly to disk about as quickly as they can be done to cache: once the head gets into the correct place the writes are put down very quickly, as the head and the spindle don't need to move very far between each write operation.
A question that comes up when building a new virtual SQL Server is how the disks should be laid out when using the default virtual disks, VMDKs (VMware) or VHDs (Hyper-V). Should the disks be on a single LUN, or on different LUNs, etc.?
I’m sure that it will surprise no one when I say that it depends. On a virtual database server where the disk IO load is high you will want to separate the virtual disks out just like you would in the physical world. If the virtual database server has low or minimal IO then like in the physical world it may be ok to put the virtual disks on the same LUN.
It is important to look not just at the virtual machine's disk load, but at the load of the other virtual machines which will be sharing the LUN(s), as well as what those other servers' disks are doing. If you have the logs from one server on a LUN, you don't want to put the data files from another virtual SQL Server onto that LUN, as you'll have disk performance issues to contend with. For virtual database servers which have very high IO requirements you will want to dedicate LUNs to each of the virtual disks, just like you would in the physical world, assuming that you don't use iSCSI or Raw Device Mappings (VMware) / Pass-Through Disks (Hyper-V).
Hopefully this helps clear some stuff up.
If you have a SQL Server and you are doing transaction log backups on the database server, you'll notice that 99% of the messages in your ERRORLOG file are worthless. All they say is that the log was backed up successfully. Personally I don't care that every 15 minutes all 30 databases had their logs backed up without error; I only care about errors or problems with the backups. I can easily enough write a script to query the msdb database and figure out when the last good backup was (see the sketch below).
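A minimal sketch of such a script, using the msdb.dbo.backupset table (a type of 'L' is a transaction log backup):

SELECT database_name, MAX(backup_finish_date) AS last_log_backup
FROM msdb.dbo.backupset
WHERE type = 'L'
GROUP BY database_name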
Fortunately there is a trace flag which can fix this little problem: trace flag 3226. Simply add -T3226 as a startup parameter for the SQL Server instance, restart the instance, and SQL Server will suppress the backup succeeded messages from being written to the log. Any errors which are thrown by the backup processes will still be logged as expected. This makes the logs not only much easier to read, but also much, much smaller.
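If you'd rather try it without a restart first, the trace flag can also be turned on globally at runtime; it will only last until the next restart unless it is also added as a startup parameter.

DBCC TRACEON (3226, -1)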
As far as I can tell this trace flag works on all versions from 2000 to “Denali”.
Many thanks to whoever it was at the PASS Summit that pointed this little bit of awesomeness out to me.
I’m a couple days late getting this post out, thanks to being at EMC World, but here it is.
This year PASS is asking for your help in deciding which sessions should be presented at the PASS Summit in Seattle later this year. To do this they have created the Session Preference Tool, which will allow you to mark the sessions you think you would like to attend. The selection committees will be taking the numbers from this tool into account when making their selections.
Hopefully you’ll like the session abstracts that I’ve submitted and vote for them.
Yesterday was EMC World day 2 and it was another great day at the conference. I started the day a little late as I was out pretty late on Monday night at a party, but I was able to get some great sessions in during the day.
The first session that I hit was "SQL Server on VMware – Architecting for Performance", which was a bit of a letdown. The first half of the session was mostly a SQL Server consolidation 101 session, and I didn't agree with a lot of the points the speaker made about solution design. Some examples include her recommendation to set max server memory at ~500 megs below the memory allocated to the VM; personally I feel that the max server memory setting should be set about 2-4 gigs below the amount of memory allocated to the VM (depending on what other software is installed on the VM, how much SQLCLR is used, etc.). There were also recommendations to enable lock pages in memory on all servers and to disable the balloon driver, which I didn't agree with either.
The second session that I went to was "VNX Block Oriented Performance", which was a great session. During this 500-level session the speaker talked about the hardware layout of the new EMC VNX storage array, specifically how much data can be pushed through each internal component of the VNX and VNXe storage arrays. I'm not going to put the numbers in this post as I want to get the deck downloaded from the EMC World website so I can double-check all the numbers first. With all the info the speaker was giving out there was no way I could type it all fast enough on my laptop, much less on my iPad, which I was using to take notes.
The third session that I went to had a crazy long title; in my notes I titled it as "Building a highly available enterprise data warehouse using a bunch of shit". The session was all about using the EMC Greenplum database to build a distributed data warehouse and leveraging some of their other products like the VMAX, Data Domain, and SRDF replication for DR and backup processes. If you don't know what Greenplum is, it is a very scalable data warehouse product which is based on the PostgreSQL platform. The system is configured as a fully redundant system which is scaled out by adding more x86 servers into the farm. The system scales easily into the petabyte range, and EMC says they have several customers with multi-petabyte databases running within Greenplum. The nice thing about Greenplum is that it comes as a software package you can install on your own hardware, but also as a preconfigured appliance in a full rack. The next version will allow you to chain multiple appliance racks together to create a massive Greenplum appliance farm. It posts some pretty impressive data load and data query rates, and I would love to put it side by side with the Microsoft Parallel Data Warehouse to see how they stack up against each other as far as load times and data processing times, but I'm guessing that neither company will loan me one of these massive devices for a few weeks.
After the sessions were done came the hard work: figuring out which parties to attend that night. I started with a dinner that my great VAR, Ahead IT, threw. From there I moved to the Emulex party and ended the night at the Brocade party. Don't get me wrong, I wasn't a perfect angel at these parties, but I did take it easy as I have to be able to do this all again next week at Tech Ed, and there is the official EMC party tonight.
I've run out of time to blog as I've got to get to another session, so I'll wrap up here. Sorry for any spelling errors and the lack of links; I'm using my iPad to write this and may not have caught all the problems.