So a while back we were seeing some very strange behavior with our SQL Replication. Every once and a while for no apparent reason the log reader would just slow down the pulling of records from a single publisher database. Our replication was setup with a single publisher, and a single distributor with over a dozen publications all being sent to a couple of different subscribers.
At random times we would see the latency for all the publications for a single database start to climb, eventually being a few hours behind for no apparent reason. Looking in the normal places didn’t lead me to much. I looked at some execution plans, and saw a couple of performance issues there (with the Microsoft code) so I threw a couple of new indexes onto the MSlogreader_agents and MSsubscriptions tables (see below) and I also made a couple of tweaks to the sp_MSset_syncstate procedure to fix some of the pathetic code which I found within the procedure (you’ll also find this below).
This helped a little, but it didn’t solve the problem. What did was when I queried the sys.dm_os_waiting_tasks dynamic management view. This showed a large number of processes with a wait_type of TRACEWRITE, and these were waiting long enough that blocking was actually starting to pop up (very sporadically making it very hard to see). A query look at sys.traces told me that there were three traces running against the server. I knew that I didn’t have one running, so I took the session_id values which were shown in sys.traces and looked in sys.dm_exec_sessions for those session IDs to find out who needed to be kicked in the junk. Turns out that the traces were being run by Quest Software’s Spotlight for SQL Server Enterprise’s Diagnostic Server (the program_name column read “Quest Diagnostic Server (Trace)”).
So I logged into the diagnostic server’s via RDP, and opened Spotlight . Then edited the properties for the server which is our distributor. Then I opened the SQL Analysis window, and disabled the SQL Analysis for this server. Pretty much as soon as I clicked OK through the windows the TRACEWRITE locks went away, and the latency went from 2 hours down to 0.
This just goes to show, just how careful that you have to be when using SQL Profiler (or any sort of tracing) against your database server.
P.S. If you decide to make these changes to your distributor keep in mind that they may cause anything or everything to break, including patches, etc. that you try and install against the SQL Server engine. These changes were made for a distributor running SQL Server 2008 R1 build 10.0.1600, use against another build at your own risk. That said, here’s the code.
USE distribution GO CREATE INDEX IX_sp_MSget_new_errorid ON dbo.MSrepl_errors (id) WITH (FILLFACTOR=100) GO CREATE INDEX IX_sp_MSadd_logreader_history ON dbo.MSlogreader_agents (id) include (name, publication) GO CREATE NONCLUSTERED INDEX IX_sp_MSset_syncstate ON MSsubscriptions (publisher_id, publisher_db, article_id, subscription_seqno) include (publication_id) with (fillfactor=80) GO CREATE NONCLUSTERED INDEX IX_sp_MSset_syncstate2 ON MSsubscriptions (publisher_id, publication_id, sync_type, status, ss_cplt_seqno, publisher_db) include (article_id, agent_id) WITH (FILLFACTOR=90, DROP_EXISTING=ON) GO ALTER procedure sp_MSset_syncstate @publisher_id smallint, @publisher_db sysname, @article_id int, @sync_state int, @xact_seqno varbinary(16) as set nocount on declare @publication_id int select top 1 @publication_id = s.publication_id from MSsubscriptions s where s.publisher_id = @publisher_id and s.publisher_db = @publisher_db and s.article_id = @article_id and s.subscription_seqno < @xact_seqno if @publication_id is not null begin if( @sync_state = 1 ) begin if not exists( select * from MSsync_states where publisher_id = @publisher_id and publisher_db = @publisher_db and publication_id = @publication_id ) begin insert into MSsync_states( publisher_id, publisher_db, publication_id ) values( @publisher_id, @publisher_db, @publication_id ) end end else if @sync_state = 0 begin delete MSsync_states where publisher_id = @publisher_id and publisher_db = @publisher_db and publication_id = @publication_id -- activate the subscription(s) so the distribution agent can start processing declare @automatic int declare @active int declare @initiated int select @automatic = 1 select @active = 2 select @initiated = 3 -- set status to active, ss_cplt_seqno = commit LSN of xact containing -- syncdone token. -- -- VERY IMPORTANT: We can only do this because we know that the publisher -- tables are locked in the same transaction that writes the SYNCDONE token. -- If the tables were NOT locked, we could get into a situation where data -- in the table was changed and committed between the time the SYNCDONE token was -- written and the time the SYNCDONE xact was committed. This would cause the -- logreader to replicate the xact with no compensation records, but the advance -- of the ss_cplt_seqno would cause the dist to skip that command since only commands -- with the snapshot bit set will be processed if they are <= ss_cplt_seqno. -- update MSsubscriptions set status = @active, subscription_time = getdate(), ss_cplt_seqno = @xact_seqno where publisher_id = @publisher_id and publisher_db = @publisher_db and publication_id = @publication_id and sync_type = @automatic and status = @initiated and ss_cplt_seqno <= @xact_seqno OPTION (OPTIMIZE FOR (@automatic=1, @initiated=3, @publisher_id UNKNOWN, @publisher_db UNKNOWN, @xact_seqno UNKNOWN)) end end GO
I know that security people like to remove permissions from everything before certifying that a server is ready to go into production. And like 10+ years ago that was something that you might have wanted to do (I’m just talking about SQL Server here). However in today’s world of SQL Server 2005 and newer that isn’t needed. These newer versions are designed to take security much more seriously than before. The rights that are granted to public in the master and msdb databases should be left the hell alone.
If you are going to go around revoking database permissions that you don’t understand what do, don’t come to be complaining that your SQL Server isn’t working correctly. Guess what, those permissions where there for a reason, and should be left alone. If you have some out dated security mandate that says that all database permissions must be revoked from public before the server can be used, then you had damn well better understand what that means. And you should probably update your stupid security policy so that it reflects the changes that have been made in the product over the last 10 or so years. Even under SQL Server 2000 I didn’t ever recommend that people remove all the rights from public, ever. Not if you wanted the SQL Server to work as expected.
If you have decided to go and remove all the permissions, then you will probably want to install a new SQL Server, and find all the permissions there and grant them back. That or restore your master database back to a state from before you screwed it up, which is the same thing that I recommended to the person in the forum thread above do. If you intentionally break your SQL Server don’t expect much sympathy from me.
So just in case you missed it, I took and passed the Microsoft MCM lab last Thursday. Since I’m the first person to go through the test in the new format I wanted to put together some comments (as best I can thanks to NDAs).
So the MCM gives you 6 hours to complete, and when you take it, you’ll need most if not all of the time.
I basically finished in a little over 4 hours, then went back and reviewed a couple of the scenarios which took me until about the 5 hour mark. The exam is made up of several smaller scenarios which are independent and each of which will test your knowledge of one or more parts of the SQL Server engine.
As I mentioned in my prior blog post you will need to know all aspects of the SQL Server engine very well. After taking the lab I can say that I feel this even more now. With the time limitation you need to know the production very well, as well as where to find useful information in Book OnLine. The lab is a closed book exam, and the only reference that you have access to is Books OnLine.
When you do decide to take the exam, don’t expect to get your results back as quickly as I did. Currently Microsoft only has one Lab Exam environment to do the tests on, and that environment had to be reset after I took it for another person so my exam had to be scored very quickly. Once the lab is released for general use scoring will take about 30 business days as the exam will be scored by hand to account for creative solutions to the scenarios as well as to account for creative ways that people may try and cheat on the exam.
I didn’t realize just how worried about taking the exam I was until I was done, and logged off of the environment. I could actually feel the stress just leave once I finished the exam. And this was before I knew if I had passed. I knew that I would get my result quickly because of the next person taking it so quickly, and I knew that if I passed I would be thrilled, but I didn’t realize just how happy I would be to have passed.
When you take the exam, you can’t take any water or food into the exam room, but you can leave some in a locker outside the exam room. You’ll want to bring food with you to eat and drink. Getting hungry during the process sure won’t help you pass the exam, and it’ll just distract you. Just keep in mind that snack breaks do count against your 6 hour window.
Personally I didn’t do a whole lot of studying (I talked about my professional experience in my last post). I watched most of the MCM videos that SQL Skills put out. I skipped ones that were on topics that I can speak on in my sleep like storage, Service Broker, etc just because I felt that I knew that material the best. I focused on the indexing internals, and the security pieces as to me those are the most complex parts of the database engine to me.
Hopefully this helps you on your trip though the MCM process.
OK, that’s totally BS, but it sounds awesome doesn’t it.
But it is sort of true. I was grabbing the URL for my book to send to someone today, and I noticed that the Kindle version had a price listed. The book is $39.96 at the moment (I don’t set the price), so all of you that were waiting for the Kindle version, your wait is over. Amazon does have the paperback in stock again ($32.64 at the moment) if that is your preference.
All to often when I give a SQL Service Broker presentation, or when I’m talking to people about it they want to know what the odds are that SQL Service Broker will be removed from the platform like how Notification Services was ripped from the product after just one release.
So here’s just some of the SQL Server features which would need to be re-architected if SQL Service Broker was removed from the SQL Server product.
- Database Mail – Introduced in SQL Server 2005
- Database Mirroring – Introduced in SQL Server 2005
- Always On (HADR) – Introduced in SQL Server “Denali”
There is some other stuff which uses SQL Service broker, mainly BizTalk can use it, and I’ve heard rumors that Microsoft CRM will start using it at some point (if it doesn’t already). I’m sure based on the way the SQL Server product team has been going that even more new features will be based on SQL Service Broker.
So if you were worried about using SQL Service Broker because you’ve heard that not many people are using it, don’t be. The feature isn’t going anywhere.
Apparently people still haven’t figured out that taking backups of SQL Server databases is actually a requirement of having a SQL Server database. There was a conversation recently on Twitter about yet another forum question where the OP talked about the crashed database that they had without any backups what so ever.
Some people don’t seem to understand that server hardware failing isn’t a question of if the hardware will fail, but rather when the hardware will fail. If you don’t have backups (that you have verified will work) then you haven’t done what needs to be done to protect yourself from WHEN that hardware fails. If you’ve been working with SQL Server for a while now and you haven’t had a hardware failure, then remember one simple thing… That hardware failure is coming, and when it comes to kick you in the junk, it is going to hurt.
But I work for a small company, and we can’t afford to do backups.
If you work for a small company, then doing backups if even more important. If you are a small company, can you afford to lose all of the data within your SQL Server database? How much money will it cost the company to rebuild all that data? How much money will it cost the company if you loose all of the contacts that are stored within the database? Will the company that you work for go out of business if the data within the database is lost? Will you as the systems admin or database administrator be fired if the data within the database is lost?
Most of those questions I can’t answer for you, except for the last one. If I was your manager, and you weren’t backing up the database that ran our business, and that system failed you would be out on the street in a heart beat. Odds are as your manager I’ll be getting fired about an hour after you, since I didn’t make sure that you were backing up the data which runs the company.
But I don’t no how to do backups.
Setting up backups doesn’t need to be all hard and complex. For companies with just a couple of database servers, you don’t need any sort of large complex backup solution, or hard written scripts. Using the database maintenance plan wizard will work just fine for database backups. People in large companies don’t like using it, and myself and other MVPs complain about it, because it doesn’t scale when you have a lot of servers. But as far as the actual backups them selves, it’ll work just fine for you.
Assuming that you need to be able to loose as little data as possible, you’ll want to put your databases into full recovery mode (not the system databases, just your user databases). Then setup full backups to run either daily or weekly, with transaction logs running at regular intervals. This is where you need to ask your manager or the company owner how much data you can afford to loose. This will tell you how often you need to setup the transaction log backups.
To setup all these backups you’ll want to setup two maintenance plan packages (I’m talking about SQL Server 2005 and newer here, I’ll talk about SQL 7 and SQL 2000 in a minute). One to do the full backups and one to do the log backups. In both maintenance plans use the “Back Up Database Task”. In the plan which will do the full backups, change the backup type to “Full” and change the Database(s) option to “All databases”. Change the backup folder to a folder on the local server which doesn’t have any databases on it. If there aren’t any drives on the server that don’t have any databases on it, change it to a network path on another server, like a file server. If that isn’t an option a large USB drive is better than nothing. Schedule the job to run daily.
For the transaction log backup maintenance plan do the same thing, but change the “Backup type” option to “Transaction Log” and change the Databases to “All users databases (excluding master, model, msdb, tempdb)”. Change the path to the same one as the full backups, or a similar one. Schedule the job to run every 15 minutes (or whatever schedule is needed based on discussions with management). You’ve now got a SQL Server backup, so that when the server fails you can restore.
You also need to clean up the backups, so that you aren’t keeping 5 years of backups on disk. In the full backup maintenance plan add a “Maintenance Cleanup Task”. Point this to the same folder, and have it delete files based on the file age (the bottom check box). I recommend keeping at least 1 week or data on disk.
In SQL Server 2000 you just need to run through the maintenance plan wizard. This wizard will allow you to select which databases you want to backup via both full and log backups. Once this wizard has been completed there will be a couple of different jobs which are created to do these backups. In some respects it is easily to setup the backups via the maintenance plans in SQL 2000 as you get a nice quick and easy wizard to run through. SQL 2005 and up don’t really give you this feature as the wizard in SQL 2005 and up really sucks. When going through this wizard do NOT select the automatically fix index problems checkbox. You want to be notified of problems, but not be fixing the problems automatically.
But I backup the server using backup exec or something that backs up all the files.
In other words you aren’t doing any sort of backups. Backing up the SQL Server database files from in Windows means that you aren’t doing anything useful for a couple of reasons.
- SQL Server keeps the data files locked so that other applications can’t access them and corrupt the databases. Because of this when a backup application comes through to backup the files it can’t and you get corrupt backups, which are in other words no backups.
- Even if the files weren’t locked, the backups would still be useless because the data file and the log file wouldn’t be backed up at the same time. With SQL Server the files must be in sync with each other. If you are backing up the data file and log file, the data file would probably be backed up first, then when it is backed up the log would be backed up. So the timestamp of the backup of the data file would be several minutes or hours (depending on the size of the data file) before the log file. Depending on the amount of data change, the data file might not be consistent with it self.
If I haven’t convinced you yet that you should start doing backups (remember that part above about losing your job, and if they fire you for not backing up the databases I wouldn’t count of a good recommendation when you are job hunting), what will it take to get you to start actually doing it? Now here I’m being serous, comment here, post a blog post, tell me on twitter (@mrdenny), whatever just let me know what it’ll take to get you to start doing the backups.
So I drug myself out of bed this morning to an email from Sean McGown (Blog | Blog | Twitter) saying that Amazon had just told him that they would be shipping my book. So I hopped over to the Amazon page for the book, and low and behold it is no longer listed as pre-order. In fact there was another notice up there, that they only have three copies left in stock.
This means that either Amazon either ordered just enough copies to cover the pre-orders, or they ordered a heck of a lot of copies and the SQL Server community bought them like crazy. Personally I’m hoping for the second reason and not the first. But when I look at the sales graph that Amazon shows me, it might actually be the second.
If you look at the graph to the right (I’m a DBA, we love data) you can see how the book has been ranked on Amazon’s best seller list over the last month. What I really like is that yesterday, on my birthday, the book breached the 100,000 mark for just the second time. The first time was back on October 22, 2010 which the book was ranked #99,971 overall. Well the rank yesterday was #49,111. And unlike every prior spike, this wasn’t a one day spike. There were actually two days of positive climbing in the rankings. Does this tell me that the book has been selling like crazy? No, well sort of no. The book rankings are based on daily sales, and as you get closer to being #1, it takes more and more sales to move up the rankings.
There’s no specific sales data available for the book yet that I can see, that all comes from BookScan (a company that gathers sales info from all the major retailers) so it’ll take a week or two for data to show up, but at least I can see some general info about the book so I’ve got a general idea.
Now I get to move into another party of the book writing phase that I like to call “Please tell me you liked it”. Now I have to sit around and wait for the revues to be posted about the book. Personally I’m pretty proud of it, I should be proud of it I wrote the thing after all, but I hope that you the SQL Server community at large like it, and more importantly that you (or someone you know) find it useful. If just one (or maybe two) people are able to better protect their SQL Server data because of the book I’m going to call it a success.
I think I’ve rambled enough for the day. I’ve got to get back to my MCM study.
As many of you know, I took the Microsoft Certified Master (MCM) knowledge exam back in November, and received notice that I passed back in December. The MCM knowledge exam is the first of two tests that you have to pass in order to receive the MCM certification, after you take and pass the MCITP database administrator and MCITP database developer exams of course. The second of the exams, is a 6 hour lab which is supposed to be just as hard as the knowledge exam, if not harder.
I’ve been asked a couple of times what my test taking experience was like, how much experience I have as a DBA, etc. So that’s pretty much what I’m going to talk about in here. I won’t be talking about the kinds of questions, or whats on the test as that would be an NDA violation, but this may give you some ideas about the experience it self.
The testing center
When you go and take the MCM exam, you can’t just go to any testing center and take it. You have to go to a special high security testing center. Normally when you take an MSL (Microsoft Learning) exam you just turn off your phone and leave it in your car. At these facilities you take nothing with you, except for your drivers license, which is to remain on the desk at all times, and your little white board for notes that you want to take during the exam. Everything else, wallet, keys, etc. goes into a locker which you have a key for with you (also sitting on the desk). IDs are checked every time to enter and exit the testing room (for a bathroom break, etc). When you go take the test, expect an almost TSA like inspection. OK they don’t actually pat you down, but you do have to pull your pockets out to show they are empty.
The test it self
The test itself is hard, very hard; as well it should be. The goal of the exam is to weed out the people that don’t have the requisite knowledge and experience at the level expected. I’ve been told that this exam is actually harder than the prior one that the people that went through the class room training took. I’ve got no idea if this is true or not, but either way, the test was very tough. I had no idea if I had passed or not until I got the email stating that I had passed (people that aren’t in the beta program for the exam probably won’t get an email, but will instead just get the notification from Prometric).
My SQL Server Experience
A lot of experience with the SQL Server database engine is definitely a requirement to passing the MCM Knowledge Exam. Personally I’ve been using Microsoft SQL Server since SQL Server 6.5, although most of my experience has been with SQL Server 7.0 and beyond. I really started getting into SQL Server back in 1998 or 1999, so I’ve got about 12-13 years experience with the product as both a database administrator as well as being a database developer. A long amount of experience is, in my opinion, absolutely critical to passing the MCM exam. Without it you won’t have the breadth of knowledge of all the various features which the MCM knowledge exam is going to test you on. The MCM exams which Microsoft has released (and were recorded by SQL Skills) are a good starting point for people that don’t have a similar level of experience, however they are not all that you need. A working knowledge of the production from practical on the job experience using all the features for several years will greatly increase your odds of passing.
If you plan on taking the MCM exam, be very sure that you know your stuff, well … very well. Several people that know SQL Server well have taken the MCM knowledge exam and have not yet passed the exam. This should drive home the point that you need to know the database engine very well before taking it.
Syngress, my publisher, is doing a one day sale for my book Securing SQL Server. If you order from them directly and use discount code 50170 you’ll get 50% off the price, bringing the price down to just $24.98. Just click through to the Syngress order site and use the discount code 50170 during check out.
This 50% off code is only good today Feb 1, 2011 so ACT NOW.
UPDATE: To order the paperback, click the “Buy Now” link with the arrow under Paperback and select “Elsevier” from the list. On the new page add the book to your cart, click the cart in the upper left and use the code there.
UPDATE2: I’ve been told that if you use Chrome, it will say the site is not secure. This is apparently because some stuff isn’t protected by HTTPS even after you change the URL to HTTPS.
So this last weekend was the 13th SoCal Code Camp (the 6th at this location). I presented three different sessions at this code camp. My slide decks can be downloaded below.
If you attended the sessions please rate my presentations on SpeakerRate.com/mrdenny.