Welcome to the first day of storage week of SQL University. Over the next week we'll be covering different topics at different levels, but hopefully the information is useful to everyone.
One of the big reasons that companies pay for storage arrays is that storage (LUNs) can be expanded quickly and easily with no downtime. This raises the question: how do you expand your storage when you are using local disk, with little to no downtime?
So assume that you've got four 72 GB spindles in your server and you want to upgrade these to 400 GB spindles. You've got two options: the first requires a lot of downtime, the second requires basically none.
With a big outage
If your RAID card doesn't support online expansion (the documentation for the RAID card will tell you), then you'll have to take a decent outage to do this. You'll need to connect another hard drive, like a large USB drive, stop all the services, and copy all the data off of the disk onto the USB drive (backing everything up to tape will work as well). Then shut down the server, remove the old drives, put in the new drives, and create a new RAID array. Boot Windows back up, create the partition (don't forget to align the partition), copy the data back to it, and you are up and running.
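If you're creating the aligned partition from diskpart, the session looks roughly like this. This is a sketch only: the disk number (1), the 1024 KB offset, and the 64 KB allocation unit are assumptions, so check "list disk" output and your storage vendor's alignment recommendations first (and note that running "format" from inside diskpart requires Windows Server 2008 or later; on 2003 use format.exe with /A:64K instead).

list disk
select disk 1
create partition primary align=1024
assign letter=E
format fs=ntfs unit=64K quick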
With a really small outage
If you have a RAID card that supports online expansion (the documentation for the RAID card will tell you), then you can do this with little to no downtime. First, replace each disk one at a time, leaving enough time for the RAID array to rebuild between swaps so that you don't lose any data. Once all four disks have been replaced (it will take 2-3 days to get to this point), either open the RAID card's management tools, or reboot into the RAID card's BIOS, and have the RAID card expand the virtual disk (or whatever the manufacturer calls it). Once that process is done, bring Windows back up if you had to reboot into the BIOS, then open diskpart and use it to extend the volume.
This is done by opening diskpart and typing in the following:

list volume
select volume n
extend

Use "list volume" to find the volume number, then "select volume n" to select the correct volume, where n is the number shown in the output of "list volume" (the "extend" command operates on the currently selected volume, not the disk). The "extend" command then extends the volume to fill the newly available space on the disk.
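If you need to repeat this on several servers, diskpart can also run non-interactively from a script file. A sketch, assuming the volume to grow shows up as volume 2 (confirm that against the "list volume" output on each server before running it):

rem extend.txt -- run with: diskpart /s extend.txt
list volume
select volume 2
extend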
Something which was talked about at the last SoCal Code Camp was running a bus from SoCal to Phoenix to help our two communities get to know each other better, and to give speakers and attendees easier access to more events.
So along those lines, a bus has been set up to take people from SoCal to the next Desert Code Camp, and registration is open to get a seat on the bus. If you plan on taking the bus, don't wait until it's too late to sign up. If enough people don't sign up for the bus, it'll be canceled.
If you plan on attending be sure to get signed up soon, and we’ll see you in Phoenix.
Microsoft is putting on a great free three-day online class for VMware professionals who need to learn more about Hyper-V.
Day 1 will focus on “Platform” (Hyper-V, virtualization architecture, high availability & clustering)
Day 2 will focus on “Management” (System Center Suite, SCVMM 2012 Beta, Opalis, Private Cloud solutions)
Day 3 will focus on “VDI” (VDI Infrastructure/architecture, v-Alliance, application delivery via VDI)
The class is being taught by two top-notch presenters: Microsoft Technical Evangelist Symon Perriman (a good friend of mine, who knows his stuff) and leading Hyper-V, VMware, and Xen infrastructure consultant Corey Hynes.
Each of the days is a separate event, so you need to sign up for each one separately. I've made each of the days a separate link to make your life easier; just click through each link and register. The only downside to this amazing training opportunity is that it is happening during the Dev Connections conference, so if you are attending Dev Connections in April 2011 you won't be able to take advantage of this amazing FREE (did I mention FREE?) training event. As I'll be at Dev Connections I won't be able to make it, which is something I'm really bummed about.
In case you missed all the talk about SQL Excursions yesterday, I’ll recap here since a lot of people read blogs but aren’t on Twitter.
SQL Excursions is designed to provide fun, educational SQL Server training events. The excursions will be held in beautiful locations across the US (and eventually worldwide). Each excursion will also have social events for the session attendees as well as spouses, partners, guests, etc., so that guests can come along on the excursion and not be bored by the technical sessions.
Each excursion will have top-notch speakers, typically Microsoft MVPs and/or Microsoft Certified Master recipients. As each excursion is announced, the speaker and session outline will be posted as well. What will make this training different from other kinds of training is that while a basic outline will be posted with what the speakers are planning on speaking about, the exact content will be voted on by you, the session attendees. This will ensure that you receive the sessions that will be of the most use to you in your day-to-day work life.
If you are on Facebook, I've got a page there, and if you are on Twitter there's the official SQL Excursions account (@SQLExcursions). As each excursion is ready to be announced (there's a bunch of legal stuff which I need to get out of the way first), it'll be announced on Facebook and Twitter as well as on the SQL Excursions web page. There's also a newsletter which you can sign up for on the home page of the SQL Excursions web page so that new information will be sent directly to your inbox.
If you were planning on attending the SSWUG virtual conference "DB Tech Con", which is happening April 20-22, 2011, and you'd like to save $30.00 off the cost of the conference, have I got a deal for you. When you sign up using the discount code SP11DBTechDC you'll be given a $30 discount off of the current price.
You can sign up on the normal registration page and enter the code SP11DBTechDC into the VIP Code box then press the “Update Registration” button.
Join me Wednesday, March 9th at 8am Pacific (11am Eastern) as I join Marathon Technologies to present a webcast titled "Controlling SQL Server Sprawl: The Consolidation Conundrum and Availability Imperative". During this session I'll be talking about some of the benefits and risks of consolidating SQL Server databases and instances.
If you have a SAN RAID 5 LUN for your data and a SAN RAID 10 LUN for your logs, and say a local RAID 1 array for the C drive in the server itself, where do you usually install the SQL Server binaries to?
In this sort of configuration, which is pretty normal actually, I would put the SQL Server binaries on the C drive.
The last two weekends I’ve presented at two different SQL Saturday events, so as I’m pretty lazy I’m posting the decks for both weekends in a single blog post.
SQL Saturday 47 – Phoenix
- Back to Basics; Getting Back to the Basics of SQL Server
- SQL Server Clustering 101
- Where should I be encrypting my data
SQL Saturday 65 – Vancouver, BC
Now the slide decks for my encryption session are the same for both events, just with different templates, but I wanted to upload what people actually saw.
So a little while back we noticed that we had some high CPU load coming from a single stored procedure in the session state database. This single procedure (TempUpdateStateItemLong) was taking up 80% of the CPU time that the database was using (we found this using Spotlight for SQL Server Enterprise from Quest Software). But in another session state database that same procedure was down in the single digits, so something must have been different between them.
I opened them both up in SSMS and the code for the procedures was identical (as you would expect), but there was something different. The procedure that had the really high CPU % times was compiled with the SET QUOTED_IDENTIFIER ON setting, while the procedure that had the really low CPU % times was compiled with the SET QUOTED_IDENTIFIER OFF setting.
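If you want to check this without opening each procedure in SSMS, the compile-time setting is exposed in sys.sql_modules. A quick sketch, assuming the ASP.NET session state procedure named in this post:

```sql
-- uses_quoted_identifier: 1 = compiled with SET QUOTED_IDENTIFIER ON, 0 = OFF
SELECT OBJECT_NAME(object_id) AS procedure_name,
       uses_quoted_identifier
FROM sys.sql_modules
WHERE object_id = OBJECT_ID('dbo.TempUpdateStateItemLong');
```

The same view also exposes uses_ansi_nulls, so you can audit both settings across every module in the database in one query.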
I have no idea why there was a difference, but I changed the one which was ON to OFF and pushed the procedure into the database. As soon as I did, the CPU % for that procedure dropped down into the single-digit range where it should have been.
Let this be a lesson: those settings definitely matter. And don't trust that they are correct, even in Microsoft-provided code like session state.
So a while back we were seeing some very strange behavior with our SQL replication. Every once in a while, for no apparent reason, the log reader would just slow down pulling records from a single publisher database. Our replication was set up with a single publisher and a single distributor, with over a dozen publications all being sent to a couple of different subscribers.
At random times we would see the latency for all the publications for a single database start to climb, eventually falling a few hours behind for no apparent reason. Looking in the normal places didn't lead me to much. I looked at some execution plans and saw a couple of performance issues there (with the Microsoft code), so I threw a couple of new indexes onto the MSlogreader_agents and MSsubscriptions tables (see below), and I also made a couple of tweaks to the sp_MSset_syncstate procedure to fix some of the pathetic code which I found within it (you'll also find this below).
This helped a little, but it didn't solve the problem. What did was querying the sys.dm_os_waiting_tasks dynamic management view. This showed a large number of processes with a wait_type of TRACEWRITE, and these were waiting long enough that blocking was actually starting to pop up (very sporadically, making it very hard to see). A quick look at sys.traces told me that there were three traces running against the server. I knew that I didn't have one running, so I took the session ID values shown in sys.traces and looked in sys.dm_exec_sessions for those session IDs to find out who needed to be kicked in the junk. It turns out that the traces were being run by Quest Software's Spotlight for SQL Server Enterprise Diagnostic Server (the program_name column read "Quest Diagnostic Server (Trace)").
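If you want to track down trace owners the same way, something like the following sketch joins the two views in one step. Note the assumption here: reader_spid is only populated for rowset (client-side) traces, which is what Profiler and tools like Spotlight open, so server-side file traces won't match a session.

```sql
-- find running traces and the sessions reading them
SELECT t.id AS trace_id,
       t.path,                -- NULL for rowset (client-side) traces
       s.session_id,
       s.login_name,
       s.host_name,
       s.program_name         -- e.g. 'Quest Diagnostic Server (Trace)'
FROM sys.traces t
LEFT JOIN sys.dm_exec_sessions s
       ON s.session_id = t.reader_spid
WHERE t.status = 1;            -- 1 = running
```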
So I logged into the diagnostic server via RDP and opened Spotlight, then edited the properties for the server which is our distributor. I opened the SQL Analysis window and disabled SQL Analysis for this server. Pretty much as soon as I clicked OK through the windows, the TRACEWRITE waits went away and the latency went from two hours down to zero.
This just goes to show how careful you have to be when using SQL Profiler (or any sort of tracing) against your database server.
P.S. If you decide to make these changes to your distributor, keep in mind that they may cause anything or everything to break, including patches, etc. that you try to install against the SQL Server engine. These changes were made for a distributor running SQL Server 2008 R1 build 10.0.1600; use against another build at your own risk. That said, here's the code.
USE distribution
GO
CREATE INDEX IX_sp_MSget_new_errorid ON dbo.MSrepl_errors (id) WITH (FILLFACTOR=100)
GO
CREATE INDEX IX_sp_MSadd_logreader_history ON dbo.MSlogreader_agents (id) INCLUDE (name, publication)
GO
CREATE NONCLUSTERED INDEX IX_sp_MSset_syncstate ON MSsubscriptions (publisher_id, publisher_db, article_id, subscription_seqno) INCLUDE (publication_id) WITH (FILLFACTOR=80)
GO
CREATE NONCLUSTERED INDEX IX_sp_MSset_syncstate2 ON MSsubscriptions (publisher_id, publication_id, sync_type, status, ss_cplt_seqno, publisher_db) INCLUDE (article_id, agent_id) WITH (FILLFACTOR=90, DROP_EXISTING=ON)
GO
ALTER PROCEDURE sp_MSset_syncstate
    @publisher_id smallint,
    @publisher_db sysname,
    @article_id int,
    @sync_state int,
    @xact_seqno varbinary(16)
AS
SET NOCOUNT ON

DECLARE @publication_id int

SELECT TOP 1 @publication_id = s.publication_id
FROM MSsubscriptions s
WHERE s.publisher_id = @publisher_id
  AND s.publisher_db = @publisher_db
  AND s.article_id = @article_id
  AND s.subscription_seqno < @xact_seqno

IF @publication_id IS NOT NULL
BEGIN
    IF (@sync_state = 1)
    BEGIN
        IF NOT EXISTS (SELECT *
                       FROM MSsync_states
                       WHERE publisher_id = @publisher_id
                         AND publisher_db = @publisher_db
                         AND publication_id = @publication_id)
        BEGIN
            INSERT INTO MSsync_states (publisher_id, publisher_db, publication_id)
            VALUES (@publisher_id, @publisher_db, @publication_id)
        END
    END
    ELSE IF @sync_state = 0
    BEGIN
        DELETE MSsync_states
        WHERE publisher_id = @publisher_id
          AND publisher_db = @publisher_db
          AND publication_id = @publication_id

        -- activate the subscription(s) so the distribution agent can start processing
        DECLARE @automatic int
        DECLARE @active int
        DECLARE @initiated int
        SELECT @automatic = 1
        SELECT @active = 2
        SELECT @initiated = 3

        -- set status to active, ss_cplt_seqno = commit LSN of xact containing
        -- syncdone token.
        --
        -- VERY IMPORTANT: We can only do this because we know that the publisher
        -- tables are locked in the same transaction that writes the SYNCDONE token.
        -- If the tables were NOT locked, we could get into a situation where data
        -- in the table was changed and committed between the time the SYNCDONE token was
        -- written and the time the SYNCDONE xact was committed. This would cause the
        -- logreader to replicate the xact with no compensation records, but the advance
        -- of the ss_cplt_seqno would cause the dist to skip that command since only commands
        -- with the snapshot bit set will be processed if they are <= ss_cplt_seqno.
        --
        UPDATE MSsubscriptions
        SET status = @active,
            subscription_time = getdate(),
            ss_cplt_seqno = @xact_seqno
        WHERE publisher_id = @publisher_id
          AND publisher_db = @publisher_db
          AND publication_id = @publication_id
          AND sync_type = @automatic
          AND status = @initiated
          AND ss_cplt_seqno <= @xact_seqno
        OPTION (OPTIMIZE FOR (@automatic = 1, @initiated = 3, @publisher_id UNKNOWN, @publisher_db UNKNOWN, @xact_seqno UNKNOWN))
    END
END
GO