VMware’s vCenter system uses SQL Server as it’s default database platform. One thing that needs to be looked into for a lot of VMware installations is the backup setup for the vCenter databases. Often I find the vCenter databases configured in SIMPLE recovery. Now this doesn’t mean that the database is simple to recover in the event of a failure. What this actually means is that there is no point in time recovery of the vCenter databases. Now this might be OK for smaller systems as there probably aren’t a whole lot of changes being made to the vCenter farm other than performance data. As long as you are OK with loosing a days worth of performance data then keeping the vCenter databases in SIMPLE recovery might be OK.
However for larger farms lots of changes are probably being made to the configuration of the farm on a daily bases. Because of this you won’t want to run the database in SIMPLE recovery. Instead you’ll want to run the database in the FULL recovery mode. This enables the ability to do point in time recovery of the database once you start doing database transaction log backups. This means that when (note that I didn’t say “if”) the hard drives on the vCenter server fail and cause the database to be lost
of become corrupt you’ll only loose a few minutes of data at the most.
To ensure that you don’t loose a lot of data from the vCenter database you will need to begin doing transaction log backups. This will do a couple of things. First it’ll make it so that you can restore the database successfully. Second it’ll make sure that the hard drive on your vCenter server doesn’t fill up with one giant transaction log file. If you don’t have a DBA in your shop, and you’ve got a SQL Server with a license that you paid for (something other than the SQL Express that VMware installs if needed) you can setup the SQL Server Maintenance Plans to handle the backups using SQL Server Management Studio. You can always bring in a SQL Server consultant to setup backups and make sure that everything is setup correctly and working if you don’t have a SQL Server DBA handy. If you’ve got a DBA handy in your shop you’ll probably want to have them setup SQL Server backups as per your normal company policies so that the databases are backed up, and those backups tested to ensure that they are functional.
Personally I like to set the recovery mode for the vCenter databases to FULL when working at clients, but it’s totally up to you, as long as you understand what the risks are with each of the options available to you. (Yes I know that I didn’t cover the BULK_LOGGED recovery model, for the purposes of VMware’s vCenter databases the BULK_LOGGED recovery model and the FULL recovery model are the same.)
Hopefully I have helped demystified the SQL Server backup concepts for you a little bit.
Now you as a database professional have the chance to earn Storage and Virtualization from home, work, or where ever is convenient for you. As a partnership with SSWUG.org I’ve recorded an 11 session class titled “Storage and Virtualization for the DBA” which is being delivered online via SSWUG’s virtual conference platform which will allow you to view the videos on demand on your schedule between November 15th and December 14th. If you want to follow up with the class and watch the videos later a DVD of the class is available as well.
These classes from SSWUG include office hours with the instructor daily. Because of a trip to Europe for SQL Saturday and SQL Server Days the office hours won’t be available until starting on November 20th. However any questions can be emailed to me and I’ll get back to you as quickly as I can. You’ll receive more information about the office hours when you sign up for the class.
The cost for the class is very reasonable at just $199 for those who are not SSWUG members and the DVD is another $199, so get registered now.
I’ll see you online during the office hours, I look forward to your questions.
Email is pretty much the most annoying necessary evil out there. Another annoying necessary evil out there is the out of office message. Normally these aren’t to annoying over all as they let you know that someone that you’ve sent a message to isn’t going to be responding for a while.
Where they start getting a bit of a pain is when you are using gMail and you are on mailing lists, specifically mailing lists that are setup correctly so that when you hit reply the response goes back to the mailing list not to the person who sent the email. Thanks to the Microsoft MVP program I’m on several of these email lists. With out of office replies everyone on the mailing list quickly wants to kill you because every time a new person emails the mailing list everyone on the list gets your out of office reply. When you are on a mailing list of 400 people this gets old, fast.
GMail has an out of office feature but it is “lacking” to say the least. You basically get to turn it on and off, set dates for auto on and off, set the subject and the message. The two big options after that allow you to only send your out of office messages to people in your contact list or on your domain. The in your contact list option in my case is pretty useless. I get emails from people all the time that aren’t in my contact list and I want them to get the out of office message, and I half the people (if not more) exist in my contact list as I email them off list as well as on (not to mention that I need their phone numbers so I can properly stalk them during the PASS and MVP summits). So that option is useless. I really don’t need the last option as the only other person with a mrdenny.com email address is my wife Kris and she knows where I’m at.
So my big problem becomes how to use out of office without all the other MVPs wanting to kill me.
And I’ve found the solution, and without having to move all my mailing list email traffic to a separate mailbox. First go into the options for gMail and select the “Labs” tab. Then find the “Canned Responses” option and turn this on. We don’t care about the main functionality of this feature, but we do care about the fact that this lets us setup a filter and have it send a reply automatically. So set this “lab” to enabled and click save.
Now compose a new email including just a subject and body of the message. Then click the new “Canned responses” drop down (next to attachments) and tell it to save the canned response. Name it something useful like “Out Of Office” or something so you can easily find it.
Then go back into your mail settings and go to the “Filters” tab. Create a new filter and set the to address to exclude the mailing lists that you are on. In my case I get email from a couple of different email list domains so I just excluded everything from those domains as you can see in the graphic. I also excluded myself from the filter as I don’t want to get emailed my own out of office message. I leave all the other boxes blank and click the “Test Search” button to make sure that everything but your mailing list emails are found then click “Next Step >>”. On the second page check the new “Send canned response” check box and select the out of office message you just saved. Then click the “Update Filter” button. What ever you do, do NOT check the “Also apply filter to N conversations below” (In my case N actually says 12000, your number will vary). If you check that box before you click the “Update Filter” button you will send everyone that has ever emailed you the out of office message that you just setup. Not exactly a good way to become popular.
At this point you are done, and you’ve got an out of office setup that will ignore your mailing lists. This may take a little playing to get right. There is one thing that the normal gMail out of office does that this technique doesn’t do which is nice. The normal gMail out of office will only email someone your out of office once every four days where this emails them every time. That’s a minor annoyance compared to pissing off everyone on the mailing lists that I’m on.
Don’t forget to delete the filter (as there is no way to disable a filter) when you get back from vacation so that you don’t keep sending out of office messages to people.
I ran across an interesting problem the other day. I was setting up replication for a couple of VERY large tables. These two tables between there are about 553 Gigs in size. There isn’t enough room on the distributor, and there is a slow link between the distribution server and the subscriber. After 6 days of the snapshot trying to be pushed to the subscriber I killed it because it just wasn’t getting anywhere fast.
Now normally I would just copy the snapshot to the subscriber and run the distribution agent on the subscriber while the snapshot loads (I thought I had a blog post about this but apparently I don’t, so I’ll blog about that later). The problem with doing that in this case is that the distributor doesn’t have enough drive space to create the snapshot and being that the servers are hosted with RackSpace adding 600 Gigs of SAN drive space to this cluster would cost about about $12,000 a month with a one year contract on the new storage. Fortunately there’s another cluster that has enough DAS space available so I can simply point the replication agent to use a network share on this drive instead by using the sp_changepublication procedure as shown below.
exec sp_changepublication @publication='PublicationName', @property='alt_snapshot_folder', @value='\\Server\NetworkShare\'
However when I get to the subscriber I don’t want to read from the network share. I want to copy the files to the local disk and read from there. There’s no switch when running the distribution agent manually so before starting the distribution agent on the subscriber you have to manually edit the dbo.syspublications table on the subscriber, specifically editing the alt_snapshot_folder column so that it points to the local path that the files will be stored in on the subscriber. You can then run the distribution agent on the subscriber and the snapshot will load. Once the snapshot is loaded kill the snapshot on the subscriber and start it on the distributor as normal and all will once again be well with the world.
A question that I get when I talk about doing this is why copy the files over the network instead of just letting the distribution agent load the snapshot remotely. I’ve found (and I’ve done this several times to deal with slow networks) that the time spent compressing the snapshot files (and they compress really well as they are just text) and copying them over the network is almost way less time then it would take to load the snapshot remotely from the distributor.
In the case of this project the publisher and distributor were in Dallas and the subscriber was in New York City. But I’ve used this same technique from Orange County, CA to Dallas Texas; from Dallas, Texas to Los Angeles, CA; and from Los Angeles, CA to Beijing, China. In every case it’s a little slow, it’s a little annoying, but it works (in the other cases the distributor did have enough space to hold the snapshot so I didn’t have to modify the dbo.syspublications table, but everything else I did was the same).
Apparently one of the things that VMware’s support department will often recommend that people do is to change the recovery mode of the vCenter database within SQL Server from FULL to SIMPLE recovery because it will magically make a lot of disk problems go away. While this is true, your disk space issues on the vCenter database will go away at this point, this isn’t magic and you have have just lost some functionally that you weren’t expecting to loose.
The big thing that you loose when doing this is that you can no longer restore the SQL Server database to a specific point in time if you have a system failure. If you don’t need the ability to restore so a specific point in time, and restoring to your latest full or differential backup that you have (you do know that you need to backup the vCenter databases right?) then changing the recovery mode from FULL to SIMPLE is just fine. However if you do need that ability (and I would recommend that you have that ability so that if the vCenter server fails or someone deletes data you can get back as much trending data as possible) then you will want to leave the recovery mode as FULL, but instead setup transaction log backups (hopefully you have a DBA or a database consultant that can help you with this) so that they run every few minutes (I typically recommend every 10-15 minutes). Just keep in mind that when selecting how often they should run in the event of a server failure you may loose up to that amount of data from the database. So if you aren’t aloud to loose more than 10 minutes worth of data, don’t backup the log every 15 minutes, back it up every 10, or maybe every 8 just to be safe. Make sure that you get in writing from your management that this is the config that they want so if you get yelled at you are covered.
If you haven’t ever backed up the transaction log your disk is probably pretty full (or totally full). In this case you will want to shrink the database just once. But before you do you need to know what size you should shrink it to. This can be much easier if a DBA does it, but I’ve got faith in you to not screw it up.
First connect to the SQL Server using Management Studio. Take a backup of the transaction log to disk by right clicking on the database, selecting Tasks from the context menu then selecting Backup from the pop out menu. Select transaction log backup, set a file to backup the log to (you’ll need a lot of disk space for this) and make sure that there’s only one file listed in the file list then start the backup. This will probably take a while.
If you don’t care about loosing the log for now change the database from full to simple recovery and back. Do this by running this script.
ALTER DATABASE vcdb SET RECOVERY SIMPLE GO ALTER DATABASE vcdb SET RECOVERY FULL GO
Now do a full backup as shown above (just do a full database backup instead of a transaction log backup).
Now with everything running as normal wait for how ever long it’ll be between transaction log backups. Then connect to the SQL Server in SQL Server Management Studio. Right click on the vCenter database (vcdb) and select Reports from the context menu, then “Disk Usage” from the list of reports. At the top it’ll give you the size of the transaction log in MB, and the graph on the right will show you a percentage breakdown of how full the transaction log file is. Take the Used percentage and double it. Now take that percentage and multiple it by the size of the transaction log. That’s the minimum size you should set your transaction log for. Personally I would round up to the 8 gig size above that size. If you need 150 Megs of log space, set the log for 8 Gigs in size. If you need 10 Gigs of log space, let the log for 16 Gigs of space. A little extra space used is fine, it is better for performance of the database if we don’t need to grow this transaction log. You can shrink the transaction log by using the DBCC SHRINKFILE statement as shown.
DBCC SHRINKFILE ('vbdc_log', 8388608)
Keep in mind that the number, in this case 8 Gigs, is passed in in megs. Do NOT for any reason create any sort of scheduled task to regularly shrink any database file. This will do nothing good and will cause performance problems of its own. If anyone has told you to setup a job which shrinks the database file regularly they are WRONG.
Going to the PASS summit is an amazing experience as you’ve probably noticed from all the wrap up posts that you’ve seen. However the downside to PASS is the sad reality that after the summit has ended you have to get back to normal work and your normal life. But that isn’t actually true. You can extend that awesome PASS feeling by attending other events like SQL Excursions, SQL Saturdays, etc. Were you can continue to get your SQL learning on, networking with other SQL Server professionals and having a good time while you enhance your career, improve your skills for your current role, have a great time and make new friends for life.
What ever you do, don’t make the SQL PASS experience a short one week a year experience, make it something that you do through out the year, every year. Your employer will thank you as you develop new skills to help them now, and you and your family will thank you as you work on moving your career to the next level.
For you managers reading this, don’t worry that your employees will jump ship when they get back from a conference. While some might, if they do they were going to leave anyway and not sending them wasn’t going to stop them. Odds are you’ll probably get some great leads for potential new SQL professionals when you send your people to conferences. I’ve even heard of people getting home from a conference with a new job, and giving their old employees a great referral for a replacement.
Hopefully everyone is able to slowly get back into regular life, until the next time we all meet.
Everyone has their favorite way or running a block of code over and over again. If you have to run an entire batch over and over again there is nothing as quick and easy as using the “GO n” syntax. When you put a number after the GO batch separator SQL Server Management Studio (as well as SQLCMD) will run that entire batch however many times that you have specified.
Recently I needed to create some dummy data in a table, so instead of writing a loop that counted the number of times the loop ran, I simply used code like this.
CREATE TABLE SampleData (id int identity(1,1), SomeData varchar(100)) go insert into SampleData (SomeData) values ('This is my sample data') go 100000
When run, the batch with the insert statement ran 100,000 times generating all the data that I needed with just 7 extra keystrokes.
Since I didn’t make it to the SQL PASS keynotes this morning, I figured that I’d tell you about another little event that happened late last night. Last night there was a SQL PASS first. There was a wedding here at the PASS summit. Tamera Clark (blog | @tameraclark) and Terry Tyler (blog | @airbornegeek) who are both DBAs, and first time attendees decided a while back that if they both were able to come to the SQL PASS summit that they would get married at the SQL PASS summit. This wasn’t exactly a rash decision on their part as they’ve been living together for many years. Conveniently enough I’m a registered minister and they asked me to perform the ceremony.
The wedding was very short and was held down by the water over looking Waterfront Park here in Seattle, WA just a short walk from the Washington State Convention Center and all our hotels. Microsoft sponsored the rehearsal dinner (there was a Microsoft sponsored Tweet Up that everyone in the wedding party went to) before hand. Confio was kind enough to provide the “bouquet” (it was a Confio umbrella that was given out to first timers, so it was new, borrowed as well as blue). And PASS was kind enough to sponsor the wedding reception (the annual PASS party over at Game Works) which the happy couple and several of the wedding guests went to.
I’ve got to say, this was definitely a first for the SQL PASS summit and for me and I’m so glad that they offered to let me be a part of their wedding. Pat Wright was kind enough to take some pictures before, during and after the wedding ceremony which either he or Tamara and Terry will be posting online somewhere when they get a chance. Please join me in congratulating the happy couple either here, on their blogs somewhere or on Twitter.
Hopefully next year the SQL PASS summit can be just as awesome as this years was (really is as it isn’t over yet). It sure as hell will be hard to beat this year.
Welcome to SQL PASS Summit Keynote on Day 2. Today was another day of announcements made by Quentin Clark who is the Corporate VP for Microsoft who is in charge of the SQL Server database engine.
Quentin started his talk about how SQL Server 2012 will improve the way that SQL Server provides many 9s of up time with the introduction of Always On. The StreamInsight feature now also includes an HA option which will allow you to keep the SteamInsight system processing all the time.
The first demo for todays keynote was my favorite SQL Server 2012 feature, Always On. In the demo the showed a three node cluster which hosted the production application with the cluster in two data centers. During the demo the presenter quickly and easily setup. After setting up a readable secondary the Availability Group was marked within the Always On routing configuration as a readable secondary and the reporting services reports were configured to automatically route their connections to the readable secondary in order to keep the reporting workload off of the production SQL Server. (I’ll be blogging more about this feature later.)
In the second part of the keynote Quentin talk about the ColumnStore indexes and how this feature will make running reports and data analysis much, much quicker. Quentin then talked about how PowerPivot has been enhanced, specifically when implemented along with SharePoint which will allow users to create their own reports within the IT controlled platform of SharePoint (which is great unless you don’t have / can’t afford SharePoint).
The next demo was for a set of features which are the Data Quality Services (DQS) and Master Data Services (MDS) features. DQS is a pretty cool feature which allows you to clean up data problems like incorrect lat/long based on the address of a location by using data from the SQL Azure data market place. During the demo there were a couple of stores which were located in the middle of the water around Seattle. By using DQS the stores location was able to be quickly and easily moved from the middle of the water to the correct location in Seattle.
Quentin then talked about the Fast Track program which allows hardware vendors to sell pre-packaged, pre-configured systems with SQL Server on them which can make it easier for customers to purchase their SQL Server servers. Another great program that Microsoft has is the appliance solution where you can get a server from the vendor up and running in just a couple of hours ready for data to be loaded into it.
Microsoft introduced the HP Enterprise Database Consolidation Appliance. This system is basically a private cloud appliance which is a rack of servers with 192 cores, 400 disk drives (supporting up to 60k SQL Server IOs) and 2 TB of RAM. This can be purchased as a half rack or a full rack appliance. As I understand this appliance this is effectively a Windows Hyper-V cluster where you can fire up VMs within the system quickly and easily. Where the system really saves you is the ability to deploy as a single unit quickly and easily without having to manually test all the components as HP will be testing and certifying that the hardware is working correctly before shipping the unit. From the support side of things you get a single phone number to call in order to get support for both the hardware as well as the software.
The next announcement for today was a traditional ODBC driver for the Linux platform in order applications running on non-Microsoft platforms that can connect to a SQL Server database without any real change to the application. Quentin also announced that Change Data Capture was now supported for SSIS as well as Oracle (not really sure what this means or how this works).
Semantic search was the next feature which was demoed. Semantic Search is used along with the new file table feature which can load files into the database through the file system, then Semantic Search can be used to read the files and not just make the values searchable but it can understand the data within the files so that you can find files based on specific searches but you can then find related documents to what you just found by using not just specific search terms but by using the original document to find the matches.
Another name change for this week is that project Juneau is now called SQL Server Data Tools.
You can now from within SQL Server Management Studio right click on a database and migrate the database directly to SQL Azure. This is done by creating a DACPAC (v2.0) and pushing it to the Azure with just a few clicks. Another Azure feature which is now supported is the ability to backup local or Azure database to Azure file storage by creating what is called a bacpac which is then copied to the Azure file storage. So for we haven’t seen a way to do this through T/SQL. You can also use this technology to backup your Azure database to the Azure file storage then use SSMS to connect to the file storage and restore the database to a local SQL Server instance for dev and testing.
SQL Server Azure is now supporting federations on the database side of the Azure platform in order to dynamically scale the database platform quickly and easily through the management portal without any application changes. With the next year end release of SQL Azure larger databases up to 150 Gigs will be supported as well as the ability to use a different collation (this is using contained databases).
Available today in CTP Microsoft has introduced the SQL Azure Reporting CTP as well as the SQL Azure Data Sync CTP. The reporting CTP allows you run SSRS reports from the SQL Azure cloud. The Data Sync CTP (which is the final CTP) allows you to easily sync data between a local instance and an Azure instance or from an Azure instance to a SQL instance at another cloud provider.
Hopefully you are as excited about some of these new features as I am, as some of this stuff just looks really, really cool.
Today was Day 1 of the SQL PASS summit. Today there were a few announcements during the keynotes which if you didn’t see on the live streaming site you probably want to hear about.
During the PASS portion of the keynote we also heard about the massive amount of training the the SQL PASS organization provides to the community. PASS has counted over 430,000 hours of training which has been delivered to PASS members. That number will sky rocket over the next three days as people go through the SQL PASS summit sessions.
There was a great shout out during the keynote for the SQL MVP Deep Dives book which was officially released at the PASS Summit. This book, which if you aren’t aware of was written by 55 different SQL Server MVPs with all the profits going to Operation Smile.
The first and probably most important announcement was the official name of the SQL Server “Denali” product which is officially named SQL Server 2012 and will be released in the first half of the year 2012. These announcements as well as the rest of the keynote was given by Ted Kummert the Senior VP for Business Platform Tools (aka. Bus Plat) of the SQL Server product at Microsoft.
There was a lot of talk from Ted about the Azure cloud which is a big change from the on premise installations of SQL Server that Microsoft has been selling for the last 20 years. The reason for the big push is that a lot of customers are moving some data up to Azure. Now don’t worry this doesn’t mean that on site installs are going away. Microsoft is working hard on great new features which will be coming out in the future releases of the SQL Server product that we know and love. Azure simply gives you a additional choice besides building and running systems on site so that you can be more flexible and so that you don’t need to worry about things like hardware and patching which is all taken care of for you in the Azure cloud.
If you are a used of “Project Crescent” this has been officially named “Power View” (I’ve been asked to tell you that there is a space in the name). It was also hinted during the keynote that more BI tools will be introduced to SQL Azure in 2012. What flavor this comes in is any ones guess, but it’ll be an interesting thing to see for sure.
Ted also spoke specifically about Big Data and some of the directions that people will be going with big data. Companies today simply don’t throw any data away, and some times that data is unstructured and that’s really the best way of using the data. Microsoft will be getting into this market more with the distributed compute platform which means that Microsoft will be supporting Hadoop as a part of the data platform. This means that you’ll be able to run Hadoop on Microsoft Windows as well as on the Azure platform.
As of last week you can get the Apache Hadoop connector for SQL Server and the PDW platform so that you can connect SQL Server to Hadoop. Future releases include the Hadoop based distribution that I spoke about above, as well as an ODBC driver and add-in for Excel and office to make it easier for people to get Hadoop data into the office platforms. I think that this is really a big game changer as this allows for a very different data storage platform than they are used to with traditional databases platforms. Microsoft will have a CTP version of their Hadoop platform available on SQL Azure before the end of the 2011 year.
Microsoft announced “Data Explorer” which allows users to do self service BI without realizing that they are doing self service BI. This allows users to easily see and read the data, most importantly taking the data and turning it into information that they can use to drive the company quickly and easily. I’m not going to bore you with anything about this demo as it sucked. They pretty much lost the audience within a couple of minutes.
Don’t forget to check back tomorrow as there will be more great announcements from Microsoft during tomorrow’s keynote.