I ran across an interesting problem the other day. I was setting up replication for a couple of VERY large tables. These two tables between there are about 553 Gigs in size. There isn’t enough room on the distributor, and there is a slow link between the distribution server and the subscriber. After 6 days of the snapshot trying to be pushed to the subscriber I killed it because it just wasn’t getting anywhere fast.
Now normally I would just copy the snapshot to the subscriber and run the distribution agent on the subscriber while the snapshot loads (I thought I had a blog post about this but apparently I don’t, so I’ll blog about that later). The problem with doing that in this case is that the distributor doesn’t have enough drive space to create the snapshot and being that the servers are hosted with RackSpace adding 600 Gigs of SAN drive space to this cluster would cost about about $12,000 a month with a one year contract on the new storage. Fortunately there’s another cluster that has enough DAS space available so I can simply point the replication agent to use a network share on this drive instead by using the sp_changepublication procedure as shown below.
exec sp_changepublication @publication='PublicationName', @property='alt_snapshot_folder', @value='\\Server\NetworkShare\'
However when I get to the subscriber I don’t want to read from the network share. I want to copy the files to the local disk and read from there. There’s no switch when running the distribution agent manually so before starting the distribution agent on the subscriber you have to manually edit the dbo.syspublications table on the subscriber, specifically editing the alt_snapshot_folder column so that it points to the local path that the files will be stored in on the subscriber. You can then run the distribution agent on the subscriber and the snapshot will load. Once the snapshot is loaded kill the snapshot on the subscriber and start it on the distributor as normal and all will once again be well with the world.
A question that I get when I talk about doing this is why copy the files over the network instead of just letting the distribution agent load the snapshot remotely. I’ve found (and I’ve done this several times to deal with slow networks) that the time spent compressing the snapshot files (and they compress really well as they are just text) and copying them over the network is almost way less time then it would take to load the snapshot remotely from the distributor.
In the case of this project the publisher and distributor were in Dallas and the subscriber was in New York City. But I’ve used this same technique from Orange County, CA to Dallas Texas; from Dallas, Texas to Los Angeles, CA; and from Los Angeles, CA to Beijing, China. In every case it’s a little slow, it’s a little annoying, but it works (in the other cases the distributor did have enough space to hold the snapshot so I didn’t have to modify the dbo.syspublications table, but everything else I did was the same).
Apparently one of the things that VMware’s support department will often recommend that people do is to change the recovery mode of the vCenter database within SQL Server from FULL to SIMPLE recovery because it will magically make a lot of disk problems go away. While this is true, your disk space issues on the vCenter database will go away at this point, this isn’t magic and you have have just lost some functionally that you weren’t expecting to loose.
The big thing that you loose when doing this is that you can no longer restore the SQL Server database to a specific point in time if you have a system failure. If you don’t need the ability to restore so a specific point in time, and restoring to your latest full or differential backup that you have (you do know that you need to backup the vCenter databases right?) then changing the recovery mode from FULL to SIMPLE is just fine. However if you do need that ability (and I would recommend that you have that ability so that if the vCenter server fails or someone deletes data you can get back as much trending data as possible) then you will want to leave the recovery mode as FULL, but instead setup transaction log backups (hopefully you have a DBA or a database consultant that can help you with this) so that they run every few minutes (I typically recommend every 10-15 minutes). Just keep in mind that when selecting how often they should run in the event of a server failure you may loose up to that amount of data from the database. So if you aren’t aloud to loose more than 10 minutes worth of data, don’t backup the log every 15 minutes, back it up every 10, or maybe every 8 just to be safe. Make sure that you get in writing from your management that this is the config that they want so if you get yelled at you are covered.
If you haven’t ever backed up the transaction log your disk is probably pretty full (or totally full). In this case you will want to shrink the database just once. But before you do you need to know what size you should shrink it to. This can be much easier if a DBA does it, but I’ve got faith in you to not screw it up.
First connect to the SQL Server using Management Studio. Take a backup of the transaction log to disk by right clicking on the database, selecting Tasks from the context menu then selecting Backup from the pop out menu. Select transaction log backup, set a file to backup the log to (you’ll need a lot of disk space for this) and make sure that there’s only one file listed in the file list then start the backup. This will probably take a while.
If you don’t care about loosing the log for now change the database from full to simple recovery and back. Do this by running this script.
ALTER DATABASE vcdb SET RECOVERY SIMPLE GO ALTER DATABASE vcdb SET RECOVERY FULL GO
Now do a full backup as shown above (just do a full database backup instead of a transaction log backup).
Now with everything running as normal wait for how ever long it’ll be between transaction log backups. Then connect to the SQL Server in SQL Server Management Studio. Right click on the vCenter database (vcdb) and select Reports from the context menu, then “Disk Usage” from the list of reports. At the top it’ll give you the size of the transaction log in MB, and the graph on the right will show you a percentage breakdown of how full the transaction log file is. Take the Used percentage and double it. Now take that percentage and multiple it by the size of the transaction log. That’s the minimum size you should set your transaction log for. Personally I would round up to the 8 gig size above that size. If you need 150 Megs of log space, set the log for 8 Gigs in size. If you need 10 Gigs of log space, let the log for 16 Gigs of space. A little extra space used is fine, it is better for performance of the database if we don’t need to grow this transaction log. You can shrink the transaction log by using the DBCC SHRINKFILE statement as shown.
DBCC SHRINKFILE ('vbdc_log', 8388608)
Keep in mind that the number, in this case 8 Gigs, is passed in in megs. Do NOT for any reason create any sort of scheduled task to regularly shrink any database file. This will do nothing good and will cause performance problems of its own. If anyone has told you to setup a job which shrinks the database file regularly they are WRONG.
Going to the PASS summit is an amazing experience as you’ve probably noticed from all the wrap up posts that you’ve seen. However the downside to PASS is the sad reality that after the summit has ended you have to get back to normal work and your normal life. But that isn’t actually true. You can extend that awesome PASS feeling by attending other events like SQL Excursions, SQL Saturdays, etc. Were you can continue to get your SQL learning on, networking with other SQL Server professionals and having a good time while you enhance your career, improve your skills for your current role, have a great time and make new friends for life.
What ever you do, don’t make the SQL PASS experience a short one week a year experience, make it something that you do through out the year, every year. Your employer will thank you as you develop new skills to help them now, and you and your family will thank you as you work on moving your career to the next level.
For you managers reading this, don’t worry that your employees will jump ship when they get back from a conference. While some might, if they do they were going to leave anyway and not sending them wasn’t going to stop them. Odds are you’ll probably get some great leads for potential new SQL professionals when you send your people to conferences. I’ve even heard of people getting home from a conference with a new job, and giving their old employees a great referral for a replacement.
Hopefully everyone is able to slowly get back into regular life, until the next time we all meet.
Everyone has their favorite way or running a block of code over and over again. If you have to run an entire batch over and over again there is nothing as quick and easy as using the “GO n” syntax. When you put a number after the GO batch separator SQL Server Management Studio (as well as SQLCMD) will run that entire batch however many times that you have specified.
Recently I needed to create some dummy data in a table, so instead of writing a loop that counted the number of times the loop ran, I simply used code like this.
CREATE TABLE SampleData (id int identity(1,1), SomeData varchar(100)) go insert into SampleData (SomeData) values ('This is my sample data') go 100000
When run, the batch with the insert statement ran 100,000 times generating all the data that I needed with just 7 extra keystrokes.
Since I didn’t make it to the SQL PASS keynotes this morning, I figured that I’d tell you about another little event that happened late last night. Last night there was a SQL PASS first. There was a wedding here at the PASS summit. Tamera Clark (blog | @tameraclark) and Terry Tyler (blog | @airbornegeek) who are both DBAs, and first time attendees decided a while back that if they both were able to come to the SQL PASS summit that they would get married at the SQL PASS summit. This wasn’t exactly a rash decision on their part as they’ve been living together for many years. Conveniently enough I’m a registered minister and they asked me to perform the ceremony.
The wedding was very short and was held down by the water over looking Waterfront Park here in Seattle, WA just a short walk from the Washington State Convention Center and all our hotels. Microsoft sponsored the rehearsal dinner (there was a Microsoft sponsored Tweet Up that everyone in the wedding party went to) before hand. Confio was kind enough to provide the “bouquet” (it was a Confio umbrella that was given out to first timers, so it was new, borrowed as well as blue). And PASS was kind enough to sponsor the wedding reception (the annual PASS party over at Game Works) which the happy couple and several of the wedding guests went to.
I’ve got to say, this was definitely a first for the SQL PASS summit and for me and I’m so glad that they offered to let me be a part of their wedding. Pat Wright was kind enough to take some pictures before, during and after the wedding ceremony which either he or Tamara and Terry will be posting online somewhere when they get a chance. Please join me in congratulating the happy couple either here, on their blogs somewhere or on Twitter.
Hopefully next year the SQL PASS summit can be just as awesome as this years was (really is as it isn’t over yet). It sure as hell will be hard to beat this year.
Welcome to SQL PASS Summit Keynote on Day 2. Today was another day of announcements made by Quentin Clark who is the Corporate VP for Microsoft who is in charge of the SQL Server database engine.
Quentin started his talk about how SQL Server 2012 will improve the way that SQL Server provides many 9s of up time with the introduction of Always On. The StreamInsight feature now also includes an HA option which will allow you to keep the SteamInsight system processing all the time.
The first demo for todays keynote was my favorite SQL Server 2012 feature, Always On. In the demo the showed a three node cluster which hosted the production application with the cluster in two data centers. During the demo the presenter quickly and easily setup. After setting up a readable secondary the Availability Group was marked within the Always On routing configuration as a readable secondary and the reporting services reports were configured to automatically route their connections to the readable secondary in order to keep the reporting workload off of the production SQL Server. (I’ll be blogging more about this feature later.)
In the second part of the keynote Quentin talk about the ColumnStore indexes and how this feature will make running reports and data analysis much, much quicker. Quentin then talked about how PowerPivot has been enhanced, specifically when implemented along with SharePoint which will allow users to create their own reports within the IT controlled platform of SharePoint (which is great unless you don’t have / can’t afford SharePoint).
The next demo was for a set of features which are the Data Quality Services (DQS) and Master Data Services (MDS) features. DQS is a pretty cool feature which allows you to clean up data problems like incorrect lat/long based on the address of a location by using data from the SQL Azure data market place. During the demo there were a couple of stores which were located in the middle of the water around Seattle. By using DQS the stores location was able to be quickly and easily moved from the middle of the water to the correct location in Seattle.
Quentin then talked about the Fast Track program which allows hardware vendors to sell pre-packaged, pre-configured systems with SQL Server on them which can make it easier for customers to purchase their SQL Server servers. Another great program that Microsoft has is the appliance solution where you can get a server from the vendor up and running in just a couple of hours ready for data to be loaded into it.
Microsoft introduced the HP Enterprise Database Consolidation Appliance. This system is basically a private cloud appliance which is a rack of servers with 192 cores, 400 disk drives (supporting up to 60k SQL Server IOs) and 2 TB of RAM. This can be purchased as a half rack or a full rack appliance. As I understand this appliance this is effectively a Windows Hyper-V cluster where you can fire up VMs within the system quickly and easily. Where the system really saves you is the ability to deploy as a single unit quickly and easily without having to manually test all the components as HP will be testing and certifying that the hardware is working correctly before shipping the unit. From the support side of things you get a single phone number to call in order to get support for both the hardware as well as the software.
The next announcement for today was a traditional ODBC driver for the Linux platform in order applications running on non-Microsoft platforms that can connect to a SQL Server database without any real change to the application. Quentin also announced that Change Data Capture was now supported for SSIS as well as Oracle (not really sure what this means or how this works).
Semantic search was the next feature which was demoed. Semantic Search is used along with the new file table feature which can load files into the database through the file system, then Semantic Search can be used to read the files and not just make the values searchable but it can understand the data within the files so that you can find files based on specific searches but you can then find related documents to what you just found by using not just specific search terms but by using the original document to find the matches.
Another name change for this week is that project Juneau is now called SQL Server Data Tools.
You can now from within SQL Server Management Studio right click on a database and migrate the database directly to SQL Azure. This is done by creating a DACPAC (v2.0) and pushing it to the Azure with just a few clicks. Another Azure feature which is now supported is the ability to backup local or Azure database to Azure file storage by creating what is called a bacpac which is then copied to the Azure file storage. So for we haven’t seen a way to do this through T/SQL. You can also use this technology to backup your Azure database to the Azure file storage then use SSMS to connect to the file storage and restore the database to a local SQL Server instance for dev and testing.
SQL Server Azure is now supporting federations on the database side of the Azure platform in order to dynamically scale the database platform quickly and easily through the management portal without any application changes. With the next year end release of SQL Azure larger databases up to 150 Gigs will be supported as well as the ability to use a different collation (this is using contained databases).
Available today in CTP Microsoft has introduced the SQL Azure Reporting CTP as well as the SQL Azure Data Sync CTP. The reporting CTP allows you run SSRS reports from the SQL Azure cloud. The Data Sync CTP (which is the final CTP) allows you to easily sync data between a local instance and an Azure instance or from an Azure instance to a SQL instance at another cloud provider.
Hopefully you are as excited about some of these new features as I am, as some of this stuff just looks really, really cool.
Today was Day 1 of the SQL PASS summit. Today there were a few announcements during the keynotes which if you didn’t see on the live streaming site you probably want to hear about.
During the PASS portion of the keynote we also heard about the massive amount of training the the SQL PASS organization provides to the community. PASS has counted over 430,000 hours of training which has been delivered to PASS members. That number will sky rocket over the next three days as people go through the SQL PASS summit sessions.
There was a great shout out during the keynote for the SQL MVP Deep Dives book which was officially released at the PASS Summit. This book, which if you aren’t aware of was written by 55 different SQL Server MVPs with all the profits going to Operation Smile.
The first and probably most important announcement was the official name of the SQL Server “Denali” product which is officially named SQL Server 2012 and will be released in the first half of the year 2012. These announcements as well as the rest of the keynote was given by Ted Kummert the Senior VP for Business Platform Tools (aka. Bus Plat) of the SQL Server product at Microsoft.
There was a lot of talk from Ted about the Azure cloud which is a big change from the on premise installations of SQL Server that Microsoft has been selling for the last 20 years. The reason for the big push is that a lot of customers are moving some data up to Azure. Now don’t worry this doesn’t mean that on site installs are going away. Microsoft is working hard on great new features which will be coming out in the future releases of the SQL Server product that we know and love. Azure simply gives you a additional choice besides building and running systems on site so that you can be more flexible and so that you don’t need to worry about things like hardware and patching which is all taken care of for you in the Azure cloud.
If you are a used of “Project Crescent” this has been officially named “Power View” (I’ve been asked to tell you that there is a space in the name). It was also hinted during the keynote that more BI tools will be introduced to SQL Azure in 2012. What flavor this comes in is any ones guess, but it’ll be an interesting thing to see for sure.
Ted also spoke specifically about Big Data and some of the directions that people will be going with big data. Companies today simply don’t throw any data away, and some times that data is unstructured and that’s really the best way of using the data. Microsoft will be getting into this market more with the distributed compute platform which means that Microsoft will be supporting Hadoop as a part of the data platform. This means that you’ll be able to run Hadoop on Microsoft Windows as well as on the Azure platform.
As of last week you can get the Apache Hadoop connector for SQL Server and the PDW platform so that you can connect SQL Server to Hadoop. Future releases include the Hadoop based distribution that I spoke about above, as well as an ODBC driver and add-in for Excel and office to make it easier for people to get Hadoop data into the office platforms. I think that this is really a big game changer as this allows for a very different data storage platform than they are used to with traditional databases platforms. Microsoft will have a CTP version of their Hadoop platform available on SQL Azure before the end of the 2011 year.
Microsoft announced “Data Explorer” which allows users to do self service BI without realizing that they are doing self service BI. This allows users to easily see and read the data, most importantly taking the data and turning it into information that they can use to drive the company quickly and easily. I’m not going to bore you with anything about this demo as it sucked. They pretty much lost the audience within a couple of minutes.
Don’t forget to check back tomorrow as there will be more great announcements from Microsoft during tomorrow’s keynote.
The SQL PASS Summit is getting ready to begin. If you are attending the SQL PASS summit I hope that you have already arrived in Seattle so that you can have some fun tonight at the events before getting started at the conference tomorrow.
Things you can look forward to tonight at the PASS summit include the welcome reception in the convention center at 6:30pm in 6ABC (also called the Dell Ballroom). If you are a first timer there is the first timers session tonight before the welcome reception at 5:30 or 6pm.
If you weren’t able to make it to the SQL PASS summit this year, don’t forget to check out the great keynotes each morning at 8:15am pacific time which are being streamed live on the web.
While doing some research for my Half Day SQL PASS Session with Stacia Misner (blog | @staciamisner) I discovered that some of the queries which the SSRS engine runs against the ReportServer database are less than perfect total crap. My specific problem is that there are key lookups on the bulk of the queries.
Granted most of the queries results are pretty small, but these queries run every 10 seconds, and crap running every 10 seconds adds up to a whole lot of crap.
Below are some indexes you can create to resolve these problems. Keep in mind that adding this will probably make Microsoft not help you if you call, so be sure to delete the indexes if you call CSS for support.
These indexes will minimize the index scans and key lookups. I can’t remove all of them because of the way that the queries and the schema are designed and I’m not about to go around changing the schema of the tables or hacking the inline code of the SSRS UI. So while these indexes won’t fix every problem, some is better then none.
CREATE NONCLUSTERED INDEX [mrdenny_IX_Notifications3] ON [dbo].[Notifications] ( [NotificationEntered] ASC, ProcessStart, ProcessAfter ) include (NotificationID) GO CREATE INDEX mrdenny_PK_Catalog on dbo.Catalog (ItemID) include (Path, Type, PolicyId) GO CREATE INDEX mrdenny_IX_Event_TimeEntered on dbo.Event (ProcessStart, TimeEntered) include (EventID) GO CREATE INDEX mrdenny_IX_BatchID on dbo.Event (BatchID, TimeEntered) include (EventID, EventType, EventData) with (drop_existing=on) GO CREATE INDEX mrdenny_IX_UpgradeID on dbo.ServerUpgradeHistory (UpgradeID DESC) include (ServerVersion) GO
The SQL PASS summit is quickly approaching, and with the excitement of large conferences comes the worry of what to do during the week. How am I going to meet people, what am I going to do after hours or during the day when there is a session that I don’t want to attend?
This is a question that myself and two good friends Andy Warren (blog | @sqlandy) and Steve Jones (blog | @way0utwest) were kicking around last night (it’s amazing what you can do over email when everyone is in a different time zone) and we’ve come up with a solution that will hopefully work for everyone, and the best part is that using it is free. If you’d like to go somewhere for dinner, post it up on eventbrite.com as a free event for people to attend with a limited number of attendees (if your event is free, then Event Bright doesn’t charge). Then to get yourself a little free publicity post the info up on the passsummitevents.info website. Once the event is approved (which I’ll try and do as quickly as humanly possible) it’ll be up on the site and it’ll be blasted out to the world under the #sqlfun hash tag by the SQL_Fun twitter account. (At the moment the links aren’t working on mobile browsers, I’m working with the developer of the site to get that fixed before the summit.)
The Event Brite info is great because it gives you a way to easily limit the attendee list, because as Andy points out in his blog post (which is much more professionally written than this one) “It’s a lot easier to get a table for four than a restaurant for 150.”
There are lots of big events that happen at the PASS summit such as the Welcome Reception, GameWorks party, SQL Photo Walk, SQL Karaoke (most of these are posted up passummitevents.info already) and the various vendor parties (if you are lucky enough to get an invite to those). What we are talking about with this setup is more informal smaller get togethers. If you want to go down to Pikes Place Market for Breakfast on Tuesday (which I highly recommend you do at least once while you are in Seattle) this would be a great thing to get out. You’ll find some great people to talk to and have some company for breakfast.
Andy has started everyone off with Donuts & Coffee with SQLAndy/Tuesday Morning, so if you are interested go and check it out. And if not then think about throwing something up there and I’m pretty sure that you’ll find some people who are interested in joining you.
So go and put a dinner or a morning walk around town up on the site, and see who responds. Personally I would love it to see 100+ items up there before the summit starts next week.
Now if this is your first time at the PASS summit, or your first time in Seattle at all then be sure to check out the first timers video that I recorded a few weeks ago. In it I’ve got some ideas for things to do around town that don’t require driving, drinking, or thinking about SQL Server.
See you in Seattle,