Welcome my friends to the first official work day of the year 2013. You enjoying that hangover yet?
It is time again to take a little peak at the last year (as far as my blog goes) and see just what was the most popular posts, and what just sucked so that hopefully for this year I can do better. (Sadly do to come reporting issues I don’t have all the data for 2012. I’m missing a chunk in the middle for a few months, so the numbers are actually higher than they are, but I’ll just have to work with the data that I have.)
The posts that were read the most in 2012 were…
10. Single Instance vs. Multiple Instances – 4.4k views
9. Back to Basics: The BACKUP DATABASE command – 5.4k views
8. New INSERT syntax in SQL Server 2008 – 5.9k views
7. SQL Server 2012 Licensing Changes – 7.6k views
6. Difference between an Index and a Primary Key – 8.7k views
5. What exactly is MSDTC, any when do I need it? – 11.5k views
4. Upgrading Windows 8 RP to RTM – 11.6k views
3. Back To Basics: The UPDATE Statement – 14.7k views
2. How to configure DTC on Windows 2008 – 18.5k views
1. Back To Basics: Clustered vs NonClustered Indexes; what’s the difference? – 24.1k views
This gives some interesting insight into what people are reading off on my blog. If you compare to last years top 10 blog posts you see that it is a lot of the same blog posts. In fact of the top 10, only 2 are different from 2011 to 2012.
#1 – How To Configure DTC on Windows 2008
#2 – Back To Basics: The Update Statement
#3 – Back To Basics: Clustered vs. NonClustered Indexes
#4 – What Exactly Is MSDTC And When Do I Need It
#5 – New Insert Syntax in SQL Server 2008
#6 – Difference Between an Index And a Primary Key
#7 – SQL Server 2012 Licensing Changes
#8 – If You Aren’t Doing Backups You Aren’t Doing Your Job
#9 – Single Instance vs. Multiple Instance
#10 – You Can’t Deploy A SQL 2008 SSRS Report to a SQL 2005 SSRS Server
In both my 2012 top posts and my 2011 top posts only 2 are different. And only one of the two new posts from the 2012 list were written in 2012. The one written in 2012 is about upgrading the Windows 8 RP to RTM and was cross posted all over the place. The second new entry to the top 10 list is in the #10 spot and was actually written back in 2008, but the topic is an oldie but a goodie … database backups.
So back to the insight that this gives me. I find it interesting that so many people viewed my posts on configuring MS DTC (there are two on the list). There is also a lot of intro level material on that top 10 list from 2012 as well which means that in addition to writing about complex topics the more intro level stuff is definitely stuff that lots of people are interested in.
Hopefully you’ll keep reading my blog though 2013 (as beyond).
Have a great new year,
This week I’ve found some great things for you to read. These are a few of my favorites that I’ve found this week.
- One example of query performance problem due to data type conversion
- How can you tell if an index is REALLY a duplicate?
- Portfolio changeover time for the PASS board
- My Short Struggle With SQL Agent
- Steps to create Integration Services Catalogs and deploy SSIS package in SQL Server 2012 SSISDB
- This weeks SQL Server person to follow on Twitter is:
Hopefully you find these articles as useful as I did.
Don’t forget to follow me on Twitter where my username is @mrdenny
Cross database chaining in SQL Server is actually a fairly old feature, first introduced in SQL Server 2000 SP3. However this feature isn’t often understood mostly because it isn’t often used.
Database chaining is when permissions cascade from one object to another because they are used by the parent object. The perfect example is a stored procedure which accesses a table. The user only needs rights to the parent object (the stored procedure) and the rights to access the table exist automatically because the stored procedure accesses the child object (the table).
Cross database chaining uses this exact same concept except that the parent object is in one database and the child object is in another database. In order to use cross database chaining the feature needs to be enabled on both databases. This is done by using the ALTER DATABASE statement as shown below on both databases.
ALTER DATABASE A_Database SET DB_CHAINING ON
Once this is done, the login which is mapped to the user within the database which has the parent object needs to be mapped to a login within the database which has the child object. The user within the database which owns the child object doesn’t need any specific rights other than to be a member of the public role. Once this is done the cross database permission chain will be made and the stored procedure (or other parent object such as a trigger or function) will begin working.
A client came to be with a pretty scary problem. Their SharePoint site was totally down. The databases were there but the actual SharePoint farm was totally dead. All of the SharePoint tools said that the farm wasn’t configured yet so there were some pretty big problems. Thankfully they had a second SharePoint farm available so we decided to attempt to move the stuff from the broken farm to the new farm.
The first step was to backup the SQL Server database and restore it to the new SQL Server. No problems there, just a normal SQL Server backup and restore operation (just make sure that you don’t restore over an existing database). The next thing to do was to get the data from the SQL Server database that SharePoint didn’t know anything about into a SQL Server database that SharePoint did know about. This is done by going into Central Administration and selecting the “Backup and Restore” option. Then within the “Granular Backup” section selecting “Recover data from an unattached content database” as shown below.
Now go back into the Create
On the next screen give SharePoint the server, database and authentication information for the database you just restored. I then selected the “Export site or list” radio button at the bottom and clicked “Next”. The next screen allows you to select specific sites to export. In my case I wanted everything so I changed the “Site” drop down menu to “No selection” and entered a filename. I didn’t bother with security as we were changing Active Directory domains anyway. I left the default option of exporting all versions so that any version history would be maintained and clicked the “Start Export”.
Depending on how large the content database is you may have time for coffee, dinner or to take the weekend off.
When it was finished I discovered some really big annoyances. The biggest being that the template used for the old SharePoint farm was different from the template for the new SharePoint farm. This meant that I wasn’t able to import the data directly like I wanted to. The new farm was created using the template “SPSMSITEHOST#0” while the old farm was done using the template “ENTERWIKI#0” (I found this out when I tried to use the import-spweb powershell cmdlet to import the data).
So in order to get the data imported as quickly as I could the solution that I came up with was to create new Site Collections for the 2 sites that needed to be restored. To do this you go into Central Administration and select “Application Management” from the menu on the left. Then under “Site Collections” select “Create Site Collection”. Now on the left half of the screen in the “Web Site Address” section you can click “Define Managed Paths”, do that and create a managed path for the URL you want, in this case /hr. The type of path should be Explicit inclusion as shown below.
After clicking Add Path then OK go back into the “Create Site Collection” screen. On this screen give the site collection a name, in this case “Human Resources” (you can imagine that HR wanted this back up quickly). Select /hr (or whatever you just created) in the URL field. For the template either select the template if you know what it is, or select the custom tab and choose “< Select template later… >”. Set the primary and secondary admins and the quota template as needed and click OK.
You’ve now prepped the production SharePoint site to accept the data. Now comes the really annoying part. There’s no way to import (that I could find) just part of the site from the backup file as I didn’t want a site collection but rather a subsite. In order to get just a subsite into the backup file I had to create a temporary web application (another TCP listener) for SharePoint that was separate from the main site. This is also done from Central Administration by clicking on “Application Management” from the left hand menu. Then on the right under “Web Applications” click the “Manage web applications” link. From here you can create a new Web Application by clicking the new button at the top left of the screen. Give it the TCP port and the IIS web site name as needed with the needed security information based on your companies policies. You’ll then need to go into Application Management and under Databases use the “Manage content databases” link to add a content database to the new Web Application. You can then restore the entire old SharePoint site to this new temporary Web Application by using the import-spweb powershell cmdlet as shown below.
import-spweb http://site:port -Path ‘c:\backups\YourBackupFile.cmp’ -force
Once that is done we can now backup just the specific sites that we want to move. Go back into Central Administration and select “Backup and Restore”. Under “Granular Backup” click the “Export a site or list” link (you can see it in the first screenshot above). In the backup screen select the Site Collection and Site that you want to copy to the new farm, specify a filename to backup to (I used c:\denny_backups\hr\hr.cmp) and click “Start Export”. Once it’s finished use import-spweb to import the data into the production site as shown below.
import-spweb http://site/hr -Path ‘c:\denny_backups\hr\hr.cmp’ -force
Repeat this last paragraph for each site that you need to move (and that you have already created site collections for.
Needless to say when this was all done the client was happy because everything was back up and running (after doing a little bit of DNS changing) so all and all it was a successful day.
As we all know by now AlwaysOn Availability Groups are an enterprise edition feature and SQL Server Clustering is a standard edition feature. Butt what happens when you have a small business that is running its apps on SQL Server Express. Can’t SQL Express have any sort of high availability?
Officially the answer is no, however with a little bit of creative configuration you sure can.
The Overall Environment
To setup SQL Server Express in a Windows Cluster I’m building this on a two node Windows Server 2012 cluster using a file share hosted on my domain controller to host the actual databases. To ensure that the domain controller is rebooted as little as possible the domain controller is installed in core mode. The cluster nodes are Windows Server 2012 standard edition (which now supports clustering) as is the domain controller.
As SQL Server 2012 express edition doesn’t support Windows Clustering out of the box the installation will be a little different from doing a normal clustered install under standard or enterprise edition. To install I did a normal SQL Express install on node1. The only change from a normal install that I made was that I configured the SQL Server instance to start under a domain account. When I got to the data directories part I configured the data folder to a network share on the domain controller.
Once the installation on node1 was completed I stopped the SQL Server services. Then I renamed the folder that I installed the SQL Server database files into. The reason for this is that I need to configure the second instance to put the database files into the same location. I can then install SQL Server 2012 express edition onto the second node.
The installation on node2 is done exactly like it was done on node1.
Once the installation is done on both nodes configure the SQL Server service to have a startup type as “Manual” instead of disabled or automatic. Leave the SQL Agent service as disabled as even though SQL Express installs the SQL Agent the SQL Agent isn’t supported on SQL Express.
Once the installation on Node2 is done the cluster can be configured. To do this bring up the Failover Cluster Administrator on one of the nodes and connect to the cluster. If the cluster hasn’t been configured yet run through the normal Clustering Configuration wizard.
We’ll now configure a new cluster role on the cluster. To do this right click on “Role” then select “Configure Role” from the context menu as shown below.
When the wizard opens click next to get to the list of services. Then select the Generic Service item from the list as shown below.
On the next screen you’ll be asked what service you wish to cluster. From this list select the SQL Server service as shown below.
On the next screen you’ll be asked to name the resource group. Give the group a name which is unique on the domain and click next. The next screen will ask you to select the needed storage. Simply click next on this screen as we aren’t using any local shared storage. The next screen asks you if any registry settings need to be replicated between the machines. We don’t need to replicate anything as SQL Server doesn’t make much use of the registry for the actual SQL Server service so we can simply click next on this screen as well. The next screen is simply a screen to review the changes which will be made. You can simply click next on this screen after reviewing the information on the screen. When the summary screen displays click finish.
Post Clustering SQL Config Changes
The first change that you’ll need to make is to enable the TCP network protocol on both nodes. By default SQL Express has the TCP network protocol disabled which need to be corrected before uses will be able to connect to the SQL Server service.
The next change that you’ll need to make is to change the local server name in the master database from the name of the last node which was installed to the cluster name using a script similar to the one shown below. In the case of this script the nodes are named node1 and node2 and the cluster name is clustersql. Once this script has been run the SQL Server instance should be restarted or failed over to the other node.
exec sp_dropserver ‘nodeb’
exec sp_addserver ‘clustersql’, local
At this point the cluster is up and running and applications can have their databases configured on the SQL Server Instance.
I’m so happy to be able to announce that I’ll be giving a precon at SQL Saturday 194 over in England on Friday March 8th, 2013. This precon will be will be on SQL Server Security which is a topic which I’m quite familiar with.
During this full day precon we will review a variety of ways to secure your SQL Server databases and data from attach. In this session we will review proper network designs, recommended firewall configurations, and physical security options. We will also review our data encryption options, password protection options, using contained databases, and AlwaysOn Availability Groups security. There will also be discussions about additional measures which should be taken when working with Internet facing applications.
From there we will move to the most dangerous attack vector SQL Injection including all the ways that attackers can use SQL Injection to get into your system and how to protect against it. The security options for database backups is the next topic on the list followed by proper SAN security designs. We will then finish up by reviewing the auditing options which are available and how they can be used to monitor everything else which we’ve just talked about during the course of the day.
A pretty cool feature that was introduced in SQL Server 2008 R2 SP1 is the ability to change the auto stats algorithm from the default of 20%+500 rows to a sliding scale. This feature is only available by turning on trace flag 2371 so it won’t be turned on by default for you.
When turning on this trace flag because you’ve got a large database the goal obviously is to use autostats so you’ll need to have auto stats turned on. In addition you’ll want to turn on the “Auto Update Statistics Asynchronously” setting for the database or databases which hold the super large table. The reason that you’ll want to turn on the auto stats async update feature is that if you don’t you may see queries time out when auto stats starts to kick in.
Auto stats as we all know update the statistics when the correct number of rows in the table have changed. When you’ve got very large tables that then trigger auto stats to run if it takes more than 30 seconds for the update stats command to run the query that triggered the auto stats to time out, which causes that transaction to roll back, which means that the auto stats command will also roll back. So the next query will then fire the auto stats update and the process will repeat over and over.
The symptoms that you’ll see on the SQL Server are queries which are timing out at random even though the execution plan looks totally normal. You’ll also see a massive amount of IO being generated on the disks which are hosting the database as auto stats does a lot of querying of the table and as auto stats is running over and over you’ll be thrashing the hard drive pretty quickly.
When you turn on the async auto stats setting on the database when the auto stats is triggered by the SQL Server the query in question doesn’t have to wait for the update stats command to finish. Instead the update statistics command runs in the background letting the query continue to run as normal. Now the query will run using the old statistics which is probably OK in this case as they were ok 2 seconds earlier so if they are used for a few more seconds it probably isn’t all that big of a deal.
Now I don’t recommend turning this setting on for every database on the server. All of the smaller databases will update statistics just fine within the timeout period.
So the table that I ran across in my case where I had to turn this setting on took over 2 minutes in order to manually run update statistics on the table so using synchronous statistics updates via auto stats was basically useless.
Altaro Software, a fast-growing developer of backup solutions for Microsoft Hyper-V, today announced that it is giving every Microsoft Hyper-V administrator 50 free licenses of Oops!Backup, their desktop backup solution.
“Following the success of our Hyper-V Backup solution this year, we wanted to give something back to the Hyper-V community during the holiday season” commented David Vella, CEO of Altaro. “Hyper-V admins can give out these licenses to their colleagues, friends and family, for use at work or at home.”
Oops!Backup is a popular desktop backup solution that allows users to preview & restore versions of their files from different points in time.
Any network administrator who uses Microsoft Hyper-V is eligible for the free license keys, they simply need to visit the Altaro website, send in a screenshot of their Hyper-V Manager and expect an email with their respective keys.
To claim the 50 free licenses users should check out http://www.altaro.com/hyper-v/50-free-pc-backup-licenses-for-all-hyper-v-admins
More information about Oops!Backup is available at http://www.altaro.com/home-pc-backup/
For more info about Altaro Hyper-V Backup check out http://www.altaro.com/hyper-v-backup/?LP=Xmas
Altaro’s Hyper-V portal is available at http://www.altaro.com/hyper-v/
Note: Giveaway expires on Monday December 24th. Licenses are Not-For-Resale (NFR) keys.
Founded in 2009, Altaro Software is a fast-growing developer of user-friendly backup solutions aimed at SMBs. Based in Malta, Europe. Altaro’s products including Oops!Backup, Backup FS and Hyper-V Backup are available worldwide from Altaro and its network of value added resellers. For further information please visit www.altaro.com
Some of Altaro’s 9,000+ Customers Include:
Harvard University, New Zealand Defence Force, Max-Planck Institute, Los Alamos National Laboratory, Princeton University, US Geological Survey, SRI International, Swedish Defence Research Agency, Mitsubishi Polyester Film, University of Delaware, University of Miami and many more.
As database professionals there isn’t much of a reason that we actual need to go into the office, other than to make other people in the office feel better because we are actually there in person. In practice we don’t ever need to actually touch the physical hardware. Meeting can easily enough all be done via phone, video chat, etc. (trust me, I do this all the time). While it does take a little getting used to when it comes to using web conferencing and conference calls, but it really isn’t all that hard, I promise.
Working from home has some pretty big benefits when it comes to productivity. The main one that I’ve seen in my years of working from home is that my productivity when working from home goes through the roof because people can’t just drop by to chat about whatever. While they can still call or message on Skype, MSN, Yahoo, AIM, etc. those can be easily enough ignored if it isn’t something important. Without the drive by conversations it’s amazing how much more I am able to get done during the day. When it comes to getting from one meeting to another you can kill a lot of time just walking from one conference room to another. At home I just hang up the phone and dial into the next conference bridge.
From a technical perspective everything that we data folks do is remote anyway. We log onto servers using SSMS from our desktops. Even if we RDP into a server to patch it or do some “local” work, we are still remote. I’ve used remote desktop to connect to servers from WiFi in a hotel bar in Amsterdam so that I could get some work done. Once you are used to not needing to physically touch the servers you are golden.
There are some financial benefits for the companies as well. On average a standard cubical is about 8’x8′ or 64 square feet or space, per employee. Every month those 64 square feet of space need to be paid for. Figure it costs a couple of dollars per square foot for office space (I’m just guessing here, and your costs will vary widely depending on where you live) that’s a couple hundred dollars per employee, per month just to have me in the office. That doesn’t include the power for my computers, AC to keep me comfy, etc. And all of this is just to keep people happy because they can see me in the office. If it’s that important have people come in one day a week for meetings, working from the conference room or some sort of shared space which can be used for other stuff while the remote workers aren’t actually in the office.
As we hopefully all remember, happy workers stick around longer and are much more productive. If we can make our employees more productive and not spend any money doing it (and save the employee money on things like gas) why shouldn’t we?
During TechEd Europe I got a very scary phone call. A very large table was throwing errors that the value being inserted into the primary key column was overflowing the data type that makes up that column. In this case the data type in question was INT, so we were trying to stick the number 2,147,483,648 into the column and that number just doesn’t fit. The error that we got looked something like this:
System.Data.SqlClient.SqlException: Error 8115, Level 16, State 1, Procedure SomeStoredProcedure, Line 103, Message: Arithmetic overflow error converting IDENTITY to data type int. Uncommittable transaction is detected at the end of the batch. The transaction is rolled back.
Ourshort term solution was quick and easy, to change identity seed of the column from 0 to the lowest possible number, in this case -2,147,483,648 which got us back up and running pretty quickly. This however isn’t a very long term solution for this system. The application is only about 3 or 4 years old, and has been growing very quickly over the last year or so. We estimated based on the last few weeks worth of data that we would run out of negative numbers within about 12 months at the most. We sent an estimate of 9 months to the business when we advised them that the system was back up and running. We also told them that we wanted this fixed within 5-6 months to be save because if we didn’t get this fixed before running out of negative numbers there wouldn’t be any short term fix and that we’d be looking at a multi-day outage to fix the problem.
We couldn’t just rush into a database side fix, as the fix in this case is to change the data type from INT to BIGINT. As the application does use this column in a couple of places the .NET application needed to be reviewed to ensure that anything that was looking for an INT was corrected to handle the BIGINT correctly.
Based on the amount of data within the table (about 300 Gigs) it was decided that taking an outage to make the change in place wasn’t really an option as doing the size change in place would require somewhere around a 5 day outage to remove and rebuild all the non-clustered indexes. To make things a little more complex there is a second table which has a 1=1 relationship with this table, and the second table is even larger (about 800 Gigs), though thankfully the second table doesn’t have any non-clustered indexes.
The solution that was decided on was to move the data from the current table to a table with a BIGINT data type while the application was running. To do this meant that we needed to get all the data copied over to the new table and in sync while the old table was being used. It was decided that the easiest way to do this would be to use triggers. In this case instead of one complex trigger to handle all insert, update and delete operations three separate triggers were used for each of the two tables. First I created two new tables, with the exact same schema as the old tables, with the exception that the new tables used the bigint data type for the primary key instead of the int data type for the primary key and the new table was setup with the ident being the next available positive number. Once that was done the triggers were setup. The insert trigger is pretty basic. Take the data that was just loaded and stick it into the new table.
CREATE TRIGGER t_MyTable1_insert ON dbo.MyTable1
SET IDENTITY_INSERT MyTable1_bigint ON
INSERT INTO MyTable1_bigint
(Col1, Col2, Col3, Col4…)
SELECT Col1, Col2, Col3, Col4
SET IDENTITY_INSERT MyTable1_bigint OFF
The update and delete triggers required a little more logic. The trick with the triggers was that I needed to avoid doing massive implicit data conversions. In order to ensure that SQL was doing what I wanted (which it should be doing anyway, but it made me feel better doing explicit conversions) I explicit conversions into place for the JOIN predicates as shown. The update trigger is shown first, then the delete trigger.
CREATE TRIGGER t_MyTable1_update ON dbo.MyTable1
SET MyTable1_bigint.[Col2] = inserted.[Col2],
MyTable1_bigint.[Col3] = inserted.[Col3],
MyTable1_bigint.[Col4] = inserted.[Col4],
WHERE cast(inserted.Col1 as bigint) = MyTable1_bigint.Col1
CREATE TRIGGER t_MyTable1_delete ON dbo.MyTable1
DELETE FROM MyTable1_bigint
WHERE Col1 in (SELECT cast(Col1 as bigint) FROM deleted)
Once these tables were up and running all the new data changes were being loaded into the table. At this point it was just a matter of coping the existing data into the table. There are a few ways that this can be done. In this case I opted for an SSIS package with a single data pump task, and two data transformations within the data pump task with one transformation for each table.
In order to make the load as fast as possible I used the fast load option and loaded 1000 rows at a time. Within the data task if there was an error I redirected the rows to another data pump task which simply dropped the rows into the same table, but this time going row by row. Any failures from that import were simply ignored. While handling failed rows like this is time consuming it is easier than running T-SQL scripts to verify which rows are needed and which rows aren’t needed. SSIS also gives an easy option to ignore the foreign key relationship between the two tables so if the child table gets rows first that isn’t a problem as we know that the parent table will catch up. The SSIS package looked like this:
When all is said and done and the data is in sync between the new and old tables, the current tables will be dropped and the new tables will be renamed and put into place so that the application can continue to run without issue, with just a few minutes of downtime.
So why did this happen? When the applications database was being designed the developers didn’t think about how many rows the database was going to get over time, so they didn’t account for needing to support more than 2.1 billion rows over time. If I (or another data architect) had been involved in the project at it’s start this hopefully would have been caught at design time. However when the application was first being designed the company was brand new and didn’t have the funds for a data architect to help with the application design so this problem was missed.
Hopefully you never hit this problem, but if you do this helps you get out of it.