As we all know by now AlwaysOn Availability Groups are an enterprise edition feature and SQL Server Clustering is a standard edition feature. Butt what happens when you have a small business that is running its apps on SQL Server Express. Can’t SQL Express have any sort of high availability?
Officially the answer is no, however with a little bit of creative configuration you sure can.
The Overall Environment
To setup SQL Server Express in a Windows Cluster I’m building this on a two node Windows Server 2012 cluster using a file share hosted on my domain controller to host the actual databases. To ensure that the domain controller is rebooted as little as possible the domain controller is installed in core mode. The cluster nodes are Windows Server 2012 standard edition (which now supports clustering) as is the domain controller.
As SQL Server 2012 express edition doesn’t support Windows Clustering out of the box the installation will be a little different from doing a normal clustered install under standard or enterprise edition. To install I did a normal SQL Express install on node1. The only change from a normal install that I made was that I configured the SQL Server instance to start under a domain account. When I got to the data directories part I configured the data folder to a network share on the domain controller.
Once the installation on node1 was completed I stopped the SQL Server services. Then I renamed the folder that I installed the SQL Server database files into. The reason for this is that I need to configure the second instance to put the database files into the same location. I can then install SQL Server 2012 express edition onto the second node.
The installation on node2 is done exactly like it was done on node1.
Once the installation is done on both nodes configure the SQL Server service to have a startup type as “Manual” instead of disabled or automatic. Leave the SQL Agent service as disabled as even though SQL Express installs the SQL Agent the SQL Agent isn’t supported on SQL Express.
Once the installation on Node2 is done the cluster can be configured. To do this bring up the Failover Cluster Administrator on one of the nodes and connect to the cluster. If the cluster hasn’t been configured yet run through the normal Clustering Configuration wizard.
We’ll now configure a new cluster role on the cluster. To do this right click on “Role” then select “Configure Role” from the context menu as shown below.
When the wizard opens click next to get to the list of services. Then select the Generic Service item from the list as shown below.
On the next screen you’ll be asked what service you wish to cluster. From this list select the SQL Server service as shown below.
On the next screen you’ll be asked to name the resource group. Give the group a name which is unique on the domain and click next. The next screen will ask you to select the needed storage. Simply click next on this screen as we aren’t using any local shared storage. The next screen asks you if any registry settings need to be replicated between the machines. We don’t need to replicate anything as SQL Server doesn’t make much use of the registry for the actual SQL Server service so we can simply click next on this screen as well. The next screen is simply a screen to review the changes which will be made. You can simply click next on this screen after reviewing the information on the screen. When the summary screen displays click finish.
Post Clustering SQL Config Changes
The first change that you’ll need to make is to enable the TCP network protocol on both nodes. By default SQL Express has the TCP network protocol disabled which need to be corrected before uses will be able to connect to the SQL Server service.
The next change that you’ll need to make is to change the local server name in the master database from the name of the last node which was installed to the cluster name using a script similar to the one shown below. In the case of this script the nodes are named node1 and node2 and the cluster name is clustersql. Once this script has been run the SQL Server instance should be restarted or failed over to the other node.
exec sp_dropserver ‘nodeb’
exec sp_addserver ‘clustersql’, local
At this point the cluster is up and running and applications can have their databases configured on the SQL Server Instance.
I’m so happy to be able to announce that I’ll be giving a precon at SQL Saturday 194 over in England on Friday March 8th, 2013. This precon will be will be on SQL Server Security which is a topic which I’m quite familiar with.
During this full day precon we will review a variety of ways to secure your SQL Server databases and data from attach. In this session we will review proper network designs, recommended firewall configurations, and physical security options. We will also review our data encryption options, password protection options, using contained databases, and AlwaysOn Availability Groups security. There will also be discussions about additional measures which should be taken when working with Internet facing applications.
From there we will move to the most dangerous attack vector SQL Injection including all the ways that attackers can use SQL Injection to get into your system and how to protect against it. The security options for database backups is the next topic on the list followed by proper SAN security designs. We will then finish up by reviewing the auditing options which are available and how they can be used to monitor everything else which we’ve just talked about during the course of the day.
A pretty cool feature that was introduced in SQL Server 2008 R2 SP1 is the ability to change the auto stats algorithm from the default of 20%+500 rows to a sliding scale. This feature is only available by turning on trace flag 2371 so it won’t be turned on by default for you.
When turning on this trace flag because you’ve got a large database the goal obviously is to use autostats so you’ll need to have auto stats turned on. In addition you’ll want to turn on the “Auto Update Statistics Asynchronously” setting for the database or databases which hold the super large table. The reason that you’ll want to turn on the auto stats async update feature is that if you don’t you may see queries time out when auto stats starts to kick in.
Auto stats as we all know update the statistics when the correct number of rows in the table have changed. When you’ve got very large tables that then trigger auto stats to run if it takes more than 30 seconds for the update stats command to run the query that triggered the auto stats to time out, which causes that transaction to roll back, which means that the auto stats command will also roll back. So the next query will then fire the auto stats update and the process will repeat over and over.
The symptoms that you’ll see on the SQL Server are queries which are timing out at random even though the execution plan looks totally normal. You’ll also see a massive amount of IO being generated on the disks which are hosting the database as auto stats does a lot of querying of the table and as auto stats is running over and over you’ll be thrashing the hard drive pretty quickly.
When you turn on the async auto stats setting on the database when the auto stats is triggered by the SQL Server the query in question doesn’t have to wait for the update stats command to finish. Instead the update statistics command runs in the background letting the query continue to run as normal. Now the query will run using the old statistics which is probably OK in this case as they were ok 2 seconds earlier so if they are used for a few more seconds it probably isn’t all that big of a deal.
Now I don’t recommend turning this setting on for every database on the server. All of the smaller databases will update statistics just fine within the timeout period.
So the table that I ran across in my case where I had to turn this setting on took over 2 minutes in order to manually run update statistics on the table so using synchronous statistics updates via auto stats was basically useless.
Altaro Software, a fast-growing developer of backup solutions for Microsoft Hyper-V, today announced that it is giving every Microsoft Hyper-V administrator 50 free licenses of Oops!Backup, their desktop backup solution.
“Following the success of our Hyper-V Backup solution this year, we wanted to give something back to the Hyper-V community during the holiday season” commented David Vella, CEO of Altaro. “Hyper-V admins can give out these licenses to their colleagues, friends and family, for use at work or at home.”
Oops!Backup is a popular desktop backup solution that allows users to preview & restore versions of their files from different points in time.
Any network administrator who uses Microsoft Hyper-V is eligible for the free license keys, they simply need to visit the Altaro website, send in a screenshot of their Hyper-V Manager and expect an email with their respective keys.
To claim the 50 free licenses users should check out http://www.altaro.com/hyper-v/50-free-pc-backup-licenses-for-all-hyper-v-admins
More information about Oops!Backup is available at http://www.altaro.com/home-pc-backup/
For more info about Altaro Hyper-V Backup check out http://www.altaro.com/hyper-v-backup/?LP=Xmas
Altaro’s Hyper-V portal is available at http://www.altaro.com/hyper-v/
Note: Giveaway expires on Monday December 24th. Licenses are Not-For-Resale (NFR) keys.
Founded in 2009, Altaro Software is a fast-growing developer of user-friendly backup solutions aimed at SMBs. Based in Malta, Europe. Altaro’s products including Oops!Backup, Backup FS and Hyper-V Backup are available worldwide from Altaro and its network of value added resellers. For further information please visit www.altaro.com
Some of Altaro’s 9,000+ Customers Include:
Harvard University, New Zealand Defence Force, Max-Planck Institute, Los Alamos National Laboratory, Princeton University, US Geological Survey, SRI International, Swedish Defence Research Agency, Mitsubishi Polyester Film, University of Delaware, University of Miami and many more.
As database professionals there isn’t much of a reason that we actual need to go into the office, other than to make other people in the office feel better because we are actually there in person. In practice we don’t ever need to actually touch the physical hardware. Meeting can easily enough all be done via phone, video chat, etc. (trust me, I do this all the time). While it does take a little getting used to when it comes to using web conferencing and conference calls, but it really isn’t all that hard, I promise.
Working from home has some pretty big benefits when it comes to productivity. The main one that I’ve seen in my years of working from home is that my productivity when working from home goes through the roof because people can’t just drop by to chat about whatever. While they can still call or message on Skype, MSN, Yahoo, AIM, etc. those can be easily enough ignored if it isn’t something important. Without the drive by conversations it’s amazing how much more I am able to get done during the day. When it comes to getting from one meeting to another you can kill a lot of time just walking from one conference room to another. At home I just hang up the phone and dial into the next conference bridge.
From a technical perspective everything that we data folks do is remote anyway. We log onto servers using SSMS from our desktops. Even if we RDP into a server to patch it or do some “local” work, we are still remote. I’ve used remote desktop to connect to servers from WiFi in a hotel bar in Amsterdam so that I could get some work done. Once you are used to not needing to physically touch the servers you are golden.
There are some financial benefits for the companies as well. On average a standard cubical is about 8’x8′ or 64 square feet or space, per employee. Every month those 64 square feet of space need to be paid for. Figure it costs a couple of dollars per square foot for office space (I’m just guessing here, and your costs will vary widely depending on where you live) that’s a couple hundred dollars per employee, per month just to have me in the office. That doesn’t include the power for my computers, AC to keep me comfy, etc. And all of this is just to keep people happy because they can see me in the office. If it’s that important have people come in one day a week for meetings, working from the conference room or some sort of shared space which can be used for other stuff while the remote workers aren’t actually in the office.
As we hopefully all remember, happy workers stick around longer and are much more productive. If we can make our employees more productive and not spend any money doing it (and save the employee money on things like gas) why shouldn’t we?
During TechEd Europe I got a very scary phone call. A very large table was throwing errors that the value being inserted into the primary key column was overflowing the data type that makes up that column. In this case the data type in question was INT, so we were trying to stick the number 2,147,483,648 into the column and that number just doesn’t fit. The error that we got looked something like this:
System.Data.SqlClient.SqlException: Error 8115, Level 16, State 1, Procedure SomeStoredProcedure, Line 103, Message: Arithmetic overflow error converting IDENTITY to data type int. Uncommittable transaction is detected at the end of the batch. The transaction is rolled back.
Ourshort term solution was quick and easy, to change identity seed of the column from 0 to the lowest possible number, in this case -2,147,483,648 which got us back up and running pretty quickly. This however isn’t a very long term solution for this system. The application is only about 3 or 4 years old, and has been growing very quickly over the last year or so. We estimated based on the last few weeks worth of data that we would run out of negative numbers within about 12 months at the most. We sent an estimate of 9 months to the business when we advised them that the system was back up and running. We also told them that we wanted this fixed within 5-6 months to be save because if we didn’t get this fixed before running out of negative numbers there wouldn’t be any short term fix and that we’d be looking at a multi-day outage to fix the problem.
We couldn’t just rush into a database side fix, as the fix in this case is to change the data type from INT to BIGINT. As the application does use this column in a couple of places the .NET application needed to be reviewed to ensure that anything that was looking for an INT was corrected to handle the BIGINT correctly.
Based on the amount of data within the table (about 300 Gigs) it was decided that taking an outage to make the change in place wasn’t really an option as doing the size change in place would require somewhere around a 5 day outage to remove and rebuild all the non-clustered indexes. To make things a little more complex there is a second table which has a 1=1 relationship with this table, and the second table is even larger (about 800 Gigs), though thankfully the second table doesn’t have any non-clustered indexes.
The solution that was decided on was to move the data from the current table to a table with a BIGINT data type while the application was running. To do this meant that we needed to get all the data copied over to the new table and in sync while the old table was being used. It was decided that the easiest way to do this would be to use triggers. In this case instead of one complex trigger to handle all insert, update and delete operations three separate triggers were used for each of the two tables. First I created two new tables, with the exact same schema as the old tables, with the exception that the new tables used the bigint data type for the primary key instead of the int data type for the primary key and the new table was setup with the ident being the next available positive number. Once that was done the triggers were setup. The insert trigger is pretty basic. Take the data that was just loaded and stick it into the new table.
CREATE TRIGGER t_MyTable1_insert ON dbo.MyTable1
SET IDENTITY_INSERT MyTable1_bigint ON
INSERT INTO MyTable1_bigint
(Col1, Col2, Col3, Col4…)
SELECT Col1, Col2, Col3, Col4
SET IDENTITY_INSERT MyTable1_bigint OFF
The update and delete triggers required a little more logic. The trick with the triggers was that I needed to avoid doing massive implicit data conversions. In order to ensure that SQL was doing what I wanted (which it should be doing anyway, but it made me feel better doing explicit conversions) I explicit conversions into place for the JOIN predicates as shown. The update trigger is shown first, then the delete trigger.
CREATE TRIGGER t_MyTable1_update ON dbo.MyTable1
SET MyTable1_bigint.[Col2] = inserted.[Col2],
MyTable1_bigint.[Col3] = inserted.[Col3],
MyTable1_bigint.[Col4] = inserted.[Col4],
WHERE cast(inserted.Col1 as bigint) = MyTable1_bigint.Col1
CREATE TRIGGER t_MyTable1_delete ON dbo.MyTable1
DELETE FROM MyTable1_bigint
WHERE Col1 in (SELECT cast(Col1 as bigint) FROM deleted)
Once these tables were up and running all the new data changes were being loaded into the table. At this point it was just a matter of coping the existing data into the table. There are a few ways that this can be done. In this case I opted for an SSIS package with a single data pump task, and two data transformations within the data pump task with one transformation for each table.
In order to make the load as fast as possible I used the fast load option and loaded 1000 rows at a time. Within the data task if there was an error I redirected the rows to another data pump task which simply dropped the rows into the same table, but this time going row by row. Any failures from that import were simply ignored. While handling failed rows like this is time consuming it is easier than running T-SQL scripts to verify which rows are needed and which rows aren’t needed. SSIS also gives an easy option to ignore the foreign key relationship between the two tables so if the child table gets rows first that isn’t a problem as we know that the parent table will catch up. The SSIS package looked like this:
When all is said and done and the data is in sync between the new and old tables, the current tables will be dropped and the new tables will be renamed and put into place so that the application can continue to run without issue, with just a few minutes of downtime.
So why did this happen? When the applications database was being designed the developers didn’t think about how many rows the database was going to get over time, so they didn’t account for needing to support more than 2.1 billion rows over time. If I (or another data architect) had been involved in the project at it’s start this hopefully would have been caught at design time. However when the application was first being designed the company was brand new and didn’t have the funds for a data architect to help with the application design so this problem was missed.
Hopefully you never hit this problem, but if you do this helps you get out of it.
One of the companies which I work with has the occasion to create new databases when they do releases of their software. Normally this isn’t a problem, except that they are setup to use maintenance plans to handle the backup and pruning of their transaction logs. As all the new databases are created in the full recovery model this can end up causing some problems for them as within 12 minutes they start getting emails saying that the transaction log backup job has failed. And these emails will keep coming in, possibly for hours until the full backup job kicks in later that night.
To solve this problem, I added a DDL trigger to the server which will cause the new database to be backed up as soon as the database is created. The trigger itself is rather simple. Most of the trigger is setting variables. Then I make sure that the database isn’t a database snapshot, as database snapshots can’t be backed up. If it isn’t a snapshot we continue with everything else.
Then I create a subfolder for the backups to be put into (the backups for each database go into their own folder, so as this is a new database the folder needs to be created). Then I commit the transaction, as database backups can’t be taken within a transaction. Then we do the actual database backup. I then throw a message to the end user using the RAISERROR statement telling them that they can ignore the other error that they are going to get about the transaction being closed before the trigger was complete. This is just an annoyance of my needing to commit the transaction before taking the backup. Sure I could have setup a job which takes the backup and emails if there was a failure, but that just seems to complex for something so simple. The code for my trigger is below.
CREATE TRIGGER BackupNewDatabase
ON ALL SERVER
declare @database sysname, @event_data XML = EVENTDATA(), @folder nvarchar(4000), @file nvarchar(4000)
SET @database = @event_data.value(‘(/EVENT_INSTANCE/DatabaseName)’, ‘sysname’)
set @folder = ‘X:\Backups\’ + @database
set @file = @folder + ‘\’ + @database + ‘.bak’
if exists (select * from sys.databases where name = @database and source_database_id is null)
EXEC master.dbo.xp_create_subdir @folder
BACKUP DATABASE @database to disk=@file
raiserror( ‘You can ignore the error message which says that the transaction ended within the trigger.’, 16,1)
Hopefully you find this solution helpful if you get into a situation like this,
Today was the 1st day of the SQL PASS Summit and the keynote was full of some great announcements and demos. The keynotes are a massive with almost all of the 3894 conference attendees from 57 different countries.
All of the keynotes from this years conference not only are the keynotes being streamed live, but many of the sessions are also being streamed live via the SQL PASS website on PASS TV. The full schedule can found on the PASS TV page.
Bill Grazino told us turning his keynote that pass has sponsored 9 non-PASS events around the world. Including these 9 events, pass has provided well over 500,000 hours of training to members of the community, much of it like SQL Saturday’s, User Groups, etc. being available to the attendees free of charge. PASS has created a business analytics specific conference which will be held April 10-12 2013 in Chicago, IL. More information will be available on the SQL PASS website shortly.
When Ted Kummert took the stage he talked about some of the great products that Microsoft has released this year including SQL Server 2012, the Surface, Windows Server 2012, Office 2013 as well as many others.
Ted announced that SQL Server 2012 has been released as of today. This release aligns with Office 2013 and the new SharePoint 2012. Ted also announced project Hekaton which is a new in memory database engine which will be introduced within the next major release of SQL Server. Project Hekaton is a fully in-memory database engine which will greatly improve OLTP performance. While this new engine is a part of the SQL Server database engine, this portion has been totally rewritten to really maximize the hardware under the SQL Server database engine.
Including Product Hekaton is the Hekaton AMR Tool which (when renamed) will help identify bottlenecks which can be resolved by converting the database to a Hekaton database engine. Because the Hekaton engine is running with the entire database being in memory latching has effectively been removed. Hekaton also supports recompiling stored procedures from running T-SQL to running as native machine code. This allows the stored procedures to run much, much faster without any actual code change. While not all workloads are going to be right for the Hekaton database engine, but those that are should see a massive performance improvement. Because project Hekaton runs within the same database engine that we have today if you know SQL Server you will already know how to work with a project Hekaton database.
Another great announcement is the clustered column store which will be introduced in the next major release of SQL Server. Along with the clustered column store index we will also be able to update tables which have column store indexes created on them, no matter if the column store index is a clustered column store index or a non-clustered column store index. This will introduce massive performance improvements to real time data warehouses and make data warehouses much easier to setup and performance tune; even with the largest workload.
For the non-relational side of things you can today download the Microsoft HDInsight Server CTP which is Microsoft’s Hadoop offering.
Ted also announced a new update to the Microsoft PDW which will be available in the 1st half of 2013 and will be called the SQL Server 2012 Parallel Data Warehouse. This update will include some of the features which will be included in the next major release of SQL Server such as the updatable column store index. Another big enhancement of the new SQL Server 2012 PDW is that SQL Server Management Studio will now be the query tool which is used to run the queries against the PDW system. During the demo against the PDW doing a simple count against a 1PB table the query ran in just a few seconds. Doing data aggregations against the same table for a real world style reporting query the query again ran in just a few seconds.
The next announcement was Polybase. Polybase allows you to query data from an external source from within the SQL Server 2012 PDW. This will allow you to create an external table which simply links to a Hadoop cluster. This will be done with the CREATE EXTERNAL TABLE statement which accepts the schema that you want and a hdfs URL to Hadoop without needing to create a linked server to the Hadoop cluster.
The next step of BI is here in SQL Server 2012 SP1 and Office 2013. Office 2013 includes PowerView natively within the product. It also supports some new mapping features natively within Office 2013 without any additional downloads or plugins. This support includes term extraction which is done through HDInsight. This allows us to view text data and extract names, sentiment, locations, etc. from simple text data and use those extracted strings in order to build PowerView reports. PowerView also now supports DAX queries to a MOLAP cube straight out of the box.
Stay tuned for more updates from the PASS Summit through out the week.
With the release of SQL Server 2012 and specifically the AlwaysOn feature we need to rethink the way that we handle the licensing for servers that will be passive servers. In the past it’s been assumed that passive servers didn’t need to be licensed, and that was true, for the most part. The official docs read that you get one passive server per active server that you have licensed, provided that you have a current Enterprise Agreement. Without the Enterprise Agreement in place any and all passive servers need to be licensed.
With SQL Server 2012’s AlwaysOn feature we have an active server called the Primary Replica and we have up to 4 secondary replicas. Even if none of those 4 secondary replicas are in use for anything, you will still need to license two of them to be properly licensed. This is because when licensing SQL Server’s each licensed server gets you only a single free passive server. So for a 5 instance AlwaysOn Availability Group deployment you’ll need to license at least 3 of those instances which would give you two passive instances. As long as those two passive instances aren’t being used for read access they are free.
While I applaud the hard work and ingenuity of the sysadmins who works through the days after hurricane Sandy to keep the generators running, one thing kept coming to my mind. Why was this needed?
I’ve moved lots of companies into data centers before, and each time I’ve done so it has included a site visit during to RFP process to check on things like cooling, generator, site configuration, etc. During these site visits I stick my head through every door that the sales guy and the engineer who are doing the tour will open. If they won’t open a door for me they better have a damn good reason to not show me what’s in there.
If I went to do a site visit at a facility that was located just at sea level just a few blocks from the ocean, they’d better have some pretty impressive plans for how they are going to handle flooding. If the answer was “we’ve never had a problem with flooding” or something similar they’d be off my list as they haven’t done their due diligence to insure that the site will be online during the worst emergencies possible.
Now before you start telling me that I’ve got no idea what I’m talking about, and that data centers in the North East have different problems from data centers in the South West. I actually do as I’ve moved companies into data centers on both coasts. Most recently I moved the data center for Phreesia from a managed service facility in Texas to a colo in New Jersey. As a part of this move we looked at a number of facilities in the New York / New Jersey area. Many of the New York City data centers were eliminated due to cost, or being just to close to the water as we didn’t want to deal with situations like this exact one.
The data center we settled on is in New Jersey about a 30-40 minute drive from the Phreesia office (once you get out of New York City). While the data center is near a river, the river is a little over a mile away. The data center itself is on a slope with a cinder block wall on the outer edge which will divert water away in the event of a river overflow (it also protects the facility from someone driving a car or truck into the facility). The generators and fuel pumps are all mounted on raised feet (not very tall, but tall enough) so that they keep running in the event of a flood. The cables from the generators to the building have all been buried under the ground so that tree branches which are torn loose during the hurricane can’t cut those cables.
Our diligence in selecting a data center paid off. While the folks mentioned in that article were dragging buckets of fuel up 17 floors worth of stairs the team at Phreesia just sat back and rode out the storm with their families. The sites didn’t go down and the team didn’t have to rush into a very hazardous situation. The team was able to focus on their families and neighbors without having to worry about the systems. Those of us that aren’t in the New York area monitored the systems remotely and founds that everything was running perfectly normally just on generator power instead of city power.
This entire event just shows that when doing data center evaluations the worst possible case situation needs to be planned for and expected. If they aren’t they are going to come back and bite you. Especially in todays world of storms which are ever increasing in destructive power.
If you are in a facility which has risks such as fuel pumps which are below sea level (they are in the basement and the road is at sea level) then a second data center becomes very important very quickly. This became very clear during this hurricane when some very large important websites went offline because they didn’t have that DR facility that they could fail over to, unlike sites like StackOverflow (and ServerFault and the Stack Exchange network).
If you are at risk now is an excellent time to sit down with management and go over all the “what ifs” of your current hosting solution and think about the cost of a DR site along with the cost of not having one.