SQL Server with Mr. Denny


December 20, 2011  2:00 PM

Me, Sean and Jen all talking politics & more

Denny Cherry

So last weekend I had the chance to fly over to Dallas and hang with Sean and Jen McCown (the Midnight DBAs) and be the guest on their web show, live and in person, for the episode “Bring in the Nerds”.  Sean has informed me that the recording is up on the site and available for your viewing.  I show up about 10 or 15 minutes into the show, as I had a late flight out there.

Just do keep in mind that there was beer involved, so the language might require headsets if there’s a manager or child around.

Denny

December 19, 2011  7:36 PM

Look out SQL Bits, here I come!

Denny Cherry

That’s right, I’m headed out to the SQL Bits conference in London, England.  Apparently I’ve been on my best behavior this year, because the guys at SQL Bits have even selected me to deliver a pre-con on the Training Day.

My training day class will be a full day presentation focused on Storage and Virtualization for the DBA (or other database professional).  So get signed up for the training day session and learn what you need to know to talk to the storage team so that they don’t look at you like you are a bumbling fool.  Don’t wait, get registered for SQL Bits and my training day session today.

I’ll see you there, I can’t wait.

Denny


December 19, 2011  2:00 PM

2011 Blog Post Wrap Up

Denny Cherry

With the end of 2011 quickly approaching, I decided to throw all the stats from this blog into Excel so I could see which of the blog posts I’ve written are the most popular.  The results are pretty interesting and provide a bit of a view into what people have been reading on my blog over the course of the year.

So the most popular blog posts that people have looked at this year are:

#1 – How To Configure DTC on Windows 2008

#2 – Back To Basics: The Update Statement

#3 – Back To Basics: Clustered vs. NonClustered Indexes

#4 – What Exactly Is MSDTC And When Do I Need It

#5 – New Insert Syntax in SQL Server 2008

#6 – Difference Between an Index And a Primary Key

#7 – SQL Server 2012 Licensing Changes

#8 – If You Aren’t Doing Backups You Aren’t Doing Your Job

#9 – Single Instance vs. Multiple Instance

#10 – You Can’t Deploy A SQL 2008 SSRS Report to a SQL 2005 SSRS Server

When I was looking at the data for these blog posts I thought it was kind of interesting that the things people want to read about the most are MS DTC and the more intro-level material.  So I’m going to take this information and attempt to use it over the course of 2012 and write more towards the kinds of things that people are looking for.  I can only assume that people are finding these blog posts via search (some of them, like the DTC posts, are a couple of years old at this point) so these are the kinds of things that people want to see.

Another interesting thing that I saw was that over the course of 2011 the page views for the most popular pages varied widely.  The most popular post (the DTC post) had over 18k page views this year, while #10 (the SSRS post) had just a little over 3k page views.  That is quite a wide gap between the two (technically the second most popular page that people viewed was the home page of my blog as that came in at #2, but I excluded that from the list because who knows what they were reading).

If you have a blog I encourage you to look at your metrics, if you aren’t already, and see what people are reading, then write more about that topic.  Exclude the outliers, like my SQL 2012 licensing post, that are one-offs and that you know don’t need to be beaten on week after week, month after month.  But when you are writing about topics, make sure that they are interesting to you (if they aren’t, then the posts will be bland and boring to your readers) and that you are putting your own spin on the topic, not writing about the same stuff that everyone else has written.

Denny


December 15, 2011  2:00 PM

3rd Party Performance Monitoring Tools & CXPACKET

Denny Cherry

There are lots of great 3rd party monitoring tools on the market from a variety of vendors.  I want to start this post by saying up front that I’m not bashing them, or throwing stones.  That said…

Some of the time these third party apps don’t handle reporting CXPACKET waits very well.  The reason, usually, is that the third party apps typically show the wait type of the thread which has been waiting the longest for that session.  Often this will be the CXPACKET thread.  Because of this it can be hard to use the 3rd party apps to find the actual root cause of the problem.

To find the root cause of these problems which look like CXPACKET problems you need to find the threads which aren’t waiting on CXPACKET.  There are a few ways you can do this; probably the easiest is to query the sys.sysprocesses compatibility view.  While this view is deprecated it isn’t going anywhere yet, and when it does you can still get this information from the other DMVs.  Simply query sys.sysprocesses (or dbo.sysprocesses if you are on SQL 2000) using the problem SPID in the WHERE clause.

SELECT *
FROM sys.sysprocesses
WHERE SPID = {Some Session ID}

You can then look at all the wait types being reported for that SPID and see which of the other wait types is the actual problem.  Typically when I see this it’s usually an IO wait type or SOS_SCHEDULER_YIELD.  If the problem is SOS_SCHEDULER_YIELD then (in my experience) you’ve probably got an indexing problem that needs to be resolved.  You’ll then want to grab the execution plan for that query (sp_whoisactive is a great way to get the execution plan using the @get_plans=1 parameter) and figure out the missing index problem.
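
If you’d rather go straight to the newer DMVs, a minimal sketch along these lines pulls the per-thread waits for a session and filters out the CXPACKET rows (the session id 57 is just a placeholder for whichever SPID your monitoring tool is flagging):

SELECT wt.session_id,
       wt.exec_context_id,
       wt.wait_type,
       wt.wait_duration_ms
FROM sys.dm_os_waiting_tasks wt
WHERE wt.session_id = 57 /* the problem SPID */
  AND wt.wait_type <> 'CXPACKET'
ORDER BY wt.wait_duration_ms DESC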

Now there can be other reasons that you’ll be CXPACKETed, so please don’t think that this will work every time. But it should work a lot of the time.

Denny


December 12, 2011  2:00 PM

Fixing Performance in Vendor Apps

Denny Cherry

Lots of people out there manage 3rd party vendor apps.  However, as anyone who has managed a vendor app can tell you, you don’t really manage the app.  You simply back up the vendor database (assuming that they allow you to back up the database) and call the vendor when there are performance problems.  Doing anything other than this will basically invalidate your support agreement with the vendor and prevent you from upgrading in the future.

Now in reality many of these 3rd party vendors don’t have anyone on staff who knows much about SQL Server, or databases in general.  And even when they do, you sure don’t get the ability to talk to the people who actually know anything about databases.  So when you start seeing performance problems you basically have a few different options.

1. Call their support line and talk to someone who thinks there is an “E” in SQL (they spell it sequel).

2. Leave the system with performance problems.

3. Fix the performance problem and hope that the vendor doesn’t notice the indexes that you created.

Obviously option #2 probably isn’t the one that the business unit that uses the application is going to accept.  Option #1 is what the vendor wants you to do, but that’s probably going to be less than useful, possibly taking days or weeks to get the issue resolved.

In reality #3 is the option that you want to go with, but there is that little problem of the support agreement to deal with.  Obviously I would never recommend invalidating your support agreement … but when you need to get your production application back up and running, you need to get the system back up and running.  Now you can’t do crazy things like change the database schema but usually that isn’t needed to get everything back up and running.

Usually it’s something as simple as updating statistics or creating a new index.  Now you need to hide this index from the vendor so that they don’t complain that you’ve “broken” the system by fixing it.  That’s why I recommend creating the indexes so that they are easy to identify later.  When I create indexes in a vendor system I prefix them with something that the vendor will never use, like my name.  This way I can quickly query for all the indexes and drop them the next time I need to let the vendor into the database (don’t forget to script them out first so you can put them back afterwards).
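
As a quick sketch of what that looks like (the table, columns, and prefix here are made up for illustration), the prefix makes the indexes trivial to find, script out, and drop later:

-- Create the index with a prefix the vendor would never use
CREATE NONCLUSTERED INDEX DennyCherry_Orders_CustomerId
ON dbo.Orders (CustomerId)
INCLUDE (OrderDate)
GO
-- Find everything with that prefix so it can be scripted out and then dropped
SELECT OBJECT_NAME(i.object_id) AS table_name, i.name AS index_name
FROM sys.indexes i
WHERE i.name LIKE 'DennyCherry[_]%'
GO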

While the vendor will probably complain that you’ve added these indexes, they can’t actually say that you’ve done anything damaging if you’ve removed them as well.  Not to mention they probably won’t actually be able to tell that you ever created and dropped the indexes once they’re gone.

While this probably isn’t the best way to have to deal with 3rd party vendor applications it is the reality of dealing with 3rd party vendors.

Denny


December 8, 2011  8:43 PM

TweetDeck 1.0 has been released.

Denny Cherry

Yep, Twitter (who now owns TweetDeck) has released v1 of TweetDeck.  My first impression is “Don’t do it, it’s a trap”.  At first I was all happy when I downloaded v1 because it wasn’t an Adobe AIR application.  Just a good old MSI installer which installed an HTML5 application.  The downside is that they totally screwed up the app.  Here is a list of the things which I don’t like (this list may grow over time):

  1. To change the notification settings for each column, I have to click a different settings button for each column.
  2. The Send Tweet box doesn’t show at the bottom (or top) like it did before.  It pops up in the middle of the app, can’t be moved, and greys out the app.
  3. Hitting Enter in the send tweet screen doesn’t send the tweet.
  4. Going along with the prior two, I now have to use my mouse to send a tweet (twice) where before I could just Alt+Tab to TweetDeck, send a tweet, and Alt+Tab back.
  5. It defaults to using Twitter for URL shortening and image uploads.  It does still have the old settings in there though; when I changed it back to bit.ly it still had my username and key in there.
  6. It didn’t retain all my column information.
  7. The Add column process is not intuitive.  First you select the column then the account which that column applies to.  Totally backwards.
  8. The Direct Message column is now the Inbox and shows in Conversation view so I have to click on the conversation to see the entire DM that just came in.  (See pic.)
  9. There’s no language translation feature.
  10. There’s no indicator that I have or haven’t read a tweet.
  11. There’s no support for LinkedIn, Buzz, FourSquare or MySpace.  Only FaceBook and Twitter are supported.
  12. The colors and font aren’t changeable.
  13. The escape key doesn’t close the settings dialog window.
  14. I’ve got notification popups enabled, but they don’t appear to actually do anything.
  15. Closing the column settings window requires clicking exactly on the settings button above. Clicking on the column opens that specific tweet, which is pretty much never what I want.
  16. Proxies which require authentication appear to not be supported, either when the workstation is on the domain or when it is a standalone machine.

So far the new TweetDeck v1 does one thing, and only one thing, better than the prior versions.  It leaves the prior version installed so that you can still use the older v0.38.1 build that we have all been using for months.  Which I am switching back to … right about now.


December 8, 2011  2:00 PM

MVP Summit First Timers, I’ve got a webcast for you

Denny Cherry

So for all those new (and returning) Microsoft MVPs who are going to be attending the MVP Summit (in February 2012) for the first time, I’m putting together a little webcast just for you so that you don’t feel quite so lost during the summit.  After all, this is the one time of the year that we get the bulk of the Microsoft MVPs together to have some fun, haze the new guys, and learn some stuff that we can’t talk about with anyone.

The webcast will be January 10th, 2012 at 1pm Pacific Time (8pm GMT) at this website.  There’s no sign-up needed; just give your name, company, etc. on the date in question.  This is a totally informal, unofficial webcast that I’m putting together because I remember what it was like to be a first-timer at the MVP Summit and be totally lost when I got there.  I’ve even put together an Office calendar entry (the link has been changed and is working now) to remind you about the meeting.

On the day of the meeting do NOT use the web client; only use the actual Live Meeting client.  The web client sucks and the audio won’t work.

I will be recording the meeting and posting it online after the fact.

Denny


December 5, 2011  2:00 PM

Schema Design Changes shouldn’t just be done once

Denny Cherry

Just because you did a schema design change before doesn’t mean that you shouldn’t look into doing another one.  A while back, before I started working with Phreesia, they decided to fix some performance problems by federating some data.  The data in question is basically a large name/value pair table which holds survey answers.  The performance problem that they were trying to solve was that they allow people to fill out the survey more than once, and when they change an answer only the changed values are saved.  Figuring out the newest set of answers became a very expensive query to run.  Because of this, and to make index rebuilding faster, it was decided to federate the data across multiple tables, basically named answer_1, answer_2, answer_3, etc.

This was working fine for a good year or two, but then the system grew.  As the database got more data loaded into the tables, the number of tables that were members of the federation grew to 125.  There was a view used for reading data which did a UNION ALL between the 125 tables.  Querying this view required massive amounts of memory just for storing the execution plans, a whole 8 Megs per execution plan that hit the view.  And there were two of these views, because the blob data was being stored in another set of federated tables.  The first view was just the answer_{n} tables and the other view joined all the answer_{n} tables to the tables holding the blob data, called answer_lob_{n}.  The way the tables were designed was that if the metadata for the lob data was in answer_1 then the actual lob data was in answer_lob_1, and if the metadata was in answer_2 then the lob data was stored in answer_lob_2.  This made the definitions of the two views look something like this.

CREATE VIEW basedata
AS
SELECT *
FROM answer_1
UNION ALL
SELECT *
FROM answer_2
UNION ALL
...
GO
CREATE VIEW lobdata
AS
SELECT *
FROM answer_1
JOIN answer_lob_1 on answer_1.AnswerId = answer_lob_1.AnswerId
UNION ALL
SELECT *
FROM answer_2
JOIN answer_lob_2 on answer_2.AnswerId = answer_lob_2.AnswerId
UNION ALL
...
GO

Now when monitoring this server the CPU was running at about 30% during business hours and the PLE (Page Life Expectancy) was running about 21k seconds (about 5.8 hours). The server has 24 cores in it running at about 2.8GHz and 256 Gigs of memory.
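
(If you want to watch the same counter on your own servers, Page Life Expectancy is exposed through sys.dm_os_performance_counters and the value is reported in seconds; a quick query like this one will show it.)

SELECT object_name, counter_name, cntr_value AS ple_seconds
FROM sys.dm_os_performance_counters
WHERE counter_name = 'Page life expectancy'
  AND object_name LIKE '%Buffer Manager%'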

Working with the development team we decided to move the data from the answer_{n} tables into a single table called answer, and the data from the answer_lob_{n} tables into a single table called answer_lob. Making this change on this system was actually pretty easy. Thankfully 100% of the access to these tables was done through stored procedures within the database. Some of the procedures were using dynamic SQL, such as the inserts, so that the data would get into the correct table, but there weren’t all that many stored procedures in the database so there weren’t that many to deal with.
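
Conceptually the data move itself was just an INSERT…SELECT from each federated table into the new table, something along these lines (the column names here are invented for illustration since the real schema isn’t part of this post, and a copy of this size would realistically be broken up into batches):

-- Illustrative only: the real column list isn't shown here
INSERT INTO dbo.answer (AnswerId, SurveyId, QuestionId, AnswerValue)
SELECT AnswerId, SurveyId, QuestionId, AnswerValue
FROM dbo.answer_1
GO
-- ...repeated (or driven by dynamic SQL) for answer_2 through answer_125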

When we made this schema change there was another very important change that we made. We added a bit column to the answer table which identified the row as being the most current version of the person’s answer. This then allowed us to create filtered indexes on the table that had this bit column in the WHERE clause, so that we could create VERY small indexes when compared to the total size of the table.
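
A minimal sketch of the kind of filtered index this makes possible (the column names, including the bit column IsCurrentAnswer, are placeholders rather than the real schema):

-- Only rows flagged as the current answer end up in the index,
-- keeping it tiny compared to the 500 million row table
CREATE NONCLUSTERED INDEX IX_answer_Current
ON dbo.answer (SurveyId, QuestionId)
INCLUDE (AnswerValue)
WHERE IsCurrentAnswer = 1
GO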

Because of the changes in the schema, removing the views, ditching all the dynamic SQL, and filtering the indexes, we got some major performance changes. The CPU on the SQL Server went from 30% to 5% during normal business hours (when we made the schema change it was the only change released in that release, so that we could be sure that there were no problems introduced by the change). On the PLE metric I mentioned above we went from a PLE of 21k to a massive 35k, which is 9.7 hours. This is a major improvement on both metrics.

Now the big question is, why didn’t they make the same changes that I had them make before? The answer is that a lot of the SQL Server features that we used (specifically the filtered indexes) weren’t available before, because back then the system was running on SQL Server 2005. One of the changes which I made to the system in order to get it prepped for these schema changes was upgrading to SQL Server 2008 R2, so that we could take advantage of filtered indexes and “optimize for ad hoc workloads” (there’s a lot of dynamic SQL, as the developers like to use LINQ for a lot of the data access for smaller parts of the system, just not for these specific tables).
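
For reference, “optimize for ad hoc workloads” is just an sp_configure setting; it’s an advanced option, so it has to be exposed first:

EXEC sp_configure 'show advanced options', 1
RECONFIGURE
GO
EXEC sp_configure 'optimize for ad hoc workloads', 1
RECONFIGURE
GO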

As you can hopefully see from this example, just because schema changes were made by someone in the past (these changes were made by some VERY capable consultants), when new technology that you now have available comes out, it can be really worth it to change the schema again to take advantage of it. With the changes that have now been made this system can grow to about 19-20x the current load before upgrading is required, whereas with the old schema and performance the system could only grow to 2-3x the current load. And that means a much longer life for the server and much less money needing to be spent upgrading it. Also, if we had waited until the server needed to be upgraded before making the schema changes, moving the data from the old tables into the new table would have taken weeks or months instead of the 10 days or so that it took to migrate the data. You see, the new answer table has 156 Gigs (500 million rows) of data plus 500 Gigs of indexes, and the answer_lob table has 455 Gigs of data (believe it or not this table has only 1.5 Gigs of non-clustered indexes on it).

If you have been putting off doing a schema redesign even after upgrading your database application to a new version of SQL Server (or whatever your database platform is), it might be a good time to look at the system and see what changes you can make.

Denny


December 1, 2011  2:00 PM

When learning database design, be careful where you learn from

Denny Cherry

Every year we have new people joining our ranks as database professionals.  Many of these new people use the systems at the places where they already work as the source of their learning when it comes to database design and data modeling.  However this isn’t always the best idea, unless the systems which they are referencing have been properly designed.

One system which I urge new people to the field to totally ignore as a reference source is SharePoint.  Microsoft’s SharePoint has what we in the database field call a “really screwed up database design”.  The SharePoint database design is so screwed up that the SQL Server Product team has had to add features to the SQL Server product so that the SharePoint team can use them in future versions of the SharePoint product.

Now I understand why people want to model their applications on the SharePoint database.  It’s a very scalable application that can handle pretty much anything which you throw at it.  However as a model for other applications it pretty much sucks.  Being that scalable comes with a performance hit, and as the database gets larger that performance hit just gets larger and larger.  This is just one of the reasons that it is recommended that SharePoint content databases be limited in size to 100 Gigs for SharePoint (MOSS) 2007 and 200 Gigs for SharePoint 2010.

The problem then becomes: how are these new people supposed to know what a crappy database design looks like, so that they know not to model other databases after it?  That falls to the people they work with, to make sure that they have good, solid systems to learn from when figuring out how to build applications properly.  When they work in a small shop as the only developer there isn’t anyone to learn from.  That’s where blogs and websites from known people in the field become key to learning how to do things correctly.

But whatever you do, don’t use SharePoint as the source of your database design knowledge.  It is about the worst possible example you can use.  I understand that Forefront Identity Manager is actually worse, but I haven’t dug into a FIM database yet.  I will be looking into a FIM system that was set up so that I can see just how bad that database design is.

Denny


November 28, 2011  2:00 PM

Using CNAMEs with clustered file share resources

Denny Cherry

One of the questions that I see on the forums every once in a while is how to get a CNAME working with a clustered network share.  For example, if you have a cluster where the server names are “SERVER1” and “SERVER2” and the virtual name is “VIRT1”, how can you allow users to connect to the network shares by using the server name “FILESERVER”, so that they don’t need to change their mapped drives, etc. after the server FILESERVER has been migrated to the cluster?

The old way of doing this would have been to simply add a CNAME to the internal DNS server so that the name FILESERVER pointed to the name VIRT1.  However in Windows 2008 and higher this doesn’t work anymore without doing some registry hacking.  While the registry hacking will work, it is a pain to maintain, as every time you migrate another server to the cluster you need to go and edit the registry key on all the nodes of the cluster, then reboot all the nodes.  Not exactly the easiest solution to implement.

Fortunately there is a much easier technique which can be used.  Simply add another network name to the cluster pointing to the IP address of the virtual server.  This will add another DNS entry so that your users can resolve the network name correctly, and it registers the network name with the cluster so that it knows to accept it as a valid name, all without needing to restart the nodes of the cluster or edit the registry on all the nodes.

To do this open Failover Cluster Manager, right click on the service or application, and select “Add a Resource” from the context menu.  Then select “1 – Client Access Point”.  The wizard that opens will allow you to add a network name to the cluster resource group.

Denny

