Back in the old days one of the big reasons that people enforced such strict rules on data normalization was because it would greatly increase the amount of information you could fit on the hard drive. Back when everything was measured in kilobytes or megabytes storing redundant information simply wasn’t an option. You had to be able to cram as much information into that little database.
These days space isn’t the issue. Storage is very cheep these days, and if you have the money you can get absolutely massive amounts of storage. 1 TB hard drives are easily found, and when you start talking about RAIDing drives together the storage limits start to become just insane. 100 TB of storage just isn’t what it used to be.
The problem now becomes that with so much data for the database platform to search through we need to make sure that the indexes which SQL Server is searching as as small as possible so that we can get those indexes read from the disk and into memory as quickly as possible.
Say we have an Employee table which looks like this.
CREATE TABLE Employee
Now this table is clearly not setup as a normalized table. We are assuming that this table has best case indexing. If we have 1000 employees records and we are going to search the Department column the index must be read from disk. Now assuming that the department names are nice long bureaucratic names which average 30 characters each we have about 30000 bytes (more if you take the fill factor into account) which needs to be read from the disk. Now if we had a relational version of the table.
CREATE TABLE Employee
When we search the Employee table we now are only loading an integer value from the disk into memory. This means that we only have to load 8000 bytes of data from the disk into memory. Now knowing that SQL Server reads data from the disk in 8k blocks even with a 50% fill factor the index only takes 2 blocks on the disk and can then be completed in two read operations to the disk. Now our original table with no fill factor will require 30 data pages (20.297 to be specific), about 60 if we use the same fill factor.
With tables of this size this isn’t much. But as your database grows the number of additional data pages which would have to be read for a simple search of the table grows exponentially.
The SQL Server Service Broker is a fantastic new addition to SQL Server. For those who have used Microsoft Message Queue (MSMQ) the service broker will be easy to understand as it’s the same basic concept. Messages are sent from one place to another, within a database, from database to database, or from server to server.
While the configuration can be a daunting task, once the service broker is setup it is a very solid system which can handle a large message load.
The Service Broker give you guaranteed delivery order, with single processing of messages always in the order received. Messages can be processed on demand (you write software which queries the service broker on a schedule) or automatically via the queue activation.
Check back for information about all the various service broker object an how to configure each of them.
Quest Software has just published the first of a few videos which I recorded with them.
You can download or view the video from here.
I’ve got to tell you, I had an absolute blast recording the video.
The nice folks at the Inland Empire .NET User Group have invited me to come and speak to them. This is a great speaking opportunity for me as they are about 20 minutes from my house.
I won’t be speaking there until December 9, 2008 (it’s amazing just how far in advance some of this stuff gets scheduled).
I’ll be giving two presentations at the meeting. The first will be the ever popular Query Tuning, and the second will be a talk on the SQL Server 2008 Resource Governor. The address and directions to the meeting can be found on the IE .NET User Group web site. If you are going to attend there meetings they have an RSVP link on the site.
I’ll post about this meeting again closer to the meeting.
As we all know SQL Server, will for good or bad, cache execution plans. Up until now it’s been very tough to see the cached execution plans. You pretty much had to grab the query or procedure (with the parameters) and run it with Query Analyzer and have it return the execution plan. The only other option was to use SQL Profiler to capture the query along with the execution plan which it used.
Starting in SQL Server 2005 you now have the ability to query the system catalog and see what query plans are stored in the cache, as well as the queries which trigger that plan to be used, as well as how many times that cached plan has been used.
The first DMV to look at is sys.dm_exec_cached_plans. This DMV gives you the size of the plan, the handle of the plan (which is binary), and the number of times the plan has been used.
The next DMV to look at is sys.dm_exec_requests. This DMV gives you the currently running processes, wait information, all the session settings for the processes, etc. It also includes the handle for all the plans which are currently being run. If you join this sys.dm_exec_requests to sys.dm_exec_cached_plans you can see exactly which processes are using which plans. Because this shows you the spid (the session_id column) and the plan_handle you can see what users are running which commands.
The next object is a system table function called sys.dm_exec_query_plan. This function accepts a single input parameter of a plan handle. There are several columns in the output, the most important of which is the query_plan. This is the actual xml plan which is stored in the SQL Server.
The last object we’ll be looking at is the sys.dm_exec_plan_attributes table function. This function also accepts a single input parameter of a plan handle. This function returns the various session settings which were in place when the plan was created. This is important information to have when working with query plans, as changing a single ANSI connection setting will force SQL to create a new execution plan.
Don’t forget how to view the XML execution plan in the UI.
I’ve seen some very creative ways to see how much free disk space SQL Servers have. Most make use of xp_cmdshell which some very complex dos commands or vbscripts.
However there is a much easier way. Microsoft has included the system extended stored procedure xp_fixeddrives. It returns the all the fixed drives and the amount of free space in megs.
That “s” is new in SQL Server 2005. It means that the SPID is a system process. For example my database is having a problem with tempdb running out of space for no reason. When I look in the log I see that SPID 118s is the offending SPID. When I watch SPID 118 using sp_who3 I see that the output looks a little strange. I see a lot of CPU and disk load and that the thread and been logged in for ever using the sa account but with no hostname, host process, or mac address listed. This is because this is the service broker process which is what runs the activated procedures. Starting in SQL Server 2005 system processes no longer need to run with SPIDs less than 50. Because of this, Microsoft has decided to make live easier when trying to identify the system processes by putting the s after the SPID number in the logs.
This same information appears to hold true for SQL Server 2008 (so far at least).
When you have data in your table that you need to remove the DELETE statement is the way to do that. Using the DELETE statement without any WHERE clause will remove all the data from your table. The WHERE clause works the same way as the WHERE clause for SELECT and UPDATE statements.
DELETE FROM Employee
WHERE EmployeeId = 4
You can also JOIN to a table when you delete data from a table which allows you to use the second table within the filter. This requires a slightly different syntax than your normal DELETE syntax.
FROM Employee e
JOIN Department ON Employee.DepartmentId = Department.DepartmentId
AND Department.DepartmentName = 'Sales'
As you can see you need to set an alias for the table you are deleting from, and put that alias between the DELETE and FROM words.
P.S. Sorry this didn’t come up sooner, I could have sworn that I wrote it and published it back when I started the series.
I’ve just published a new tip on SearchSQLServer.com called Configure SQL Server Service Broker for sending stored procedure data. In it I’m going over SQL Service Broker and how to configure all the objects.
As a followup to this I’ll be going over all these objects as part of the Back To Basics series over the next several weeks.
While I think that the one click database encryption that Microsoft has built into SQL Server 2008 is a good idea, but I’m not sure how useful it actually is. It’s touted as giving you data encryption of the entire database without any code change.
What this actually means is that if someone stops the SQL Server and copies the data files to another server and attaches the files to that SQL Server the files won’t be of any use. This however isn’t the common way that data is stolen from a SQL Server. A much more common method is someone logs into the database using a perfectly legit account and exports the table or tables which contain the useful information to Excel, Access, csv, etc. This method of database encryption does nothing to protect against this as if you query the table directly you are displayed the data in clear text.
All this database encryption does (as it appears to me) is that it protects the data at rest (ie. when the SQL Server service is stopped) and does nothing else to protect the data. If all you are looking for is a compliance check off saying that you data is encrypted for SOX, HIPAA, the California Online Privacy Protection Act of 2003, etc then this will probably do the trick. If you want actual data encryption where the data is not viewable when you query the table unless you know the key to unlock the data this isn’t it.
If you do want that quick and easy encryption for a compliance check off be sure to check your data storage requirements and IO requirements. Encrypted data is typically larger than unencrypted data and you may need to increase your allocated storage or IO requirements in order to have the encryption completed. Also remember that the higher level of encryption you use the more CPU time is required to decrypt the data.
Feel free to share your opinions below. I’m interested to here what others have to say on the topic.