EMC World was a blast this year. I've posted some highlights of the event throughout the week. I'll continue to post as much of the info as I can throughout the next few weeks.
Wednesday at EMC World was a session-packed day to be sure.
I started my morning with CLARiiON Rebuild Settings and Data, which was an in-depth look at how exactly the CLARiiON systems handle rebuilds and how long various rebuilds take. We also went into detail on how the CLARiiON can detect that a disk is going to fail. Instead of waiting for it to fail and then having to rebuild, the array will copy the data from the disk ahead of time to a hot spare, then mark the disk as failed so that it can be replaced before it actually fails, thereby leaving no point in time where your data is unprotected.
Another session went through the changes to the EMC PowerPath product and all the new features being built into PowerPath, such as the encryption of data as it leaves the server on its way to the disk.
Another session went through some tuning tips and tricks for getting the best performance from the EMC CLARiiON product, including stripe size and cache settings at both the LUN and SP level.
The night ended with the Billy Crystal performance. I figured that the show would be funny. I was wrong; it was hilarious. Like everyone else I've seen Billy Crystal on TV and in movies and thought he was funny, but in person his was probably the best comedy performance that I had ever seen.
Yesterday was a very busy day. I didn't have time to think, much less put together a post about it. I made it to every session I was looking for, including the always hard-to-get-into Navisphere Manager hands-on workshop.
The session I probably got the most out of was the one on what's new in FLARE version 26, which was released a few months ago.
FLARE 26 now supports Active/Active presentation of LUNs. What this means is that in the event of a fibre cut on either the front end or the back end, the host machine (server) no longer needs to trespass the LUN to the other SP. The host can simply send the IO request to the other SP. The non-preferred SP will then forward the request to the preferred SP automatically for completion. Once the preferred SP's connectivity comes back online, requests will be sent to the preferred SP again. The newest version of PowerPath is required for this to work, or the native multipathing driver (such as the Windows 2008 driver) must support ALUA.
Support for handling a broken connection between the host and the storage comes from the ALUA standard. Handling the request when the connection is broken between the SP and the DAE is an EMC-only extension of ALUA.
FLARE 26 also includes RAID 6 support. When comparing RAID 6 with RAID 5 on the same system, read performance will typically be better as the data is spread across all the drives in the RAID 6 array. Unlike a lot of other systems, the EMC CLARiiON array spreads the parity sectors of RAID 5 and RAID 6 across all the drives in the RAID Group. Because there is an extra drive in the array, a 4+2 RAID 6 RAID Group will give better read performance than a 4+1 RAID 5 RAID Group. When doing a full stripe write, the write speed between a RAID 5 and RAID 6 array will be basically the same. When doing smaller writes, a RAID 5 array will have a faster write time than a RAID 6 array because RAID 6 has the extra parity to account for. The rebuild times after a single failed drive will be about the same between a RAID 5 and a RAID 6 array. If the RAID 6 array has to recover from a dual drive failure, it will take longer to recover than from a single drive failure, as the data must be recalculated from the two parity blocks rather than from one. However, the odds of a dual disk failure are slim.
Just like RAID 5, RAID 6 within the CLARiiON supports the proactive hot spare. This is where, when the system sees that a drive is going to fail, it will automatically copy the data from the failing disk to a hot spare and mark the disk as bad. As the data does not have to be rebuilt, this is a very quick operation.
FLARE 26 now supports a Security Administrator role. Members of this role have no access to the storage settings themselves; they can only create accounts within the array.
A very important change is that the SPs can now be set up to sync their system time to a networked NTP time server. This keeps the time on the SPs the same. Until now the times could drift apart, which could make tracking down event information very hard to do, as the log entries would have different times in each SP's log file.
FLARE 26 now supports replication over the built-in iSCSI ports on the new CX3 line of systems. This is a great change, as before you had to use the iSCSI ports on an FC-IP switch to do this replication. This includes SAN Copy, MirrorView, etc.
MirrorView /S should only be used for connections within ~100 miles, as beyond that you start to get too much latency between the arrays.
Starting later this year (Q3 or so) there will be an extension to MirrorView /S called MirrorView /SE (Cluster Enabler) for Microsoft Cluster Service. This will give you the ability to use the CLARiiON to set up a geographically dispersed cluster. In other words, you can have servers in two different cities set up in a Windows cluster.
I find myself with a little extra time between my last session of the day, which was a SQL 2008 session, and the Grand Opening Reception, so I figured that a blog post was in order.
The SQL 2008 session was a what's-coming-in-SQL-2008 overview. Most of the information was stuff that I already knew and have covered here in some form or another: FILESTREAM, new data types, etc. One new feature was presented, though, called the Remote Blob Storage Architecture. It's not actually part of SQL Server but of the SQL Native Client (the SQL 2008 SQL Server driver). The RBSA is an API which lets the application write a file directly to the file system within the context of a SQL Server transaction. This way, if the transaction is rolled back, the file is deleted automatically as part of the client transaction rollback.
This is the first I have heard of it, so I'll have to find out more and post about it. The only thing Google can find is an MSDN forum thread from March 2008. There was some very rough VB.NET code presented as an example. I'll try to get my hands on that, clean it up a bit, and see what I can figure out. As soon as I do I'll get it posted.
If anyone has any additional information about the RBSA please feel free to post it. I’m sure that I’m not the only one who would like to see more information about it.
I’ve got 5 or 6 sessions tomorrow, but they are more focused on VMware and the EMC CLARiiON product line. So be sure to check out tomorrow if you are interested in those.
Here are some of the key points from the keynotes at EMC World.
Gartner says that by 2011 there will be 1,337 exabytes of data in existence.
179 Exabytes of information has been created so far this year.
In the next two years we will see a trend moving further away from using Tape for backup and recovery. It will still be used for long term archiving, but day to day restores will come from disk.
Turning raw data into useful information is becoming more and more of a challenge for IT departments and business units.
What I'm gathering from the keynotes is that the amount of data is exploding. I see this as a great thing for us DBAs, as the bulk of the data that companies have is going to be stored within the database. I know that the company I work for (Awareness Technologies) creates about 110 gigs of information per day. With our product's policy we keep this data for 14 days (unless the customer pays us to keep the data longer). This means that we have about 1.5 terabytes of information within our data center that our customers are looking at on a daily basis. That much data, with such a high change rate, is just a crazy amount of data churn.
Sites with high data change rates have specific challenges, some of which I will be talking about (specifically how they relate to Microsoft SQL Server) on 6/11/2008 at 12:00 PM PST when I present "geekSpeak: Spatial Data in SQL Server 2008 with Denny Cherry". I'll post the link to the webcast when I get it.
Since my laptop battery is dying I'll have to wrap this up for now. Check back later for more.
While waiting for the first keynote to start I figured that this would be as good a time as any to write a quick post.
I just got out of the first session of the morning. I decided to catch the VMware VI3 session which went over the new features of ESX 3.5.
It was a very good session packed with great info about the new features of VMware ESX 3.5 which came out a few months ago.
3.5 introduces Storage VMotion. It's similar to regular VMotion, which moves VMs from one host to another, but it's designed to move a VM's storage from one disk (LUN, array, etc.) to another within the same host without taking the VM down. In prior versions of VMware, to move a VM from one disk to another you would need to power the machine down and then move it. It can now be done as a live operation. It does temporarily double the memory and CPU requirements of the guest OS, and it is a CPU- and disk-intensive operation that should be done off hours, but it's a great way to move from a full disk to an empty one.
Another great technology I was introduced to is VDM, or Virtual Desktop Manager. As I understand it, this technology allows you to take a single Windows XP guest OS and let many people connect to it. The VDM places a connection pool in front of the Windows XP guest OS, and as people connect, it takes a snap of the guest OS and lets the new user use the snap. This allows many people to use a single virtual machine without increasing the amount of disk space required, reducing the IO load of your Windows XP (or Vista) guests on the host.
As the Keynotes proceed I’ll try and go over what they are talking about.
Funniest comment of the concert would have to be “This is the first all you can eat buffet rock show I’ve ever been to.”
[kml_flashembed movie="http://www.youtube.com/v/er3xnXS_9vs" width="425" height="350" wmode="transparent" /]
The video was taken by my wife (who also had a great time, thanks for coming with me). What you don't see (but you can hear) is John Rzeznik talking to one of the people serving the food about 2 feet from the front of the stage. If anyone has better shots or video please post them and links to them.
I’ll continue to post from the conference as best I can.
I've added another picture to the group.
The message type is the first of the Service Broker-specific objects. The message type defines the name of the message and performs any validation that Service Broker needs to do on the contents of the message prior to sending it. If the message is being sent to another database (on the same server or a different server), the message type must exist in both databases.
Validation can be one of four options.
- None (my personal favorite, especially for sample code)
- Empty
- Well_Formed_XML
- Valid_XML With Schema Collection
The first three are pretty straightforward. None = no validation is done. Empty = the message must be empty. Well_Formed_XML requires that the XML be a valid XML document. The fourth option requires that not only must the XML be valid, but it must fit a pre-existing XML schema collection (an XSD which has been loaded into the SQL Server).
The syntax to create a Message Type is pretty basic.
CREATE MESSAGE TYPE [MessageTypeName]
VALIDATION = NONE
That's it. If you need to set the owner of the object, use the AUTHORIZATION clause.
When using the validation of NONE it is up to the receiving code or application to verify that the data within the message exists and is valid.
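For the stricter options, here is a sketch of what the XML-validated flavors look like. The message type and schema collection names here are made up for illustration; the schema collection (an XSD loaded into the SQL Server) must exist before the message type can reference it.

-- Hypothetical schema collection for illustration
CREATE XML SCHEMA COLLECTION [OrderSchema]
AS N'&lt;xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema"&gt;
      &lt;xsd:element name="Order"&gt;
        &lt;xsd:complexType&gt;
          &lt;xsd:sequence&gt;
            &lt;xsd:element name="OrderId" type="xsd:int" /&gt;
          &lt;/xsd:sequence&gt;
        &lt;/xsd:complexType&gt;
      &lt;/xsd:element&gt;
    &lt;/xsd:schema&gt;';
GO

-- Any well-formed XML document is accepted
CREATE MESSAGE TYPE [OrderMessageXML]
VALIDATION = WELL_FORMED_XML;
GO

-- Only XML that validates against the schema collection is accepted
CREATE MESSAGE TYPE [OrderMessageValidated]
VALIDATION = VALID_XML WITH SCHEMA COLLECTION [OrderSchema];
GO

The trade-off is that the schema-validated option costs CPU on every send, which is why NONE plus validation in the receiving application is a common choice.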
Next week I’ll be stepping into the lion’s den of sorts. I’ll be off to Las Vegas to attend EMC World. I went last year, and I’m pretty sure that I was about the only full time DBA there. The storage folks treated me pretty nice, even after they found out that I was a DBA.
Hopefully when I get back I’ll have all sorts of new storage tips and tricks to share with everyone.
If anyone else is brave enough to attend let me know here. Maybe we can meet up at one of the events.
Back in the old days, one of the big reasons people enforced such strict rules on data normalization was that it greatly increased the amount of information you could fit on the hard drive. Back when everything was measured in kilobytes or megabytes, storing redundant information simply wasn't an option. You had to cram as much information into that little database as you could.
These days space isn't the issue. Storage is very cheap, and if you have the money you can get absolutely massive amounts of it. 1 TB hard drives are easily found, and when you start talking about RAIDing drives together the storage limits start to become just insane. 100 TB of storage just isn't what it used to be.
The problem now becomes that with so much data for the database platform to search through, we need to make sure that the indexes which SQL Server is searching are as small as possible, so that we can get those indexes read from the disk and into memory as quickly as possible.
Say we have an Employee table which looks like this.
CREATE TABLE Employee
(EmployeeId INT,
FirstName VARCHAR(50),
LastName VARCHAR(50),
Department VARCHAR(50)) -- columns are illustrative; the department name is stored directly in the row
Now this table is clearly not set up as a normalized table. We are assuming that this table has best-case indexing. If we have 1,000 employee records and we are going to search the Department column, the index must be read from disk. Assuming the department names are nice long bureaucratic names which average 30 characters each, we have about 30,000 bytes (more if you take the fill factor into account) which need to be read from the disk. Now consider a relational version of the table.
CREATE TABLE Department
(DepartmentId INT,
DepartmentName VARCHAR(50)) -- illustrative lookup table

CREATE TABLE Employee
(EmployeeId INT,
FirstName VARCHAR(50),
LastName VARCHAR(50),
DepartmentId INT) -- the row now holds only an integer reference
When we search the Employee table we are now only loading an integer value from disk into memory. An INT is 4 bytes, so the 1,000 index entries come to roughly 4,000 bytes (a little more with row overhead). Knowing that SQL Server reads data from the disk in 8 KB pages, even with a 50% fill factor the index takes only a couple of pages on disk and can be read in a couple of operations. Our original table's index, at roughly 30,000 bytes, needs about 4 data pages with no fill factor (30,000 / 8,192 is about 3.7), and roughly 8 if we use the same 50% fill factor.
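If you want to see how many pages an index actually occupies rather than estimating, SQL Server 2005 and later expose this through a dynamic management function. The table name here assumes the sample Employee table above.

-- One row per index level; page_count is the number of 8 KB pages used
SELECT i.name AS index_name,
       ps.index_level,
       ps.page_count,
       ps.avg_page_space_used_in_percent
FROM sys.dm_db_index_physical_stats(DB_ID(), OBJECT_ID('dbo.Employee'),
                                    NULL, NULL, 'DETAILED') AS ps
JOIN sys.indexes AS i
    ON i.object_id = ps.object_id
   AND i.index_id = ps.index_id;

The DETAILED mode is needed to populate the page-space column; for a quick page count on a big table the cheaper LIMITED mode works too.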
With tables of this size this isn't much. But as your database grows, the number of additional data pages which must be read for a simple search of the table grows right along with the row count, and the gap between the two designs keeps widening.