Yesterday was Day 3 of EMC World, and there were more great sessions packed full of technical information. Yesterday was also the last day that the exhibit hall was open, so it was Apple iPad giveaway day as well. Sadly, I didn’t win one.
The first session that I want to recap here was the SAN meets NAS session that I attended. One of the big takeaways from EMC World was the technology that EMC is putting into all of the mid-tier products. This includes the EMC Celerra, which is EMC’s Network Attached Storage product. The Celerra is basically an EMC CLARiiON with no fibre channel ports and a NAS connector on it, with a lite version of NaviSphere Manager running on it (unless you get the gateway only, in which case you present LUNs from another fibre channel storage platform). What the FAST package lets you do is have the hardware automatically move less-used data from expensive storage to cheaper, slower storage. This allows you to keep the data online so that your users can access it, but the access times will be just a little bit slower. Instead of having a 5ms response time it may have a 50ms response time, but just for the files which are older and haven’t been touched in a while. Continued »
So yesterday was Day 2 of EMC World, and my body is really starting to feel it. All of yesterday’s sessions were top notch.
The first session yesterday was by Bruce Zimmerman. For those of you in the SQL Server community who are reading this, Bruce is to the EMC storage community as Bob Ward is to the SQL Server community. Bruce talks on EMC CLARiiON performance tuning every year at the 400-500 level. Here are some of the highlights.
When using NaviSphere Analyzer to monitor the utilization of your array you may not be getting the correct picture. Metrics such as utilization are not true measurements but calculations. In the case of the utilization metric, Analyzer simply looks at the load that each Storage Processor (SP) is putting on the RAID Group and reports the higher of the two numbers. If you have a single LUN on the RAID Group, or all the LUNs on the RAID Group are owned by a single SP, this isn’t an issue as the other number will be 0. But if the LUNs on a single RAID Group are owned by both SPs and each SP is running the RAID Group to 40%, Analyzer will show a 40% load instead of an 80% load on the RAID Group.
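To make the arithmetic concrete, here is a minimal sketch of the difference between what Analyzer reports and the combined load, based only on the description from the session. The function names are my own, and capping the combined figure at 100% is an assumption on my part, not something Analyzer does.

```python
def reported_utilization(spa_pct, spb_pct):
    """What Analyzer shows, per the session: the higher of the two per-SP figures."""
    return max(spa_pct, spb_pct)


def estimated_combined_utilization(spa_pct, spb_pct):
    """Rough estimate of the real load: both SPs' contributions added together.

    Capping at 100 is an assumption made for this sketch.
    """
    return min(spa_pct + spb_pct, 100)


# Both SPs drive the same RAID Group to 40% each.
print(reported_utilization(40, 40))            # what you see in Analyzer
print(estimated_combined_utilization(40, 40))  # closer to what the disks see
```

If every LUN on the RAID Group is owned by one SP, the second argument is 0 and the two functions agree, which is why the problem only shows up with mixed ownership.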
If you need to dump Analyzer data to a CSV file via the naviseccli command, use the -archivedump switch. (Someone asked me about this via Twitter a while back, which is why I made sure to include it.)
If you monitor the performance of your Storage Processors you may see the CPU spike to 100% at regular intervals. This interval will correspond to the data logging interval that you have set within NaviSphere Manager. While this CPU spike may worry you, unless your normal CPU load on the Storage Processor is very high, it will not affect your throughput. If you are concerned that it is affecting your throughput through the Storage Processors, try disabling the data collection for a period of time in the SP properties.
If you look at the NaviSphere properties for the array you’ll see two settings for data logging: one for the background process, and one for live data capture. If these settings are different then the data logging happens at the lower of the two intervals. Most people should set both of these options to 300 seconds unless they need to capture data more frequently than that for a specific reason.
One improvement in FLARE release 29 that you’ll notice is that the load placed on the Storage Processor by the data logging process has been reduced by about 80%, which is a huge savings. You’ll also notice that with release 29, when doing a non-disruptive update (NDU) the CPU load on each Storage Processor has to be below 65%. In older versions the CPU load had to be below 50%. This change was made because the background management processes which the array performs can account for about 7-8% of CPU (per SP), and these processes don’t fail over.
Another naviseccli trick is to include the -np flag for all your commands. This will tell naviseccli not to poll the array for response information. Now if you need to get information back from the array when you run your commands, you won’t want to include this. For example, if you create a LUN and have the array assign the LUN id, and you want to do something with the LUN id later in the script, you’ll need to exclude the -np switch. However, if you specify the LUN id and don’t care about the feedback, including the -np flag will save the Storage Processor quite a bit of work, as the CLI requests a good deal of information from the Storage Processor for each CLI command that is issued.
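As a rough illustration of that rule of thumb, here is a small sketch of a script helper that only adds -np when the script doesn’t need anything back from the array. The helper name, the placeholder command arguments, the SP address, and the position of the -np and -h switches on the command line are all assumptions for illustration; check the Navisphere CLI reference for your FLARE release before relying on it. The sketch only builds the argument list and does not run anything.

```python
def build_naviseccli_args(sp_address, command_args, need_response):
    """Build an argument list for naviseccli (hypothetical helper).

    -np is only added when the script does not need output from the array,
    for example when the LUN id was chosen by the script itself.
    """
    args = ["naviseccli", "-h", sp_address]
    if not need_response:
        args.append("-np")  # skip polling the array for response information
    args.extend(command_args)
    return args


# Placeholder command arguments; substitute the real subcommand and options.
fire_and_forget = build_naviseccli_args("10.0.0.1", ["<command>", "<options>"], need_response=False)
needs_lun_id = build_naviseccli_args("10.0.0.1", ["<command>", "<options>"], need_response=True)
print(fire_and_forget)
print(needs_lun_id)
```

Keeping the decision in one place makes it harder to accidentally drop the response from a command whose output a later step in the script depends on.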
I also gathered a lot of information about VMware in other sessions yesterday.
I’m not sure if this was supposed to be released, but the next release of vSphere (aka ESX) will be in Q3 of 2010 and will be vSphere 4.1. This next release has a lot of enhancements to ease administration and improve integration between ESX and the EMC storage arrays. You can assume that all of these integrations between vSphere and EMC CLARiiON arrays will require FLARE release 30 which should also be coming out in Q3 2010.
The first improvement is the vStorage APIs. This is a set of APIs within vSphere 4.1 and the EMC arrays that allows the vCenter server, or the vSphere server itself (if not running with a vCenter server), to talk to the array directly and perform some actions.
These actions include Bulk Zero Acceleration. When creating a new file, this allows the vSphere host to tell the array to fill the file with 0s instead of having to transmit all those zeros to the array over Fibre Channel or iSCSI. This is done by the vSphere host writing a single block to the array with all 0s in it, then telling the array to replicate that block n times. While this won’t reduce the amount of data that the array has to write, it will reduce your network traffic and because of this may save time. By default this feature will be enabled in vSphere 4.1, but can be disabled in the advanced settings page of the host.
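To get a feel for how much traffic this could save, here is a back-of-the-envelope sketch. The 1 MiB block size and the 40 GiB file size are made-up numbers for illustration, the function name is mine, and the small command that tells the array how many times to repeat the block is ignored.

```python
def zeroing_traffic(file_size_bytes, block_size_bytes):
    """Compare bytes sent over the wire when zeroing a new file.

    Returns (host_zeroing_bytes, offloaded_bytes, repeat_count).
    """
    # Without offload the host transmits every zero itself.
    host_zeroing_bytes = file_size_bytes
    # With offload the host sends one zeroed block...
    offloaded_bytes = block_size_bytes
    # ...and asks the array to repeat it enough times to cover the file.
    repeat_count = -(-file_size_bytes // block_size_bytes)  # ceiling division
    return host_zeroing_bytes, offloaded_bytes, repeat_count


GIB = 1024 ** 3
MIB = 1024 ** 2
host_bytes, offload_bytes, repeats = zeroing_traffic(40 * GIB, 1 * MIB)
print(host_bytes, offload_bytes, repeats)
```

The array still writes the full 40 GiB to disk in both cases; only the traffic between the host and the array changes.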
Another feature is a set of hardware locking changes. Currently when vSphere needs to take a lock on the LUN it locks the entire volume, performs its operation, then releases the lock on the LUN. In vSphere 4.1 it will be able to lock just the specific block on the disk that it wants to work with, then release just that block. This will allow multiple hosts to take locks on the same LUN at the same time without having to wait in line to complete the operation. There are a few places where this benefit will be seen, including boot storms (where you’ve got lots of machines booting at the exact same time) and snapshotting, where more snapshots can be taken at once (as a lock has to be taken on the LUN each time a snapshot’s new file is created). By default this feature will be enabled in vSphere 4.1, but can be disabled in the advanced settings page of the host.
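Here is a very simplified model of why the locking change described above helps. It assumes every operation takes one time step and only counts how many steps have to run one after another. The host names and block numbers are hypothetical, and this is a toy model, not how vSphere actually schedules its locks.

```python
from collections import Counter


def serialized_steps_lun_lock(operations):
    """Whole-LUN lock: every operation waits its turn, one step per operation."""
    return len(operations)


def serialized_steps_block_lock(operations):
    """Per-block locks: only operations on the same block queue behind each other."""
    if not operations:
        return 0
    per_block = Counter(block for _host, block in operations)
    return max(per_block.values())


# (host, block) pairs; two hosts happen to want block 10 at the same time.
ops = [("esx01", 10), ("esx02", 20), ("esx03", 10), ("esx04", 30)]
print(serialized_steps_lun_lock(ops))
print(serialized_steps_block_lock(ops))
```

In a boot storm most hosts touch different blocks, so in this model the block-level case stays close to a single step no matter how many hosts pile on.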
The next feature is called Full Copy Acceleration. This is a great feature which will reduce the amount of traffic between the array and the host when cloning a virtual machine. Today when you clone a file, the file is copied up from the array to the host, then written from the host back to the array in the new location. With this feature enabled (which it is by default), the API will simply tell the array to copy the blocks which make up the file from one location to another, so the file never has to be transferred from the array up to the host. If your network between the array and the host is bandwidth limited, this will reduce the time it takes to clone the virtual machine.
Of the new VMware features which require array integration, there is only one which doesn’t require FLARE 30, and that is the Stop and Resume feature, which requires FLARE 29 on the array. This feature cleans up the way that the guest OSs see that a thin provisioning pool is out of space and the LUN can’t consume any additional space. Prior to vSphere 4.1 (also known as today), if a thin LUN can’t be expanded as needed on the array because there isn’t any space, the guest OS will throw (within Windows at least) a blue screen of death (BSOD) because the page that it’s requesting to write to isn’t available. In vSphere 4.1 an error message will be thrown as a popup within the guest OS which effectively says that there was a problem writing to the disks.
Something which will be coming in Q2 of 2010 (so probably within the next 6 weeks or so) will be the CLARiiON Provisioning Plugin for vSphere. This will let you provision a new LUN on the storage array, and attach it to the VMware Cluster from a single screen which should greatly decrease the amount of time required to provision and attach storage from the array to the server.
I’m curious to see how long it takes other storage vendors to get these APIs working on their arrays (with or without VMware’s assistance).
Check back tomorrow for my Day 3 post.
So yesterday was day 1 of EMC World. I attended some great sessions (and one not so great one).
The first session that I hit was on the future of the EMC CLARiiON’s FLARE software. For those that don’t know what FLARE is, FLARE is the software which runs the array and handles all of its functions. The next release will be FLARE v30. If you are a CX3 or older customer this new release will be of no use to you, as this version only supports the CX4 array.
One of the new features being included is a totally new management interface called Unisphere. This new interface will give a single interface for your EMC CX arrays as well as your Celerra devices and RecoverPoint. Eventually other EMC products will be integrated into Unisphere, with products such as Replication Manager coming hopefully in 2011. Continued »
Yesterday was Day 0 of EMC World, which means that it’s party day. The day started with Registration and the Welcome reception. If you’ve never been to EMC World, registration is probably the longest line in the place. You’ve got all 10,000 or so attendees trying to get checked in. Fortunately for me I’m a returning attendee so my line was much shorter than the general registration line (thank god).
After the welcome reception was the concert featuring the Counting Crows. The Counting Crows put on a pretty good show, but I’d have to say that the Barenaked Ladies are still my favorite concert at EMC World so far.
I took a bunch of pictures at the party and registration which I’ve posted to Flickr.
Probably my favorite picture of Day 0 is this one of me with the walking Celerra.
I’ll try to post daily about everything that I’ve seen throughout the sessions.
My trip to EMC World started a day early this year. With the long trip out here, and the events starting pretty early on Sunday afternoon, flying out on Sunday just doesn’t work unless I want to sleep through the Sunday welcome reception and check in.
And since I don’t want to sleep through either of those events, I came out on Saturday.
On the way from the airport to the hotel I was able to snap a few pictures which I figured that I’d share with everyone. Below are a few of them. Feel free to click through above to see all of them.
The view from the plane was pretty nice for most of the flight.
We were greeted at the airport with some EMC World signs before we even got out of the baggage claim area.
In the cab from the airport to the hotel we got a view of the river going through the city.
Granted these aren’t the best pictures that I’ve ever taken, but I took them on my camera phone so I think they came out pretty good.
Something that some companies like to do is to change the port number that the default instance is listening on as a security precaution. However, this has a habit of stopping anyone from connecting to the default instance without knowing the port number.
This is because the default instance doesn’t register itself with the SQL Browser when it starts, so you can no longer simply connect to the default instance by name, while named instances will work just fine. The fix from CSS is to simply change the port back to TCP 1433. The reason for this is that changing the port number doesn’t do a lot to secure your SQL Server, as a quick port scan will show an attacker which port the SQL Server is listening on. That or they’ll simply check out the web.config and get the port number from there.
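If you are stuck with a default instance on a non-standard port, the client has to name the port itself. Here is a small sketch of building the server part of a connection string in the "host,port" form that SQL Server clients accept. The helper name, host name, instance name and port number are made up for illustration.

```python
def sql_server_data_source(host, port=None, instance=None):
    """Build the Server / Data Source value for a SQL Server connection string.

    An explicit port wins, because a default instance on a non-standard port
    cannot be found by name alone.
    """
    if port is not None:
        return f"tcp:{host},{port}"
    if instance:
        # Named instances can still be located by instance name.
        return f"{host}\\{instance}"
    # Plain host name: the client assumes the default instance on TCP 1433.
    return host


print(sql_server_data_source("dbserver01", port=14330))
print(sql_server_data_source("dbserver01", instance="REPORTING"))
print(sql_server_data_source("dbserver01"))
```

Of course, this is exactly the value that ends up sitting in the web.config, which is why moving the port doesn’t buy much security.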
So last week was SoCal Code Camp and I just realized that I hadn’t gotten the slides posted for the sessions that I did.
If you didn’t submit a session survey at the Code Camp, please go to http://speakerrate.com/mrdenny and fill one out.
The nice folks at Tech Target were nice enough to send me some swag to give away during my SoCal Code Camp sessions in June. They sent some stickers (and who doesn’t love free stickers), a few t-shirts and a bunch of cable holders.
For those of you who attended the SSWUG vConference a couple of weeks ago, hopefully you caught my sessions on Day 1. A good number of the attendees submitted evals, and I figured that I’d go ahead and share the scores with you. Continued »
For those of you who are interested in getting a second shot voucher from Microsoft, drop me an email at mrdenny AT mrdenny.com with the email address that you want the voucher sent to and I’ll kick one over to you. If you want to take more than one certification exam, let me know how many you plan on taking before the end of June.
These second shot vouchers can be used for any Microsoft IT Professional, Developer or Dynamics exam (anything which starts with 070). The second shot vouchers cannot be used for the academic exams (072 and 073 exams).
Each voucher is valid for one exam retake (if the first attempt failed) and one Practice Test discounted 40% before June 30, 2010.
I’ll try and remember to print some up and bring them with me to SQL Saturday in case anyone there wants one.
If you are on twitter feel free to send me a DM and I’ll email the vouchers to you.
Good luck on those exams,