I just returned from the Novell BrainShare 2007 conference in Salt Lake City, and I have to say that I was very excited about the amount of attention that virtualization received at the conference. Here are some of the highlights:
- Novell and Microsoft partnership – both Microsoft and Novell representatives co-presented on both virtualization and directory service integration
- Plenty of talk on paravirtualized device drivers – with PV drivers, Microsoft Longhorn Server virtual machines will run at near-native performance on Xen running on SLES 10 SP1. With the planned official support for Windows 2000/2003 PV drivers, Xen on SLES 10 SP1 is emerging as a serious choice for virtualization.
- Failover support for Xen on SLES 10
- Virtualized NetWare 6.5 support in Xen
- Cool management on the way – ZENworks Virtual Machine Management (beta coming soon) offers centralized management for VMware, Xen, and Microsoft virtualization engines
I have always been a big proponent of dynamic failover support when it comes to running virtual machines in production environments. With Heartbeat 2.0 integration, Xen VM failover support will be a part of SLES 10 SP1. I dug a little deeper into the Heartbeat integration: currently, failover proceeds in the order of cluster node names. If a target node does not have the resources to support an additional VM, the VM fails over to the next node in the cluster (repeating the process until it finds a suitable home). Novell engineers are working on better automation for failover, so that a VM’s first failover target will be a physical host system that actually has the capacity to host the VM’s required resources. If you’re planning to build a two-node Xen failover cluster, this is really no big deal. However, if you’re planning an eight-node cluster, you’ll definitely want tighter control of the failover process. Still, this has been a big year for Xen, and I would not be surprised if Novell’s Xen failover automation is rock solid by the end of the year.
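To make the current behavior concrete, here’s a tiny Python sketch of the node-selection logic as described above. This is my own illustration, not Novell’s actual Heartbeat code, and "capacity" is reduced to free memory for simplicity:

```python
def pick_failover_target(nodes, vm_memory_mb, free_memory_mb):
    """Walk cluster nodes in name order (today's Heartbeat behavior)
    and return the first node with enough free memory to host the VM.
    Returns None if no node in the cluster can take it."""
    for node in sorted(nodes):
        if free_memory_mb.get(node, 0) >= vm_memory_mb:
            return node
    return None

# A 1 GB VM skips the full node and lands on the next one in name order.
target = pick_failover_target(
    ["xen2", "xen1"], 1024, {"xen1": 512, "xen2": 4096})
```

With the smarter automation Novell is working on, the sort key would effectively become available capacity rather than node name.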
On my Novell Xen wishlist…
- Migration tools – I would love to have a tool that automatically converts a physical NetWare 6.5 server into a virtual machine. If Novell will not offer a migration tool, I’m sure that a vendor such as PlateSpin would love to jump in and help.
- Improved failover (see above)
- Consolidated backup support – I would love to see an answer to VMware’s VCB. Give us a well-documented backup scripting API, and integrating Xen backups into enterprise backup jobs will be a piece of cake.
- Common management APIs/metadata – It would be much easier for all of us (admins, ISVs, etc.) if there were a single common management API set for all virtualization platforms. I’m hopeful that one will be produced as a result of the Microsoft/Novell partnership. Getting all of the major virtualization vendors to agree on a common format would open plenty of new doors in terms of more robust backup methodologies, centralized management, and reporting.
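On the backup-scripting wish above: in the absence of any Novell-supplied API, a scripted Xen backup today usually amounts to pause, snapshot, resume, archive. A hypothetical Python sketch of the steps such a script might assemble for an LVM-backed guest (the VM name, volume group, and snapshot naming here are all made up for illustration):

```python
def backup_steps(vm_name, disk_lv, dest_file):
    """Assemble the commands a Xen backup job might run against an
    LVM-backed guest: pause the VM, snapshot its disk, resume it,
    then archive the snapshot. Purely illustrative -- no such
    Novell-supplied backup API exists today, and the /dev/vg0
    volume group path is an assumption."""
    snap_path = "/dev/vg0/" + vm_name + "-backsnap"
    return [
        ["xm", "pause", vm_name],
        ["lvcreate", "--snapshot", "--size", "1G",
         "--name", vm_name + "-backsnap", disk_lv],
        ["xm", "unpause", vm_name],
        ["dd", "if=" + snap_path, "of=" + dest_file],
        ["lvremove", "-f", snap_path],
    ]
```

A real API would hide exactly this plumbing behind documented calls, which is the point: give backup vendors something stable to integrate against.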
I’m sure that time will tell whether or not my wishes are granted…
From the desk of totally unimportant and frivolous items (also known as my inbox) came this timely bit of news in the VMTN Technical Newsletter:
“The new VMTN front page gives a dynamic view into the activity on the site and in the VMware community. Keep up to date on the latest in VMTN News, Virtual Appliances, Technical Resources, Discussions, Knowledge Base, Compatibility Guides, Security Alerts, VMware Blogs, and Virtualization Blogs. The page is updated throughout the day.”
Ok, so it wasn’t technical. It was informative, though, and I do like the new layout. I always had a problem with how difficult the old VMTN site was to navigate: it was hard to go from one place to another without crossing through a third place that I didn’t really care about. Me, I like the forums and the virtual appliances, but I’m also getting to like the community-centered this-hardware-works-on-VI3 section.
And tonight on Friday Night Company-Fights:
The Undercard: AMD vs. Intel, Windows vs. Linux, and Pepsi vs. Coke.
The Main Event: XenSource vs. VMware.
Ok, here’s my beef. I hate any industry wherein marketing rules over substance, and in this case, I’m calling out both XenSource and VMware for being pig-headed and small-minded.
XenSource – you posted test results of a BETA. A product that is, by definition, not ready for prime time. A product that still needs work. That ain’t done. That’s still raw in the center. Can I say it any other way? My constructive criticism is this – wait until the mature version is available in its production form and then do the proper benchmarking. Don’t get me wrong here, Xen is a great product, but reacting to VMware’s get-your-goat inflammatory benchmarking is ridiculous. All XenSource looks like now is another marketing-driven company that is more interested in fighting perceived “Cola Wars” than in putting out a class-A product. Benching a beta just looks cheesy, and worse, sneaky.
VMware – Those were dirty benchmarks and you know it. You didn’t create a proper test between comparable versions, under neutral conditions. And your EULA… only when it became obvious that the problem was public did you give XenSource permission to test your product. You need to drop that restriction on publishing benchmarks. It’s sneaky and cheesy too. Yes, you’re not the only ones to do it, but that doesn’t make it right. While you’re at it, why not post meaningful benchmarks instead of trying to raise the heat on Xen? That can only help your competition get more publicity. And now that it’s out that the benchmarks weren’t fair, it makes VMware look bad.
I adore VMware. I think Xen is great too. I think in this case both of these companies stand to lose credibility, not gain market share.
Recently posted to the VMware web site is this guide to configuring your SAN for maximum virtualization efficiency (wow, that almost sounded like marketspeak… help me Obi-Wan, help me!). It’s an excellent resource on both VMware architecture and SANs in general, containing a copious section on what makes a SAN a SAN. For anyone who doesn’t know, it’s a Storage Area Network – a way to take a big honkin’ system (or systems) with lots of disks and share them to your servers, which will think they are the same as physically attached disks. The guide goes on to discuss different kinds of SANs and how to configure them to work best with VMware’s various utilities. Failover is also discussed, both from the SAN and the VMware side, as are some aspects of performance optimization. There is also the obligatory mention of NAS support (NFS 3 only) in VI3, a first for the ESX product line (VMware used to support it in earlier pre-ESX products, the descendants of GSX/Server).
Most of the reason that VMware published this document can be summed up by this quote from page 130:
“Many of the support requests that VMware receives concern performance optimization for specific applications. VMware has found that a majority of the performance problems are self-inflicted, with problems caused by misconfiguration or less-than-optimal configuration settings for the particular mix of virtual machines, host processors, and applications deployed in the environment.”
I have to admit, that had me laughing. It was the whole “blame the user” mentality that I found funny – I’m glad VMware put the paper out there, but really, they had to expect that the 80/20 rule of troubleshooting would apply to them too – 80% of all problems are human error. The guide does a good job of helping avoid those pitfalls, and goes into detail on setting up your SAN to perform well.
After perusing this document a bit, I’m going to stick with my anti-Fibre-Channel stance by saying that it’s just not worth the trouble to deploy new FC SANs for a VMware deployment. I’d stick with an iSCSI SAN or NFS NAS if you want the full benefit of shared storage and don’t already own FC SAN gear. Now, I have to admit that I’m biased here… I managed a SAN environment at one point in my career, and I hate Fibre Channel SANs with a passion that rivals how Red Sox fans and Yankees fans feel about one another (except I don’t think EMC SANs hate me… at least not like human hate anyway, and if they did, I’d have to consider checking into an alternative cognitive function facility, aka the nuthouse).
Another reason I stand against rolling out new FC SANs for VMware is this article by SSV’s News Director Alex Barrett, in which EMC VP Charles Hollis calls NAS the best choice for VMware environments. I tend to agree, provided that a number of recommendations, also in the VMware SAN guide, are followed. First among these: don’t share the storage network with anything other than VMware. In fact, put it on a completely separate set of equipment if you can, just to avoid any processor overhead that VLANing on the same network hardware may incur. It’s gotta be gig, too. That’s in the basic VMware VI3 docs, and repeated in the SAN guide.
The optimization hints consist of a mix of technical and non-technical advice, some of which would generally be overlooked by a SAN admin, and some of which would be overlooked by a VMware admin, such as:
“Choose vmxlsilogic instead of vmxbuslogic as the LSI Logic SCSI driver. The LSI Logic driver provides more SAN-friendly operation and handles SCSI errors better. “
“No more than 16 virtual machines or virtual disks should share the same physical volume.”
“Enable write-caching (at the HBA level or at the disk array controller level)”
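For context on that first hint: the SCSI controller type is a per-VM setting in the guest’s .vmx file. A minimal sketch of the relevant keys (standard VMX syntax as I recall it, so verify against your ESX version’s docs):

```
scsi0.present = "TRUE"
scsi0.virtualDev = "lsilogic"
```

The alternative value is "buslogic", which is what the guide is steering you away from for SAN-backed VMs.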
There are also equally obvious dummy-errors mentioned, things that must happen in real life but that seem, for the life of me, so stupid that only people who WANT to be fired would do them. My favorite:
“Optimize disk rotational latency for best performance by selecting the highest performance disk drive types, such as FC, and use the same disk drive types within the same RAID set. Do not create RAID sets of mixed disk drives or drives with different spindle speeds.”
This is saying the following – don’t mix 72GB 10K RPM drives in the same RAID array with 72GB 15K RPM drives. And don’t put a 72GB drive in with 144GB drives. And for Pete’s sake, if your SAN supports mixed drive types, don’t ever, ever, EVER mix SAS drives and FC drives. Duh.
As for what this document is not – it is NOT a how-to guide for configuring VMware’s many applications in a SAN environment beyond the direct purview of shared storage. There’s no guide to setting up VMware HA/DRS, though there are several pages dedicated to the storage aspects of those products, including how multipathing your HBAs can affect HA and DRS. That’s left for more product-specific papers, presumably because there’s no reason to be redundant.
Overall, the paper gets 8 pokers.
In this blog entry, I passed on system administrators’ complaints about the difficulty in tracking virtual machines in their large companies. The fact that this is a problem surprised me and also surprises others.
For example, on his blog, Tarry Singh asks:
“What kind of a manager are you anyways to not have a track of the machines (Virtual or Physical) in your environment?”
You’re not an unusual manager, it seems. Tarry was responding to an article in The Register titled “How many VMs are on your LAN — and how sure are you?”
Tarry thinks the Register story is a sales pitch for yet another piece of auditing software. I don’t agree. I think that virtual machines are so easy to deploy that IT-savvy employees are creating VMs for their departments.
Somewhat in jest, blogger Dirk Elmendorf wrote that with virtualization:
“Now I can set up my own pet network independent from the watchful eye of IT.”
Kenny Scott responds to that blog, saying:
“Clear policies in the workplace are all that is needed to combat workers installing new Windows boxes on virtual instances, because it’s not any harder to install Windows on a virtual instance than it is on an old desktop that you want to use to do a bit of testing.”
Way back in 2006, Gartner analyst Tom Bittman told us that tracking VMs would be a big problem. In that article, he said:
“It’s a different beast with physical servers. Although server sprawl is always hard, at least you can point to a physical server and know it is there. With a VM, it is a lot easier for it to get lost.”
So, it seems surprising that IT managers didn’t anticipate this problem, but, obviously, some capitulated to the demand for VMs and deployed first without planning. I’m sure that a bunch of IT managers didn’t fall into this trap. However, I bet quite a few are grappling with rogue VMs in their organizations.
I’d like to hear from managers who have a solid VM-tracking plan. How did you do it? Got any VM-detecting tips up your sleeve? I’d also be interested in hearing from those who are having problems. Please share your experiences in a comment to this post or via my email, email@example.com.
The editors of SearchDataCenter.com, SearchEnterpriseLinux.com and SearchServerVirtualization.com would like to know more about your server and virtualization decisions in order to better serve your needs. Take this brief survey to help us understand your plans for choosing and implementing various types of servers and how you’ll use virtualization.
Take our survey now and let us know, and you’ll also be entered to win a $200 Amazon.com gift certificate! Simply include your email address at the end, and we will select a winner from among the pool of respondents.
With your help, we’ll find out more about your server decisions, so we can provide information on our sites and at our events that better fits your needs. Thanks in advance for taking the time to help us.
Unless you live under a rock, by now you’ve probably heard about VI3. But have you seen it in action? This “short” (ha) 20-minute video I found on YouTube shows you exactly what it does.
It features a VMware “guru” and a virtualization “newbie” who asks every possible question you could think of. It’s actually a pretty decent video. Check it out here: VMware Infrastructure 3 demo. (I was going to try to embed the video, but this blog won’t let me… yet.)
While we’re on the subject of videos, I found another good VI3 video, this time about upgrading. Why should you upgrade? Find out here: VMware Virtual Infrastructure 3 Upgrading. The speaker is a little dry, though.
One of the overarching questions I’ve had since I started covering virtualization is how it will influence the kinds of server purchases IT managers make. Is it better to buy several small, slim servers, e.g., blades, or a single large and beefy one? Now we know. Virtualization is prompting IT managers to buy fewer, larger boxes, richly configured with multi-core chips and oodles of RAM. So much so that yesterday, the venerable market research firm IDC did something it seemingly never does: changed its server sales forecast in a downward direction. An article on ZDNet states:
IDC on Tuesday lopped 4.5 million units off its forecast for the number of x86 servers to ship in the second half of the decade after concluding that virtualization and multicore processors are cutting into purchases.
That 4.5 million number is a major change–about 10 percent of the servers the market analysis firm had expected would be sold from 2006 to 2010. In addition, the firm trimmed its spending forecast by $2.4 billion.
But at the same time, I’ve had countless conversations with executives from the first- and second-tier server vendors who insist that virtualization remains a key area of focus for them – that sure, what they lose in quantity of servers sold, they’ll make up for in quality of servers sold, blah blah blah. Now, I’m no MBA, but ten bucks says that the IBMs, Suns, Dells and HPs of the world are going to find ways to offset their losses. Need lots of memory? Great – but don’t expect any huge price reductions on 4GB DIMMs. Need more I/O? Don’t just add another Gigabit NIC, why not upgrade the whole kit and caboodle to 10Gig Ethernet?! You get my drift…
I’m a big fan of free… free as in beer and free as in speech. Sometimes that even means free as in ad-supported. NOT adware-supported, mind you, but ad-supported free software runs second in my book to truly free open-source software. Anyone remember PointCast? Yeah, it was a bandwidth hog in an analog age, but I LOVED it. Knowing that, you can imagine how many of the F/OSS systems management products I’ve tried. The answer is: enough to speak on the subject at Data Center Decisions, if not speak well (hey, it was my first time on that end of the podium… but that’s another story). To get back on track, I even liked a lot of them, too: Nagios, Zenoss, Hyperic, and even GroundWork’s new version (though Andrew Kutz knows from our talks at Data Center Decisions how much I loathed their previous version, I’ve since changed my tune) made my short list on the OSS side.
What I was looking for:
- The ability to scan the network at set intervals, creating and maintaining a detailed scan-based inventory.
- WMI-based tools to get detailed software, services, and other information from desktops and Windows servers.
- An SMTP- and/or SMS-aware alerting system that would email and/or text my phone when the poo hit the fan.
- Rudimentary ticketing, so that when one of those alerts comes in, I have a system to manage it by.
- The ability to monitor VMware virtual machines, and manage sprawl.
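As a rough illustration of that first bullet, the scan-based inventory piece boils down to a periodic TCP sweep of the network. A minimal Python sketch (my own, not Spiceworks code; the host and port lists you feed it are up to you):

```python
import socket

def probe(host, port, timeout=0.5):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

def sweep(hosts, ports):
    """Return {host: [open ports]} for every host that answered.
    Run this on a schedule and diff the results to maintain a
    scan-based inventory -- new entries are new (possibly rogue) boxes."""
    inventory = {}
    for host in hosts:
        open_ports = [p for p in ports if probe(host, p)]
        if open_ports:
            inventory[host] = open_ports
    return inventory
```

A real product layers WMI queries, OS fingerprinting, and change tracking on top of a sweep like this, but the diff-the-scans idea is the core of sprawl detection.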
Eventually I settled on Spiceworks. It’s free, but not open source. There’s no Linux version, which would normally kill me because I don’t like paying Microsoft for an operating system when I’m trying to use something free, but the resource usage is low enough that after testing I put it on an existing file server. The other candidates covered most of this ground too, but the simplest to set up and use was Spiceworks. It was quick, simple, and does everything I want. The help desk system in 1.5 is simple enough that I may migrate from our current software over to it. The jury’s still out on that, since the help desk Web portal piece trusts user input about identity (you just type in your email address) rather than authenticating, and I’m not sure about the HIPAA implications. It’s not a medical system, but it could be misused to put in fake tickets about medical systems, etc. Anyway, I looked over what the ad-supported system sends out and what the privacy policy is, and decided it was worth using, since it doesn’t compromise any private data and the ads are unobtrusive. Ok, long story short… it does a nice job identifying hardware, including virtual machines. Some screen captures follow:
This is a virtual machine sitting on VMware Server 1.0.2. I use VMware Server on desktops for some of our legacy apps that need (gasp) Win98, so keeping tabs on who’s making more VMs and sucking up resources (not to mention adding to sprawl) is key. People like to play, and it’s not always as easy to lock them down as you would like. VMware-based hardware shows up like real hardware if you click the configuration tab (I won’t post the image here, at least until I edit out some serial numbers and other proprietary stuff), and the details go much further into the machine’s info. It also manages Linux boxes (granted, without WMI, not as much info is gleaned, but there’s still lots of useful data).
Here’s the really useful part – regular scans, plus the ability to pick up virtual machines as if they were any other machine. In other words: the ability to control virtual machine sprawl and manage documentation for VMs.
Next up, once I’m done playing with Virtual Iron and have some nice Xen VMs, is to try Spiceworks out and see how it detects and documents Xen-based virtual machines. Should be a nice synergy of tests.
“Whoever comes up with a solid virtual machine documentation process will make a killing this year.”
Those words were spoken — off-the-record — by a senior systems administrator for a major utility company. I ran into him at the Red Hat RHEL5 release party this week. The subject of virtualization was in the air, and — spontaneously in casual conversations and without any prompting from me — four separate sys admins (who asked to remain anonymous) complained about their virtual machine documentation problems.
A seasoned admin – a mainframe expert – with a major financial institution said that many VMs were being deployed in her company’s data centers and departments and no workable tracking mechanisms were in place. A sys admin for a telecommunications company said it took a team of three people, doing nothing else, two weeks to track down all the VMs.