Recently, I chatted with a sys admin about his experience with VM migrations and management challenges. His primary goal right now is creating backup copies of his VMs as part of his disaster recovery plan. The biggest hurdle? Licensing. Or rather, the cost of licensing, because he wants to avoid the cost of treating each VM like a physical box.
Right now, he’s backing up the VMs that don’t need to be online 24/7 by shutting down his guest OSes, backing up the virtual hard disks using Backup Exec, and restarting. Costs are lower for backing up in this fashion, because it only takes one Backup Exec license to back up the files on the host.
But for his mission critical VMs, he’s stuck between a rock and a hard place. Take the VM offline, install the Backup Exec Client and pay the license fee for *each* VM (which could get pretty expensive pretty quickly), or… don’t back up.
He’s starting to research snapshots as a lower-cost solution for his mission-critical VMs, but when I last checked in, he hadn’t gotten too far in the process. Any suggestions? Leave a comment and I’ll pass your suggestions along.
Ok, this blog is normally about server virtualization, but I thought I might digress into a slightly-off-topic realm today, to bring some opinions on a product I’ve been using for some time, called Parallels Desktop. This is the program that gets the little text box at the bottom of the “Run Windows and Mac” commercials. You know the ones, with John Hodgman, the many-titled commentator/reporter from The Daily Show, and Justin Long, the guy from about a million small roles. I’ve been using it for some time, having the need to run such things as Visio and Access, as well as my custom MMC for all our Windows server management, and VirtualCenter to manage our VMware environment from my Mac. Parallels is one of the two ways to get Windows working on your Mac, and until the recent release of VMware’s Fusion beta, the only virtualized way, and I use it all the time. I don’t play the upgrade game with applications that I need on a daily basis, so I’ve been running a slightly older build for a while, at least until my curiosity about Coherence got the better of me.
So, I fire up my XP machine after the upgrade, and click the button for Coherence. What do I see – A Control-Alt-Delete box in the middle of my Mac. Hilarity. True hilarity. Function-Control-Alt-Delete and I’m in. Yuck. My first non-enjoyable part of Coherence… let’s hope that it’s my last. The Windows taskbar has just invaded my Mac desktop. Start button, quick launch, and system tray… my favorites (not!). Easy to dispel… just a simple click in the options box and they’re gone. Getting back to the Start button is easy – just a double-click on the Parallels icon and I have my menu. Some re-arrangement of what I have pinned there vs. what I used to keep in my quick launch tray, and I’m ready to roll.
Windows command prompt. Adobe Designer. IE7. All working. Everything is working. My shared folders between the virtual and the physical Mac mean I can move docs back and forth. What about drag and drop? Works like a charm. It even pops up a box that lets me choose what access I give Parallels. The whole kit and caboodle looks like this. You can see that the app creates icons in my dock, just as if they were Mac programs. You can see a couple of Mac-native apps running, like iChat, Grab, Thunderbird, and Firefox. I fired up my VMware Server management console to connect to the test lab, and sure enough, it worked great. It feels native. It’s seamless once I’m past that warning box. Its simplicity is brilliant.
What it means for the desktop is obvious. It means no more worrying about running legacy apps. Got a must-use app for Win9x? Need to run a Windows app on a Linux desktop, and WINE can’t run it? All possible… not with Parallels specifically, but the technology in general. It won’t be long before other products are out there that do these things, that take advantage of this huge conceptual leap in client-side application virtualization.
Imagine a Linux client, talking to a Citrix server that is hosting Mac and Windows apps, and sharing them via http. Wow.
Off-topic a bit? Yes. But keep this kind of converged (Coherent?) virtualization approach in mind as the line between operating systems continues to blur in the server market. Will we one day see one server serving applications to end-users, customers, etc. from a virtualized environment similar to Coherence? I wonder what this means for streamed applications, like those pushed out via Citrix? Will Citrix take advantage of this kind of technology in its own app virtualization products? One can only hope. What will this do for sandboxed applications? It’s a bit off in the future, but expect this sort of innovation to make its way upstream in any number of ways.
It’s great to see VMware finally embrace Paravirtualization. As a result of a tremendous community effort to develop a common interface between Xen and VMware, that benefited from the collaboration of VMware, IBM, Red Hat, Novell, XenSource, Intel and many kernel.org contributors, the first common API between the two hypervisors will appear in Linux kernel 2.6.20. It’s a pity VMware’s PR didn’t acknowledge the community contribution though…
Here’s what’s going on: The issue at hand is Paravirtualization – a technique of modifying an operating system so it will run optimally on a hypervisor. The best known example of a paravirtualizing hypervisor is the Xen hypervisor, but the concept originates from IBM mainframe OSes from the late ’70s and has been widely used on vertically integrated (hardware plus software) systems for some time. Paravirtualization was first introduced to the x86 architecture by the Xen project, in place of the binary patching technique used by VMware, because (as VMware recently acknowledged) it offers significantly improved performance for virtualized guests.
But Paravirtualization has a bit of a downside – it requires modifications to the guest operating system to enable it to co-operate with the hypervisor. VMware gets around this today using binary patching, which modifies the guest “on the fly” by rewriting the code. Intel VT and AMD-V help a lot, but not with I/O. The performance benefits of paravirtualization have led all x86 OS vendors to adopt paravirtualization for their next major OS release (though Microsoft calls it “enlightenment”). Xen-style paravirtualization also allows OS vendors to ship the hypervisor with the OS – something VMware understandably isn’t that keen on. In the case of Linux and Solaris this is achieved through the inclusion of the Xen hypervisor, and in the case of Microsoft the forthcoming enlightened Longhorn Server OS will be augmented at some point with the Windows Hypervisor, which is architecturally very similar to Xen.
Today many distros are delivering Xen as an embedded hypervisor. The Xen hypercall API paravirtualization hooks are added by the vendor to their kernel once they select a particular version from kernel.org. This is tedious/painful, and the obvious right way to do this is to have the hooks included and maintained by kernel.org. The Xen project was happily working away (albeit rather slowly) to get the Xen hypercall API upstreamed to kernel.org, when VMware introduced VMI at OLS in 2005. VMI is a lower level interface than the Xen hypercall API. It’s much more suitable to a binary re-writing hypervisor like VMware’s. But it deserved serious consideration because it offered a useful new feature – the same kernel could run native and virtualized. But VMI is closed source – an ABI not an API, which is a serious problem for many in the open source community. Everyone agreed that having a single interface for multiple hypervisors would be preferable to having many. So, at the Ottawa Linux Symposium in 2006 the Xen project began to work with VMware to develop a common set of kernel hooks that could accommodate the VMI ABI and the open source Xen hypercall API. Since then, there has been a very positive effort on all sides, with IBM, HP, Red Hat, Novell, and many other core kernel.org developers playing a key role in getting the work done.
So, what’s in 2.6.20 is a common API called paravirt_ops, developed collaboratively by a group of contributors, and the first implementation of the VMware VMI interface into paravirt_ops comes in 2.6.21. The Xen interface into paravirt_ops should follow shortly, likely in the 2.6.22 time frame. The Xen API is more extensive than VMI, and the work is taking a bit longer to get done. Once this set of changes is complete, future Linux kernels will have the paravirtualization hooks built in, which will dramatically simplify the kernel development processes of the distros.
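To make the idea of a single set of hooks with several hypervisor back ends a little more concrete, here is a minimal conceptual sketch in Python. It is not the kernel's code: the real interface lives in C inside the kernel, and every name below (`HypervisorOps`, `set_page_table_entry`, `disable_interrupts`, `select_ops`, `kernel_map_page`) is hypothetical and chosen only for illustration. The strings the back ends return are placeholders, not descriptions of what Xen or VMI actually do.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass(frozen=True)
class HypervisorOps:
    """Hypothetical ops table: one callable per privileged operation."""
    name: str
    set_page_table_entry: Callable[[int, int], str]
    disable_interrupts: Callable[[], str]


def make_backend(name: str) -> HypervisorOps:
    """Build a placeholder back end that only reports what it was asked to do."""
    return HypervisorOps(
        name=name,
        set_page_table_entry=lambda addr, val: f"{name}: set PTE {addr:#x} = {val:#x}",
        disable_interrupts=lambda: f"{name}: disable interrupts",
    )


# One table per environment; "native" is the bare-metal fallback.
BACKENDS: Dict[str, HypervisorOps] = {
    name: make_backend(name) for name in ("native", "xen", "vmi")
}


def select_ops(detected_platform: str) -> HypervisorOps:
    """Pick the ops table once at boot; unknown platforms run native."""
    return BACKENDS.get(detected_platform, BACKENDS["native"])


def kernel_map_page(ops: HypervisorOps, addr: int, val: int) -> List[str]:
    """The 'kernel' code is written once and only ever calls through the table."""
    return [ops.disable_interrupts(), ops.set_page_table_entry(addr, val)]


if __name__ == "__main__":
    for platform in ("native", "xen", "vmi"):
        print(kernel_map_page(select_ops(platform), 0x1000, 0x2A))
```

The point of the design is that `kernel_map_page` never changes: the same kernel can run native or virtualized, and supporting another hypervisor means supplying another table rather than patching the kernel again.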
The bottom line: Future Linux kernels will have a common hypervisor interface called paravirt_ops that will allow Linux to run on either Xen or VMware with high performance. Through XenSource’s relationship with Microsoft, it’s reasonable to expect that these Linux kernels will have the ability to run as first-class “enlightened” guests on the future Windows Hypervisor. Of course, all of this is only relevant to the market when the next major enterprise Linux distributions take new kernels to market that include paravirt_ops, but overall it is good to see harmony emerging in this particular piece of the virtualization landscape.
The road from physical to virtual servers isn’t a freeway…yet. There are some potholes that hold up P2V migrations. Here are a couple of views from those who’ve taken a trip with VMware Converter.
Language support issues on VMware Converter caused several P2V mishaps for Robert Sieber of SHD System-Haus-Dresden GmbH in Dresden, Germany. Responding to my blog entry on physical-to-virtual (P2V) migration mishaps, he wrote:
“Mainly all of our tries to migrate from physical to virtual or virtual to virtual failing at 97%. It looks like if there is an issue with the language of the underlying OS. We used German OSes for installing VMware converter and now we are trying to use only English ones. Since we switched to English success rate is somewhat better.
“I really hope that the people who developed ESX server are much better than the one who developed Converter and Capacity Planner.”
Blogger Scott Lowe had mostly good experiences with VMware Converter. Check out his trip through online and CD-boot experiments on his blog. Lowe liked VMware Converter’s network throughput of about 8-9GB per hour and its ability to import directly to VMFS on an ESX Server farm with no need to use a helper VM or vmkfstools. On the other hand, he had trouble logging in to VirtualCenter and had to connect to the back-end ESX server instead. Booting up seemed to take more time on VMware Converter than on VMware’s older tool, P2V Assistant; but, he says, “this is a very subjective assessment.”
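For a rough sense of what that throughput means when planning a migration window, here is a back-of-the-envelope sketch. It assumes the 8-9GB per hour figure holds steady for the whole transfer, which real networks rarely guarantee, and the 72GB server in the example is a made-up size, not one from Lowe's tests.

```python
def estimate_hours(disk_gb: float, low_gb_per_hr: float = 8.0,
                   high_gb_per_hr: float = 9.0) -> tuple:
    """Return (best_case_hours, worst_case_hours) for copying disk_gb of data."""
    best = disk_gb / high_gb_per_hr    # fastest quoted rate
    worst = disk_gb / low_gb_per_hr    # slowest quoted rate
    return best, worst


if __name__ == "__main__":
    best, worst = estimate_hours(72)   # hypothetical 72GB source server
    print(f"Plan for roughly {best:.1f} to {worst:.1f} hours")
```

Under those assumptions a 72GB server needs between 8 and 9 hours, so a handful of servers of that size already fills a weekend.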
What are your objective or subjective opinions about the state of P2V migration tools? Please comment here, or email me at email@example.com.
I just returned from the Novell BrainShare 2007 conference in Salt Lake City, and I have to say that I was very excited about the amount of attention that virtualization received at the conference. Here are some of the highlights:
- Novell and Microsoft partnership – both Microsoft and Novell representatives co-presented on both virtualization and directory service integration
- Plenty of talk on paravirtualized device drivers – with PV drivers, Microsoft Longhorn Server virtual machines will run at near native performance on Xen running on SLES 10 SP1. With the planned official support for Windows 2000/2003 PV drivers, Xen on SLES 10 SP1 is emerging as a serious choice for virtualization.
- Failover support for Xen on SLES 10
- Virtualized NetWare 6.5 support in Xen
- Cool management on the way – ZENworks Virtual Machine Management (beta coming soon) offers centralized management for VMware, Xen, and Microsoft virtualization engines
I have always been a big proponent of dynamic failover support when it comes to running virtual machines in production environments. With Heartbeat 2.0 integration, Xen VM failover support will be a part of SLES 10 SP1. I dug a little deeper into the Heartbeat integration, and currently failover will progress in the order of cluster node names. If a target node does not have the resources to support an additional VM, then the VM will fail over to the next node in the cluster (and repeat the process until it has found a suitable home). Novell engineers are working on better automation for failover, so a VM’s first failover target will be a physical host system that has the capacity to host the VM’s required resources. If you’re planning to build a two-node Xen failover cluster, then this is really no big deal. However, if you’re planning an eight-node cluster, you’ll definitely want tighter control of the failover process. Still, this has been a big year for Xen, and I would not be surprised if Novell’s Xen failover automation is rock solid by the end of the year.
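The difference between the two behaviors is easy to see in a small simulation. This is only a sketch of the logic as I understood it, not Heartbeat's or Novell's implementation: the `Node` class, the function names, and the use of free memory as the only resource are all assumptions, and "pick the node with the most free memory" is just one possible reading of a capacity-aware first target.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Node:
    name: str
    free_mem_mb: int   # stand-in for "resources"; real clusters track more


def attempts_in_name_order(nodes: List[Node], failed: str, vm_mem_mb: int) -> List[str]:
    """Current behavior: try surviving nodes by name until one has room.

    Returns every node tried, in order; the last entry is the one that fits,
    or the list covers all survivors if none of them can take the VM.
    """
    tried = []
    for node in sorted(nodes, key=lambda n: n.name):
        if node.name == failed:
            continue
        tried.append(node.name)
        if node.free_mem_mb >= vm_mem_mb:
            break
    return tried


def capacity_first_target(nodes: List[Node], failed: str, vm_mem_mb: int) -> Optional[str]:
    """Planned behavior (one interpretation): go straight to a node that fits."""
    fits = [n for n in nodes if n.name != failed and n.free_mem_mb >= vm_mem_mb]
    if not fits:
        return None
    return max(fits, key=lambda n: n.free_mem_mb).name


if __name__ == "__main__":
    cluster = [Node("node-a", 0), Node("node-b", 512),
               Node("node-c", 4096), Node("node-d", 8192)]
    print(attempts_in_name_order(cluster, "node-a", 2048))
    print(capacity_first_target(cluster, "node-a", 2048))
```

With two nodes there is only one survivor to try, so the ordering never matters. With eight nodes, the name-order walk can burn several failed attempts before the VM lands anywhere, which is exactly why the tighter control matters.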
On my Novell Xen wishlist…
- Migration tools – I would love to have a tool that automatically converts a physical NetWare 6.5 server into a virtual machine. If Novell will not offer a migration tool, I’m sure that a vendor such as PlateSpin would love to jump in and help.
- Improved failover (see above)
- Consolidated backup support – I would love to see an answer to VMware’s VCB. Give us a well-documented backup scripting API and integrating Xen backups into enterprise backup software backup jobs will be a piece of cake.
- Common management APIs/metadata – It would be much easier for all of us (admins, ISVs, etc.) if there were a single common management API set for all virtualization platforms. I’m hopeful that a common management API set will be produced as a result of the Microsoft/Novell partnership. However, getting all of the major virtualization vendors to agree on a common format would open plenty of new doors in terms of more robust backup methodologies, centralized management, and reporting.
I’m sure that time will tell whether or not my wishes are granted…
From the desk of totally unimportant and frivolous items (also known as my inbox) came this timely bit of news in the VMTN Technical Newsletter:
“The new VMTN front page gives a dynamic view into the activity on the site and in the VMware community. Keep up to date on the latest in VMTN News, Virtual Appliances, Technical Resources, Discussions, Knowledge Base, Compatibility Guides, Security Alerts, VMware Blogs, and Virtualization Blogs. The page is updated throughout the day.”
Ok, so it wasn’t technical. It was informative though, and I do like the new layout. I always had a problem with how difficult to navigate the old VMTN site was, how it was hard to go from one place to another without crossing through a third place that I didn’t really care about. Me, I like the forums and the virtual appliances, but I’m also getting to like the community-centered this-hardware-works-on-VI3 section.
And tonight on Friday Night Company-Fights:
The Undercard: AMD vs. Intel, Windows vs. Linux, and Pepsi vs. Coke.
The Main Event: XenSource vs. VMware.
Ok, here’s my beef. I hate all industry wherein marketing rules over substance, and in this case, I’m calling out both XenSource and VMware for being pig-headed and small-minded.
XenSource – you posted test results of a BETA. A product that is, by definition, not ready for prime time. A product that still needs work. That ain’t done. That’s still raw in the center. Can I say it any other way? My constructive criticism is this – wait until you have posted the mature version that is available in its production form and then do the proper benchmarking. Don’t get me wrong here, Xen is a great product, but reacting to VMware’s get-your-goat inflammatory benchmarking is ridiculous. All XenSource looks like now is another marketing-driven company that is more interested in fighting perceived “Cola Wars” than in putting out a class-A product. Benching a beta just looks cheesy, and worse, sneaky.
VMware – Those were dirty benchmarks and you know it. You didn’t create a proper test between proper versions, under neutral conditions. And your EULA… only when it became obvious that the problem was public did you give XenSource permission to test your product. You need to drop that contingency against publishing benchmarks. It’s sneaky and cheesy too. Yes, you’re not the only ones to do it, but that doesn’t make it right. While you’re at it, why not post meaningful benchmarks instead of trying to raise the heat on Xen? This can only help them, your competition, to get more publicity. And now that it’s out that the benchmarks weren’t fair, it makes VMware look bad.
I adore VMware. I think Xen is great too. I think in this case both of these companies stand to lose credibility, not gain market share.
Recently posted to the VMware web site is this guide to configuring your SAN for maximum virtualization efficiency (wow, that almost sounded like marketspeak… help me Obi-Wan help me!). It’s an excellent resource on both VMware architecture and SANs in general, containing a copious section on what makes a SAN a SAN. For anyone who doesn’t know, it’s a Storage Area Network – a way to take a big honkin’ system (or systems) with lots of disks and share them to your servers, which will think they are the same as physically attached disks. The guide goes on to discuss different kinds of SANs and how to configure them to work best with VMware’s various utilities. Failover is also discussed, both from the SAN and the VMware side, as are some aspects of optimization for performance. There is also the obligatory mention of NAS support (NFS 3 only) in VI3, a first for the ESX product line (VMware used to support it in earlier pre-ESX products, the descendants of GSX/Server).
Most of the reason that VMware published this document can be summed up by this quote from page 130:
“Many of the support requests that VMware receives concern performance optimization for specific applications. VMware has found that a majority of the performance problems are self-inflicted, with problems caused by misconfiguration or less-than-optimal configuration settings for the particular mix of virtual machines, post processors, and applications deployed in the environment.”
I have to admit, that had me laughing. It was the whole “blame the user” mentality that I found funny – I’m glad VMware put the paper out there, but really, they had to expect that the 80/20 rule of troubleshooting would apply to them too – 80% of all problems are human error. The guide does a good job of helping avoid those pitfalls, and goes into detail on setting up your SAN to perform well.
After perusing this document a bit, I’m going to stick with my anti-fibre-channel stance by saying that it’s just not worth the trouble to deploy new FC SANs for a VMware deployment. I’d stick with an iSCSI SAN or NFS NAS if you want the full benefit of shared storage and don’t already own FC SAN gear. Now I have to admit that I’m biased here… I managed a SAN environment at one point in my career, and I hate Fibre Channel SANs with a passion that rivals how the Red Sox fans and Yankees fans feel about one another (except I don’t think EMC SANs hate me… at least not like human hate anyway, and if they did, I’d have to consider checking into an alternative cognitive function facility, aka the nuthouse).
Another reason I stand against rolling out new FC SANs for VMware is this article by SSV’s News Director Alex Barrett, in which EMC VP Charles Hollis calls for NAS as the best choice for VMware environments. I tend to agree, provided that a number of recommendations, also in the VMware SAN guide, are followed. First among these – forget sharing the storage network with anything else other than VMware. In fact, put it on a completely different set of equipment if you can, just to avoid any processor overhead that VLANing with the same network hardware may incur. It’s gotta be gig, too. That’s in the basic VMware VI3 docs, and repeated in the SAN guide.
The optimization hints consist of a mix of technical and non-technical advice, some of which would generally be overlooked by a SAN admin, and some of which would be overlooked by a VMware admin, such as:
“Choose vmxlsilogic instead of vmxbuslogic as the LSI Logic SCSI driver. The LSI Logic driver provides more SAN-friendly operation and handles SCSI errors better. “
“No more than 16 virtual machines or virtual disks should share the same physical volume.”
“Enable write-caching (at the HBA level or at the disk array controller level)”
There are also equally obvious dummy-errors that are mentioned, things that must happen in real life, but for the life of me seem so stupid that only people who WANT to be fired would do them. My favorite:
“Optimize disk rotational latency for best performance by selecting the highest performance disk drive types, such as FC, and use the same disk drive types within the same RAID set. Do not create RAID sets of mixed disk drives or drives with different spindle speeds.”
This is saying the following – Don’t mix 72GB 10k rpm drives within the same RAID array as 72GB 15k rpm drives. And don’t put a 72GB drive in with 144GB drives. And for pete’s sake, if your SAN supports mixed drive types, don’t ever, ever, EVER mix SAS drives and FC drives. Duh.
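Rules like these are simple enough to check with a script before anything gets carved into a RAID set. The sketch below is my own illustration, not something from the guide: the `Drive` class and both function names are hypothetical, the 16-per-volume limit is the figure quoted above, and the capacity check reflects my reading of the advice rather than the guide's literal wording.

```python
from dataclasses import dataclass
from typing import List

MAX_VMS_PER_VOLUME = 16   # limit quoted from the guide


@dataclass(frozen=True)
class Drive:
    interface: str   # e.g. "FC" or "SAS"
    size_gb: int
    rpm: int


def raid_set_warnings(drives: List[Drive]) -> List[str]:
    """Flag RAID sets that mix drive types, spindle speeds, or capacities."""
    warnings = []
    if len({d.interface for d in drives}) > 1:
        warnings.append("mixed drive interfaces")
    if len({d.rpm for d in drives}) > 1:
        warnings.append("mixed spindle speeds")
    if len({d.size_gb for d in drives}) > 1:
        warnings.append("mixed drive capacities")
    return warnings


def volume_warnings(vm_count: int) -> List[str]:
    """Flag volumes shared by more VMs (or virtual disks) than the guide advises."""
    if vm_count > MAX_VMS_PER_VOLUME:
        return [f"{vm_count} VMs on one volume exceeds {MAX_VMS_PER_VOLUME}"]
    return []


if __name__ == "__main__":
    bad_set = [Drive("FC", 72, 10000), Drive("FC", 72, 15000), Drive("SAS", 144, 15000)]
    print(raid_set_warnings(bad_set))
    print(volume_warnings(20))
```

A matched set of drives and a volume with 16 or fewer VMs both come back with an empty list, which is the only result you want to see before going to production.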
As for what this document is not – it is NOT a howto guide to configure VMware’s many applications in a SAN environment, beyond the direct purview of shared storage. There’s no guide to setting up VMware HA/DRS, though there are several pages dedicated to the storage aspects of these products, including how multipathing your HBAs can affect HA and DRS. That’s left for more product-specific papers, presumably because there’s no reason to be redundant.
Overall, the paper gets 8 pokers.
In this blog entry, I passed on system administrators’ complaints about the difficulty in tracking virtual machines in their large companies. The fact that this is a problem surprised me and also surprises others.
For example, on his blog, Tarry Singh asks:
“What kind of a manager are you anyways to not have a track of the machines (Virtual or Physical) in your environment?”
You’re not an unusual manager, it seems. Tarry was responding to an article in The Register titled “How many VMs are on your LAN — and how sure are you?”
Tarry thinks the Register story is a sales pitch to sell yet another piece of auditing software. I don’t agree. I think that virtual machines are so easy to deploy that IT-savvy employees are creating VMs for their departments.
Somewhat in jest, blogger Dirk Elmendorf wrote that with virtualization:
“Now I can set up my own pet network independent from the watchful eye of IT.”
Kenny Scott responds to that blog, saying:
“Clear policies in the workplace are all that is needed to combat workers installing new Windows boxes on virtual instances, because it’s not any harder to install Windows on a virtual instance than it is on an old desktop that you want to use to do a bit of testing.”
Way back in 2006, Gartner analyst Tom Bittman told us that tracking VMs would be a big problem. In that article, he said:
“It’s a different beast with physical servers. Although server sprawl is always hard, at least you can point to a physical server and know it is there. With a VM, it is a lot easier for it to get lost.”
So, it seems surprising that IT managers didn’t anticipate this problem, but, obviously, some capitulated to the demand for VMs and deployed first without planning. I’m sure that a bunch of IT managers didn’t fall into this trap. However, I bet quite a few are grappling with rogue VMs in their organizations.
I’d like to hear from managers who have a solid VM-tracking plan. How did you do it? Got any VM-detecting tips up your sleeve? I’d also be interested in hearing from those who are having problems. Please share your experiences in a comment to this post or via my email, firstname.lastname@example.org.
The editors of SearchDataCenter.com, SearchEnterpriseLinux.com and SearchServerVirtualization.com would like to know more about your server and virtualization decisions in order to better serve your needs. Take this brief survey to help us understand your plans for choosing and implementing various types of servers and how you’ll use virtualization.
Take our survey now and let us know, and you’ll also be entered to win a $200 Amazon.com gift certificate! Simply include your email address at the end, and we will select a winner from among the pool of respondents.
With your help, we’ll find out more about your server decisions, so we can provide information on our sites and at our events that better fits your needs. Thanks in advance for taking the time to help us.