I recently floated the idea of implementing a mix of virtualization products in your data center as a way to better customize your virtual environments. The query was part of a larger discussion I wanted to get going about how VMware will compete when Hyper-V is generally released. I threw out the notion that data center managers might use, for example, Hyper-V for end-user file servers; VMware ESX for apps that require dynamic load balancing, sophisticated disaster recovery and migration; and Xen for commodity Linux boxes.
The idea behind that supposition was to match your enterprise investments to appropriate workloads because, let’s face it, running everything on ESX is going to be expensive compared to other options. So what if you don’t get ESX-level features everywhere? You may still get enterprise-level features from silver-medal products.
Now, I wish I could take credit for that idea (for better or for worse), but that really came from SearchServerVirtualization.com’s editor, Jan Stafford. At any rate, a few people had comments about that note and I thought it would be fitting to continue that conversation with their input:
One of the reasons I think Hyper-V is interesting is the server core concept. The ability to create appliance virtual machines for specific roles such as DNS, DHCP, DC etc. means you can drastically reduce the manageability overhead and the attack surface for those servers. Weird – I’m finding myself suggesting that an all-MS platform may actually be more secure than the alternatives!
On the other hand, you may be correct that fully-fledged servers with complex HA needs will be better off on ESX.
There are flaws in this idea, the most basic of which is whether ESX, Hyper-V and Xen-based virtualization products can even be compared, never mind the disparities among Xen-based virtualization vendors (xVM isn’t Red Hat isn’t Citrix Xen, etc.). This was the point that some folks took issue with:
Just a comment about an “apples and oranges” comparison you made:
I understand the point you were trying to make about using the right virtualization product for the right job, and at times this might mean using multiple solutions, but you can’t really compare Hyper-V or VMware to Xen. Both Virtual Iron and Citrix offer a virtualization solution based on Xen technology, but neither sells just Xen. Your reference to Xen brings to mind what a Suse or Red Hat shop might do with Xen technology, but not those that would consider Hyper-V or VMware as a virtualization solution.
So, there are still many problems that have to be addressed before we even ask the question of implementing a mixed enterprise virtualization infrastructure. Ultimately, people will have to decide on what functionality, support and management interfaces are important. Is quick migration support good enough, or do you need live migration? How does each platform approach P2V conversions? Does VirtualCenter have the right kind of management options for your installation? Will you be better off using System Center Virtual Machine Manager?
Let’s keep prodding at this idea and brainstorm ways that using multiple virtualization products in one environment could work. Send us your comments and feedback. If you’ve tried the mixed virtualization environment, we’d love to hear about it.
Like many administrators, I have been quite happy with the performance of my ESX environment. However, we recently made an observation that helped us avoid a potentially disastrous issue.
In my environment, I am using an IBM SAN Volume Controller (SVC) for storage with 4 Gb/s host bus adapters (HBAs). The driver is proprietary to ESX for connectivity. I am currently running ESX 3.0.2, and I recently came across some unexpected behavior.
ESX was only using one of the HBAs for the SAN storage. We wanted to determine what would happen if that path was lost. Would ESX continue operating as expected? So we performed the following tests and came up with these results:
-Dropped connectivity on first HBA / active path rolled to next HBA, port status ‘dead’
-Restored connectivity on first HBA / port went to ‘on’, active path remained on second HBA
-Dropped connectivity on second HBA / lost all connectivity
Yes, all connectivity was lost in the third step. It was better to correct this now than to learn about it the hard way. My expectation was that the connectivity would use both HBAs at all times and fail over as needed. Luckily, this is easily corrected.
There are two ways to address this issue. One is to run a command to instruct ESX to use both HBAs, and the other is to apply an update. The first option would use the following command:
esxcfg-mpath --policy=rr --lun=vmhba1:0:1
This command would be run per LUN per ESX host, but it is a very quick and easy way to address the functionality issue immediately, and it can be done outside of maintenance mode. This behavior is spelled out in VMware KB article 1003270 online. The second option, and the lasting solution, is to install a critical-class patch on the ESX system, which addresses this as well as a few other issues. ESX 3.5 corrects this issue natively, with no updates or commands.
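Because the command has to be repeated for every LUN on every host, it can help to generate the command lines rather than type them. The following is a minimal sketch that only builds the strings, using the exact syntax shown above; the LUN identifiers in the example list are hypothetical, and nothing here contacts an ESX host.

```python
def build_rr_commands(luns):
    """Return one esxcfg-mpath command line per LUN, using the syntax from the post."""
    return [f"esxcfg-mpath --policy=rr --lun={lun}" for lun in luns]


# Hypothetical LUN identifiers; substitute the ones your own host reports.
example_luns = ["vmhba1:0:1", "vmhba1:0:2", "vmhba1:0:3"]

if __name__ == "__main__":
    # Print the commands so they can be reviewed before being run on a host.
    for command in build_rr_commands(example_luns):
        print(command)
```

The printed lines can be reviewed and then pasted into the service console of each host in turn.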
One simple way to see if your ESX host is using the different HBAs is to look at the LUN properties. From the VMware Infrastructure Client, select an ESX host, select the Configuration tab, select Storage, right-click on a LUN, select Properties, click the Manage Paths button, and look at the path listed as Active. If it is always on the first path and that ESX host has a virtual machine running on that LUN, the host may not be using both paths. Below is a figure showing a LUN that is using the second HBA:
You can also run the following command to see which path is active at that moment:
The far-right column has the role of active and preferred assigned to a path within a LUN. If the active designation never leaves one path where there is an active virtual machine, you may be at risk of the behavior we observed initially. This makes a case for ESX host patching, as well as ensuring that all redundant components function as expected during installation.
Depending on the scope of your virtual environment, it is likely that physical-to-virtual (P2V) conversions have taken place. The P2V process truly enables VMware administrators to put physical systems into virtual environments. However, you may have come across a system that for some reason will not go through the normal conversion. In such cases the VMware Converter bootable CD may be an option. It provides a zero-transaction state that may be a favorable environment to perform P2V conversions.
Good candidates for using the VMware Converter bootable CD include:
- Systems that run a database engine
- Real-time systems that may not convert correctly
- Systems where the VMware Converter agent otherwise fails
The bootable CD is licensed to enterprise customers, so the download requires advance purchase. The VMware Converter bootable CD is a Windows XP Preinstallation Environment (PE). The initial screen loads as follows:
The behavior is very similar to that of the full installation version once the VMware Converter interface loads. The only difference is that you can only convert the local system instead of being able to convert a remote system. This is to be expected, as the bootable environment should only be used when the traditional mechanisms fail. Once in the application, you can push the conversion to a VMware ESX server or to a flat .vmdk file for use in VMware Server or VMware Workstation:
I had a chance to use the VMware Converter bootable CD for a Windows 2000 system conversion that would not complete correctly in the installed, online environment. The bootable environment is also referred to as a cold clone environment, and with no transactions occurring on the file system a clean backup environment is available. The unfortunate circumstance is that this functionality can transport a poorly configured system to your virtual environment – so you may be able to keep it and its issues running forever.
This PDF has been a long time coming. It is a document that lists just about every possible maximum value to do with VI3. For example, how many iSCSI HBAs can you install on one ESX host? This document covers about 80% of the questions I receive on VMware each week, which boil down to “How much/many of X does ESX/VirtualCenter support?” This document has the answers to your questions!
Kudos to the team at VMware that put this together!
My Digg reader kicked out a great RedmondMag.com interview with VMware’s “product guru” Raghu Raghuram today in which he discusses the company’s product philosophy and how it translates to the VMware product line. During the interview, Raghuram says that there is a “stark difference” between Microsoft’s and VMware’s approach to virtualization. He had this to say about how he positions VMware ESX Server against Hyper-V:
“Our view is that the core virtualization layer belongs in the hardware. It also has to be much smaller in order to reduce its surface area for attacks. This is why we introduced the 3i architecture . . . The Microsoft approach is to have virtualization be an adjunct to the OS . . . With the Hyper-V architecture, they’re still maintaining the same dependency on the OS.”
VMware ESX and Hyper-V are both bare-metal virtualization products. To belabor an explanation, this means that they both run directly on the hardware as a thin layer that abstracts it from the guest operating systems. This virtually eliminates hardware dependencies for the guests. However, Raghuram seems to be suggesting that Hyper-V is more of a hosted virtualization approach. This could be a misunderstanding on his part, questionable editing, or just a case of Microsoft being Microsoft.
At any rate, one difference is certain when considering ESX versus Hyper-V. Something that virtualization expert Andrew Kutz said at a recent virtualization seminar keeps flashing in my mind. In his (and others’) view, Hyper-V will be the virtualization product to beat. This isn’t because Hyper-V is a particularly better product, but because VMware can’t compete with Microsoft on the level of supporting applications and interoperability.
In other words, after years of development and being the big guy in the computing space, Microsoft has a support cloud of applications and services all designed to work together that VMware will need to emulate in order to remain the leader in enterprise virtualization. This remains to be seen. But especially considering the low Hyper-V price tag ($28), VMware must be prepared to counter, at least with lower pricing.
Many virtualization analysts punt on the issue of security. But two recent events have brought security into higher relief: the uncovering of VMware’s file-sharing security flaw and VMware’s announcement of VMsafe, a set of APIs that lets security vendors add a layer of security to apps running on virtual machines. While VMsafe attempts to address VMware’s file-sharing problems, the flaw has raised questions about VMware security and the security of virtualization technologies in general.
In a recent SearchSecurity.com article, one interviewee said that after testing virtualization, he determined that putting virtualization into production would require reworking tried-and-true centralized security controls. Another interviewee expressed concerns about future problems, particularly a breach involving the hypervisor.
But we aren’t the only ones asking questions about security for virtual environments. At Rational Survivability, author Christopher Hoff takes a different angle:
Virtualization up until now has quietly marked a tipping point where we see the disruption stretch security architectures and technologies to their breaking point and in many cases make much of our invested security portfolio redundant and irrelevant.
Has virtualization brought a whole new set of security requirements? Has your company explored or purchased virtualization-specific security software? Share your security-in-virtual-environments experience, and we’ll send you a $10 Starbucks gift card. Email me at firstname.lastname@example.org.
Last week, I wrote about database index defragmentation. I discussed reading the scan density and noted that if the percentage is poor, a defragmentation or index rebuild would be in order. There is more to the database and VirtualCenter 2.5 upgrade story, as I have found out, regarding special permissions. Today I’ll share my findings with anyone else who may have upgraded to VirtualCenter 2.5 quickly.
The good news is that there is now a VMware KB article about this topic in the known issues section of the release notes. When I upgraded in December of 2007, this was not available. My issue was that although I had correct permissions with the username and password in SQL authentication to the VirtualCenter database, this account did not have the correct permissions to create the SQL jobs. VirtualCenter 2.5 creates three default SQL Agent jobs to manage statistics:
-Past Day stats rollup
-Past Week stats rollup
-Past Month stats rollup
These jobs move data from the VPX_HIST_STAT1 table to the VPX_HIST_STAT2, VPX_HIST_STAT3 and VPX_HIST_STAT4 tables, and eventually out of the system. If you upgraded your VirtualCenter with an account that did not have the ability to create these jobs, they likely are not running. The easy indicator is that the VPX_HIST_STAT1 table will have millions of records, and the other VPX_HIST_STATx tables will have no records.
I had to call VMware support to confirm this, and once we noticed that the jobs were not present the solution was clear. However, the job took a long time to catch up on the statistics management.
The unfortunate situation is that the VirtualCenter install does not give an error if these jobs cannot be created.
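Since the install fails silently, the row-count indicator described above is worth checking by hand. Here is a minimal sketch of that check, assuming you have already collected the row count of each table yourself (for example with a `SELECT COUNT(*)` against each one); the function name and the sample counts are made up for illustration.

```python
def rollup_jobs_look_stalled(row_counts):
    """Apply the indicator from the post: STAT1 holds rows while STAT2-4 are all empty."""
    first = row_counts.get("VPX_HIST_STAT1", 0)
    others = [row_counts.get(f"VPX_HIST_STAT{n}", 0) for n in (2, 3, 4)]
    return first > 0 and all(count == 0 for count in others)


# Hypothetical counts resembling the symptom described above.
sample_counts = {
    "VPX_HIST_STAT1": 5_000_000,
    "VPX_HIST_STAT2": 0,
    "VPX_HIST_STAT3": 0,
    "VPX_HIST_STAT4": 0,
}

if __name__ == "__main__":
    print(rollup_jobs_look_stalled(sample_counts))
```

A positive result is only a hint; confirming that the three SQL Agent jobs exist is still the real test.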
I started to call this post “Attack of the Clones”, but I was a bit worried that George Lucas might get upset at my use of his movie title. So, while sticking to the Star Wars theme, I settled on the current title.
The idea of leveraging the cloning functionality that’s present in many storage arrays on the market today is not a new or unusual one. In particular, on my personal blog I wrote a number of articles about the processes around using clones, the advantages of using clones, and some of the disadvantages of using clones. While my articles were primarily focused around storage systems from Network Appliance (now just NetApp), the basic principles are very similar for other storage arrays as well.
In How to Provision VMs Using NetApp FlexClones, I discussed the processes and procedures around the use of hardware-based clones. In particular, new functionality within ESX Server 3.x required administrators to enable resignaturing in order to see cloned VMFS datastores and be able to use the virtual machines stored in those datastores. At the time, there was no automated way of registering the VMs stored in the cloned datastores as well, and I believe that is still true even today.
Having discussed the “how” of using clones, I moved on to a couple of articles discussing the advantages and disadvantages of using storage system clones. In NetApp FlexClones with VMware, Part I, I presented the advantages of using clones:
- Reduced storage usage
- Reduced time to create VMs
In NetApp FlexClones with VMware, Part 2, I moved on to some of the disadvantages of using storage system-based clones:
- Scalability issues with regards to a maximum number of LUNs supported within VirtualCenter
- Lack of integration with VMware’s graphical tools
- Potential blurring of responsibilities across functional IT teams
Clearly, it’s up to each customer to determine whether the advantages and disadvantages are truly applicable to their organization. For some organizations, the “potential blurring of responsibilities” may be a non-issue, and the reduced storage requirements are a major issue.
Of course, it’s also possible to use hardware-based clones for purposes other than just provisioning. In VM File-Level Recovery with NetApp Snapshots and Full VM Recovery with NetApp Snapshots, I discussed ways to leverage NetApp FlexClones and LUN clones–which are based on Snapshots–to facilitate VM recovery scenarios. So this technology can be used for a variety of purposes in VMware environments.
If you’re a reader who’s using hardware-based clones or snapshots (not necessarily Network Appliance-based, but from any hardware vendor), how are you leveraging this functionality in your environment? I’d love to hear about the ways in which you are putting these kinds of technologies to work in the real world.
I, like many virtualization administrators, have worked very hard to get the VMware virtual environment set up and running as expected. Now, one of my main tasks is to make sure that we do not do anything to adversely affect server performance. A good place to start in this regard is the VirtualCenter (VC) database. The VC database is critically important to a successful ESX implementation, however, so do not do anything that is not advised by VMware documentation or support services. Let’s discuss index defragmentation in particular here when using Microsoft SQL Server 2000 for the VC database.
Index defragmentation on statistics
I will save you some work in determining which tables will need index defragmentation: look at statistics. While we all like the statistics and graphing options available in the VMware Infrastructure Client and virtual appliances that may use the table, there can be a great amount of data in that table and it can quickly become fragmented. A fragmented index in a database is similar to a fragmented file system: the physical ordering of the index pages no longer matches the logical order of the index.
In my VC 2.5 environment, the VPX_HIST_STAT1 table is the heavy hitter. For this database maintenance, I’m going to start with the white paper entitled “VirtualCenter Database Maintenance” available from the VMware website. Here there is a command to check your current fragmentation level:
DBCC SHOWCONTIG (VPX_HIST_STAT1)
I have modified the command to use the VC 2.5 table name, as the white paper is based on VC 2.0, where the table name differs. The result will look something like the following:
DBCC SHOWCONTIG scanning 'VPX_HIST_STAT1' table...
Table: 'VPX_HIST_STAT1' (800721905); index ID: 1, database ID: 16
TABLE level scan performed.
- Pages Scanned................................: 505458
- Extents Scanned..............................: 78097
- Extent Switches..............................: 457307
- Avg. Pages per Extent........................: 7.4
- Scan Density [Best Count:Actual Count].......: 22.34% [113183:905308]
- Logical Scan Fragmentation ..................: 3.81%
- Extent Scan Fragmentation ...................: 0.47%
- Avg. Bytes Free per Page.....................: 187.2
- Avg. Page Density (full).....................: 97.69%
DBCC execution completed. If DBCC printed error messages, contact your system administrator.
The key takeaway is the Scan Density percentage. The white paper advises that a number close to 100% is good, meaning that the table index in the example above is quite fragmented. The white paper goes on to identify two correction levels for improving the index. Index defragmentation and, more aggressively, a rebuild are the standard options to address the index. If an index defragmentation does not improve the scan density enough, database admins will have to begin the rebuild process. A caveat: rebuilding requires VC downtime to perform the database maintenance.
By comparison, here is the output of the same command on the VPX_EVENT table. This table is busy, but not nearly as much as the statistics table:
DBCC SHOWCONTIG scanning 'VPX_EVENT' table...
Table: 'VPX_EVENT' (36195179); index ID: 1, database ID: 16
TABLE level scan performed.
- Pages Scanned................................: 594
- Extents Scanned..............................: 86
- Extent Switches..............................: 132
- Avg. Pages per Extent........................: 6.9
- Scan Density [Best Count:Actual Count].......: 66.39% [75:133]
- Logical Scan Fragmentation ..................: 6.73%
- Extent Scan Fragmentation ...................: 98.84%
- Avg. Bytes Free per Page.....................: 150.5
- Avg. Page Density (full).....................: 97.14%
DBCC execution completed. If DBCC printed error messages, contact your system administrator.
This table is in better shape, but is much smaller than the statistics table.
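If you check several tables regularly, the Scan Density figure can be pulled out of saved DBCC SHOWCONTIG output with a short script. This is a minimal sketch that parses text you have already captured; the 90% cutoff is a hypothetical value chosen for illustration, since the white paper only says that a number close to 100% is good.

```python
import re

# Matches the "Scan Density" line and captures the percentage before the '%' sign.
SCAN_DENSITY = re.compile(r"Scan Density[^\n]*?:\s*(\d+(?:\.\d+)?)%")


def scan_density(showcontig_output):
    """Return the Scan Density percentage from DBCC SHOWCONTIG text, or None if absent."""
    match = SCAN_DENSITY.search(showcontig_output)
    return float(match.group(1)) if match else None


def needs_attention(density, threshold=90.0):
    """Flag a table whose density falls below the cutoff (threshold is an assumed value)."""
    return density is not None and density < threshold


# The Scan Density line from the VPX_HIST_STAT1 output shown earlier.
stat_line = "- Scan Density [Best Count:Actual Count].......: 22.34% [113183:905308]"

if __name__ == "__main__":
    density = scan_density(stat_line)
    print(density, needs_attention(density))
```

Whether a flagged table gets a defragmentation or a full rebuild remains a judgment call for the database administrator.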
Configure statistics logging level
As virtualization administrators, one safeguard we can perform is to limit the logging levels within the VMware Infrastructure Client. To access the logging levels, select the Administration menu, then VirtualCenter Management Server Configuration, then Statistics. Here you want to limit the amount of high-level logging to keep the VPX_HIST_STATx tables in check:
In selecting which level works best for your environment, be sure to identify any monitoring tools or virtual appliances that may read the selected tables. Also be sure to benchmark your database size and index fragmentation to see if you gain any improvements. Identifying the parts of the entire VMware Infrastructure environment that you can keep under regular maintenance will make your job as a virtualization administrator much easier.
I had a peculiar situation come up recently in my VirtualCenter 2.5 environment with ESX 3.0.2 hosts. I posted the situation on the VMware Communities forum, but have yet to get a clear answer on what happened. Here is my account of what happened so you can either help me out or at least protect yourself against this same situation:
-Server maintenance for a separate task was being performed on all of my ESX hosts (seven systems). One system was put into maintenance mode at a time.
-Once I had system #2 out of maintenance mode, I cloned a particular guest virtual machine. This clone was put locally as it was intended only as a test/failback guest operating system.
-After the clone was created, VirtualCenter migrated the virtual machine from one host to another. (Remember this virtual machine has local storage now)
-After that migration, the virtual machine could not power on because the virtual disk did not migrate with the virtual machine. The virtual disk was also removed from the inventory of the guest machine.
Once this occurred, I looked for the virtual disk file, intending to add it back to the inventory. Moving the virtual machine back to the original host was an option as well, but it did not correct the situation. I could not find the virtual disk anywhere on any of the shared storage resources or local storage resources. I ran the following SQL query on the VirtualCenter database:
-- The original column list was not preserved; SELECT * returns every column.
SELECT *
FROM VPX_EVENT
WHERE CREATE_TIME > '2008-3-3 00:45:00.000'
  AND CREATE_TIME < '2008-3-3 02:00:00.000'
It goes without saying that you should not proceed into the SQL database without the proper cautions. This query takes a look into the events leading up to the error that occurred when I was attempting to power on the virtual machine. Here are the results that led to the demise of my virtual machine, which I placed in a spreadsheet:
I have stripped out the entries that were not pertinent to this virtual machine. You can see that the vim.event.VmBeingRelocatedEvent occurred after my cloning; remember that this guest had its virtual disk on the local storage of esx-2.intra.net after I cloned it from esx-4.intra.net. Note that the first two yellow entries do not have my admin username being used, thus implying VirtualCenter or DRS is the culprit! You may wonder about the timing of the last three events: there are about three seconds between the vim.event.VmRelocatedEvent and vim.event.VmStartingEvent entries. According to the scrolling log of the VMware Infrastructure Client, the migration would have completed, as shown by the vim.event.VmRelocatedEvent message, before the virtual power was applied.
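The same filtering can be done on exported event rows without a spreadsheet. The sketch below assumes each row has been loaded as a dictionary with `event_type` and `username` keys; those key names, the sample rows, and the "admin" username are hypothetical, and only the event type names come from the log discussed here. It picks out relocation events that carry no username, which is the pattern that pointed away from a manual migration.

```python
# Event types taken from the log discussed in this post.
RELOCATION_EVENTS = {
    "vim.event.VmBeingRelocatedEvent",
    "vim.event.VmRelocatedEvent",
}


def unattended_relocations(rows):
    """Return relocation events that have no username recorded against them."""
    return [
        row for row in rows
        if row["event_type"] in RELOCATION_EVENTS and not row.get("username")
    ]


# Hypothetical rows; key names and usernames are assumptions for illustration.
sample_rows = [
    {"event_type": "vim.event.VmBeingRelocatedEvent", "username": ""},
    {"event_type": "vim.event.VmRelocatedEvent", "username": ""},
    {"event_type": "vim.event.VmStartingEvent", "username": "admin"},
]

if __name__ == "__main__":
    for row in unattended_relocations(sample_rows):
        print(row["event_type"])
```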
Has anyone seen this scenario in their virtual environment? As a short-term measure I am not putting anything important (even backups, as in this case) on local storage. When I get to the bottom of this behavior, I will surely follow up here with what happened.