Recently someone asked me what VMware scheduling is, so I thought I would cover that in this blog post. Scheduling, or virtual CPU scheduling, happens behind the scenes and is not a very visible component of virtualization, but is absolutely critical for virtualization to work properly. You should have at least a basic understanding of how it works so you understand how it impacts virtual machine (VM) performance and what to look for when troubleshooting malfunctions.
The scheduler is a component of the VMkernel that maps requests from the virtual CPUs assigned to virtual machines onto the physical CPUs of the host server. Whenever a VM uses its virtual CPU, the VMkernel has to find a free physical CPU (or core) for the VM to use. On a typical host server, virtual CPUs usually outnumber physical CPUs, so the VMs are all competing for the limited number of physical CPUs that the host has. The scheduler’s job is to find CPU time for all the VMs that are requesting it and to do it in a balanced way, so performance for any one VM does not suffer. This is not always an easy task, especially when VMs are assigned multiple virtual CPUs (virtual symmetric multiprocessing, or vSMP), as this further complicates the scheduling.
To put scheduling in simple terms, think of the scheduler as an air traffic controller that has to handle many requests from incoming planes (VMs) to land on a limited number of available runways (CPUs). It’s a delicate balancing act to make sure that all the planes land and that they do not sit in the air too long waiting for an available runway. To further complicate the matter, larger planes (vSMP VMs) need special runways to land on, which makes the air traffic controller’s job more difficult. If you’ve ever played the iPhone game Flight Control, you know how difficult this is.
Scheduling CPU time for single-CPU VMs is much easier for the scheduler, as it only has to find one available physical CPU for the VM to use. As mentioned, multiple-CPU VMs are more difficult to schedule because the scheduler must find multiple physical CPUs that are available at the same time. This is called co-scheduling, which is a technique for scheduling related processes to run on different processors concurrently. If a VM is assigned multiple processors, the VMkernel needs to fool the guest operating system into thinking it has multiple processors; co-scheduling is critical for this to take place.
There have been different methods for co-scheduling implemented in different versions of ESX. ESX 2.x used a strict co-scheduler, so a VM with two vCPUs had to have two physical CPUs available simultaneously for the VM to have CPU time; if two physical CPUs were not available, the VM would have to wait until the scheduler found two free to be able to schedule CPU time for the VM.
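To make the strict rule concrete, here is a minimal Python sketch of one scheduling pass. It is only a toy model of the behavior described above, not VMware's actual algorithm; the function name and the input format are my own assumptions for illustration.

```python
def strict_coschedule(free_pcpus, vms):
    """Toy model of strict co-scheduling for a single time slice.

    free_pcpus: number of idle physical CPUs (hypothetical input).
    vms: list of (name, vcpu_count) tuples, in the order they are considered.
    Returns (scheduled_names, waiting_names).
    """
    scheduled, waiting = [], []
    for name, vcpus in vms:
        # Strict rule: a VM runs only if ALL of its vCPUs can be
        # placed on free physical CPUs at the same moment.
        if vcpus <= free_pcpus:
            free_pcpus -= vcpus
            scheduled.append(name)
        else:
            waiting.append(name)
    return scheduled, waiting


# Two free physical CPUs; the 2-vCPU VM "b" arrives when only one is left.
print(strict_coschedule(2, [("a", 1), ("b", 2), ("c", 1)]))
```

In this example the two single-vCPU VMs get CPU time while the two-vCPU VM has to wait, even though a physical CPU was idle when it was considered.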
Beginning with ESX 3.x, a relaxed co-scheduler was implemented, so only the vCPUs whose scheduling was falling behind (skewed) had to be co-scheduled and the others did not. This makes scheduling easier and improves overall processor utilization. With vSphere, VMware further improved the relaxed co-scheduling algorithm so the scheduler has more choices when scheduling vCPUs, which further improves utilization and performance.
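The idea of skew can be pictured with a small Python sketch. This is a simplified illustration under my own assumptions: the per-vCPU progress values, the function name and the 3 ms threshold are all hypothetical and are not taken from the ESX implementation.

```python
def vcpus_needing_costart(progress_ms, skew_threshold_ms=3):
    """Return the indexes of vCPUs that lag the leading sibling vCPU.

    progress_ms: how much CPU time each vCPU of one VM has received
                 (hypothetical bookkeeping for this sketch).
    skew_threshold_ms: assumed tolerance before a vCPU counts as skewed.
    """
    lead = max(progress_ms)
    # Only vCPUs that have fallen too far behind must be scheduled
    # together; the rest can be scheduled independently.
    return [i for i, p in enumerate(progress_ms) if lead - p > skew_threshold_ms]


# vCPU 2 is 6 ms behind the leader, so only it is flagged.
print(vcpus_needing_costart([10, 10, 4, 9]))
```

Under these assumptions, a four-vCPU VM with one lagging vCPU needs a physical CPU for just that vCPU to catch up, rather than four free physical CPUs at once.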
Because scheduling is very important to VM performance, you should avoid using CPU affinity, which constrains the scheduler and makes it more difficult to schedule CPU time for VMs. CPU affinity can be configured on individual VMs to force them to run only on specific physical CPUs of the host, and it should not be used unless you have a specific need for it. Additionally, setting CPU shares on VMs causes the scheduler to give higher priority to VMs with higher share values and lower priority to VMs with lower share values.
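One simplified way to picture shares is as a proportional split of CPU capacity when VMs are competing for it. The Python sketch below assumes full contention and ignores reservations and limits; the function name, VM names and numbers are hypothetical.

```python
def split_by_shares(total_mhz, shares):
    """Divide contended CPU capacity in proportion to each VM's shares.

    total_mhz: CPU capacity being competed for (hypothetical value).
    shares: dict mapping VM name to its CPU share value.
    """
    total_shares = sum(shares.values())
    return {vm: total_mhz * s / total_shares for vm, s in shares.items()}


# A VM with twice the shares is entitled to twice the CPU time under contention.
print(split_by_shares(3000, {"web": 2000, "db": 1000}))
```

Shares only matter when there is contention; on a host with idle physical CPUs, a VM with a low share value can still get all the CPU time it asks for.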
While the scheduler does its best to schedule CPU time evenly for VMs, it can fall behind on very busy hosts, which degrades VM performance. The time a VM spends waiting for a physical CPU to become available is reported in a statistic called Ready Time. You can view it in the command-line utility esxtop as a percentage (%RDY) or in vCenter Server as a time value.
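Because the two tools report the same thing in different units, it can help to convert between them. The Python sketch below assumes the vCenter value is a ready time in milliseconds summed over one sampling interval, and that the real-time chart uses a 20 second interval; both are my assumptions here, so check the interval of the chart you are actually reading.

```python
def ready_ms_to_percent(ready_ms, interval_s=20):
    """Convert a ready-time summation (ms) into a percentage of the interval.

    ready_ms: ready time reported for one sampling interval.
    interval_s: length of that interval in seconds (assumed 20 for
                real-time charts; other chart ranges use longer intervals).
    """
    return ready_ms * 100 / (interval_s * 1000)


# 1000 ms of ready time in a 20 s sample means 5% of that interval was spent waiting.
print(ready_ms_to_percent(1000))
```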