Posted by: Eric Siebert
Occasionally, virtual machines (VMs) get stuck in a zombie state and will not respond to a power-off command from the traditional vSphere Client power controls. Rebooting the host will fix this condition, but rebooting is usually not an option. Fortunately, there are a few methods for forcing the VM to shut down without rebooting the host.
I previously documented these methods with VMware Infrastructure 3 (VI3) and wanted to make sure they all worked with vSphere. The methods below are listed in order of usage preference starting with using normal VM commands and ending with a brute force method.
Method 1 – Using the vmware-cmd service console command (the command-line interface equivalent of using the vSphere Client)
- Log in to the ESX service console.
- The vmware-cmd command uses the configuration file name (.vmx) of the VM to specify the VM to perform an operation on. You can type vmware-cmd -l to get a list of all VMs on the host and the path and name of their configuration file. The path uses the Universally Unique Identifier (UUID) or long name of the data store; alternatively, you can use the friendly name instead. If you do not want to type the path when using the vmware-cmd command you can change to the VM’s directory and run the command without the path.
- You can optionally check the power state of the VM by typing vmware-cmd <VM config file path & name> getstate.
- To forcibly shut down a VM, type vmware-cmd <VM config file path & name> stop hard.
- You can check the state again to see if it worked; if it did, the state should now be off.
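The steps above can be strung together in a small service console script. This is only a sketch: hard_stop_vm and the VMWARE_CMD variable are names invented for illustration (the variable exists only so the function can be dry-run against a stub), and the only vmware-cmd operations used are the getstate and stop hard calls described above.

```shell
#!/bin/sh
# Sketch only: wraps the vmware-cmd steps from Method 1.
# VMWARE_CMD is a hypothetical override for dry runs; it defaults to vmware-cmd.
VMWARE_CMD="${VMWARE_CMD:-vmware-cmd}"

# hard_stop_vm <VM config file path & name>
# Prints the power state, issues a hard stop, then prints the state again.
hard_stop_vm() {
    vmx="$1"
    if [ -z "$vmx" ]; then
        echo "usage: hard_stop_vm <VM config file path & name>" >&2
        return 2
    fi
    "$VMWARE_CMD" "$vmx" getstate            # state before the stop
    "$VMWARE_CMD" "$vmx" stop hard || return 1
    "$VMWARE_CMD" "$vmx" getstate            # should now report off
}
```

Run vmware-cmd -l first to find the configuration file path, then call hard_stop_vm with that path as its only argument.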
Method 2 – Using the vm-support command by first finding the virtual machine ID (VMID) and then forcibly terminating the VM
This method does a lot more than shut down the VM, as it also produces debugging information that you can use to troubleshoot an unresponsive VM.
- Log in to the ESX service console.
- The vm-support command is a multi-purpose command that is mainly used to troubleshoot host and VM problems. You can use the -X parameter to forcibly shut down a VM and also produce a file with debugging information. The command creates a .tgz file in the directory that you run it from and cannot be run from a VMFS volume directory (running it from the /tmp directory is recommended). First type vm-support -x to get a list of the virtual machine IDs (VMIDs) of your running VMs.
- To forcibly shut down the VM and generate core dumps and log files, type vm-support -X <VMID>. You will receive prompts asking if you want to take a screenshot of the VM. A screenshot can be useful to see if there are any error messages. You will also be prompted to see if you wish to send an NMI and an ABORT to the VM, which can aid in debugging. You must say yes to the ABORT prompt for the VM to be forcibly stopped. Once the process completes, which can take 10-15 minutes, a .tgz file will be created in the directory in which you ran the command that you can also use for troubleshooting purposes. To avoid filling up your file system when the file is created, switch to the /tmp directory when you run the command.
- You can check the state of the VM again either by using the vmware-cmd command or by typing vm-support -x; the VMID for that VM should no longer be listed. Be sure to delete the .tgz file when you are done with it to avoid filling up your host disk.
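As a rough sketch of the steps above, the two vm-support calls can be wrapped so that the forced stop always runs from /tmp rather than from wherever you happen to be. The function names and the VM_SUPPORT and WORKDIR variables are made up for this example, and the check that the VMID is a plain number is an assumption, not something vm-support itself requires here.

```shell
#!/bin/sh
# Sketch only: wraps the vm-support steps from Method 2.
# VM_SUPPORT and WORKDIR are hypothetical overrides, not vm-support settings.
VM_SUPPORT="${VM_SUPPORT:-vm-support}"
WORKDIR="${WORKDIR:-/tmp}"

# list_vmids: show the VMIDs of the running VMs (vm-support -x).
list_vmids() {
    "$VM_SUPPORT" -x
}

# force_stop_by_vmid <VMID>
# Runs vm-support -X from WORKDIR so the .tgz file lands there.
# The prompts (screenshot, NMI, ABORT) still have to be answered by hand.
force_stop_by_vmid() {
    vmid="$1"
    case "$vmid" in
        ''|*[!0-9]*)   # assumption: a VMID is a plain number
            echo "usage: force_stop_by_vmid <VMID>" >&2
            return 2
            ;;
    esac
    # The subshell keeps the caller's working directory unchanged.
    ( cd "$WORKDIR" && "$VM_SUPPORT" -X "$vmid" )
}
```

Call list_vmids to find the VMID, then force_stop_by_vmid with that number. Remember to answer yes to the ABORT prompt and to delete the .tgz file from /tmp afterwards.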
Method 3 – Using the kill command by first finding the process identifier (PID) of the VM and then terminating that process
- Log in to the ESX service console.
- The process status (ps) command in Linux shows the currently running processes on a server, and the grep command finds the specified text in the output of the ps command. Type ps auxfww | grep <virtualmachinename> to get the PID of the VM. Two entries will be returned. The shorter one is the grep command you just ran; the longer one is the running VM process. The longer entry ends in the config file name of the VM and is the one you want to use; the number in the second column of that entry is the PID of the VM.
- The kill command in Linux sends a signal to terminate a process using its ID number. The '-9' parameter forces the process to quit immediately and cannot be ignored the way the more graceful '-15' parameter sometimes can. Type kill -9 <PID> to forcibly terminate the process for the specified VM.
- You can check the state using the vmware-cmd command to see if it worked; if it did, the state should now be off.
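The PID lookup above can also be scripted. The sketch below assumes, as described in the steps, that the VM's entry in the ps output ends in the config file name; vm_pid_from_ps and force_kill_pid are hypothetical helper names, not ESX commands. Matching on the end of the line instead of using grep means the search itself does not normally show up as a second entry.

```shell
#!/bin/sh
# Sketch only: the ps and kill steps from Method 3 as two helpers.

# vm_pid_from_ps <config file name>
# Reads ps output on stdin and prints the second column (the PID) of every
# entry whose line ends in the given config file name, e.g. myvm.vmx.
vm_pid_from_ps() {
    awk -v vmx="$1" '
        length($0) >= length(vmx) &&
        substr($0, length($0) - length(vmx) + 1) == vmx { print $2 }
    '
}

# force_kill_pid <PID>
# Sends signal 9 (kill -9), which the process cannot ignore.
force_kill_pid() {
    case "$1" in
        ''|*[!0-9]*)
            echo "usage: force_kill_pid <PID>" >&2
            return 2
            ;;
    esac
    kill -9 "$1"
}
```

For example, ps auxfww | vm_pid_from_ps <configfilename> prints the candidate PID, which you can then pass to force_kill_pid. If more than one PID is printed, check the entries by hand before killing anything.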
All three of these methods work identically on ESX hosts in both VI3 and vSphere. These methods also work for ESXi, but their execution is a bit different. In a future blog post we will cover how to use these methods with ESXi.