Posted by: Eric Siebert
At some point, you may need to kill a stuck or frozen virtual machine (VM) on a VMware vSphere 4.0 ESXi host when the traditional power controls do not work. As with VMware ESX, there are several methods; I covered the ESX versions in a previous post, killing a virtual machine on a VMware ESX host in vSphere.
The methods for ESXi are very similar to those for ESX, but the execution differs because ESXi doesn’t have a service console as ESX does. The methods below are listed in order of preference, beginning with normal VM commands and ending with a brute-force method.
Method 1: Use the vmware-cmd command in the vSphere command-line interface (CLI)
Note: The vSphere CLI was formerly known as the Remote CLI and is not to be confused with the vSphere PowerCLI. The vSphere CLI is the command-line equivalent of the vSphere Client. Because ESXi does not have a service console as ESX does, you need to use the remote vSphere CLI to run the vmware-cmd command against ESXi. The vSphere CLI consists of a collection of Perl scripts, one for each ESX/ESXi command; it can be downloaded and installed on any Linux or Windows system and used to run those commands remotely on any ESX/ESXi host. To use this method, follow the steps below.
- Run the vSphere CLI on the system that you installed it on. You’ll need to switch to the \bin subdirectory where the Perl scripts are located to run the commands.
- The vmware-cmd command uses the configuration file name (.vmx) of the VM to specify the VM on which it’s going to perform an operation. You can type vmware-cmd.pl -H <ESXi host name> -l to get a list of all VMs on the host and the path and name of their configuration files. The path is shown with the Universally Unique Identifier (UUID), or long name, of the datastore; you can use the datastore’s friendly name instead. You’ll be prompted to log in to the ESXi host before the command will execute. Alternatively, you can specify a vCenter Server with -H and use -T to specify the ESXi host that the vCenter Server manages. Note: You can avoid entering login information every time you run a command by using a configuration file or Windows authentication passthrough using Security Support Provider Interface (SSPI). See the vSphere Command-Line Interface Installation and Reference Guide documentation for more info.
- You can optionally check the power state of the VM by typing vmware-cmd.pl -H <ESXi host name> <VM config file path & name> getstate.
- To forcibly shut down a VM, type vmware-cmd.pl -H <ESXi host name> <VM config file path & name> stop hard.
- You can check the state again to see if it worked; if it did, the state should now be off.
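The steps above can be strung together as a small shell sketch. It assumes a Linux system with the vSphere CLI installed and on the PATH; the host name and .vmx path are hypothetical placeholders, and the script name (vmware-cmd.pl, as written in this article) may differ depending on how the vSphere CLI was installed. By default it only prints the commands so you can review them first.

```shell
#!/bin/sh
# Hypothetical placeholder values; replace with your own host and .vmx path.
ESXI_HOST="${ESXI_HOST:-esxi01.example.com}"
VMX_PATH="${VMX_PATH:-/vmfs/volumes/datastore1/stuckvm/stuckvm.vmx}"
# Script name as used in the article; adjust if your install names it differently.
VMWARE_CMD="${VMWARE_CMD:-vmware-cmd.pl}"

# With DRY_RUN=1 (the default here) only print the command; otherwise run it.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# List VMs, check the state, force the power-off, then check the state again.
force_stop_vm() {
    run "$VMWARE_CMD" -H "$ESXI_HOST" -l
    run "$VMWARE_CMD" -H "$ESXI_HOST" "$VMX_PATH" getstate
    run "$VMWARE_CMD" -H "$ESXI_HOST" "$VMX_PATH" stop hard
    run "$VMWARE_CMD" -H "$ESXI_HOST" "$VMX_PATH" getstate
}

# Dry run: prints the four command lines without contacting any host.
force_stop_vm
```

Set DRY_RUN=0 once the printed commands look right; each command will then prompt for login information unless you have set up a configuration file or SSPI as described above.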
Method 2: Use the vm-support command to shut down the VM
When you use the vm-support command to shut down a VM, you must first find the virtual machine ID (VMID) and then use the vm-support command to forcibly terminate it. This method does more than shut down the VM; it also produces debug information that you can use to troubleshoot an unresponsive VM. On ESXi hosts, the vm-support command can be run using the special tech support mode, which provides access to the host’s BusyBox, POSIX-based management console.
- On the ESXi console, press Alt-F1.
- Type the word unsupported (text will not be displayed while typing) and press Enter. A password prompt will appear. Enter the root password for the ESXi host and you will be at a # prompt in the root partition.
- The vm-support command is a multi-purpose command that is mainly used to troubleshoot host and VM problems. You can use the -X parameter to forcibly shut down a VM and also produce a file with debug information. As with ESX hosts, running this command will create a .tgz file, but it will not be located in the directory that you run the command in. Instead, it will be created in the /var/tmp directory, which points to the 4 GB Virtual File Allocation Table (VFAT) system swap partition. You can also set a Virtual Machine File System (VMFS) volume as your working directory for the .tgz file. First, type vm-support -x to get a list of VMIDs of your running VMs.
- To forcibly shut down the VM and generate core dumps and log files, type vm-support -X <VMID>. If you wish to specify an alternate directory for the .tgz file that is created, also add the -w <vmfs volume path> parameter. You will receive prompts asking if you want to take a screenshot of the VM. This can be useful if you want to see if there are any error messages. You will also be prompted about whether you wish to send a non-maskable interrupt (NMI) and an ABORT to the VM, which can further aid in debugging. You must say yes to the ABORT prompt for the VM to be forcibly stopped. Once the process completes, which can take 10-15 minutes, a .tgz file will be created in the /var/tmp directory (or in the directory given with -w) that you can use for troubleshooting purposes.
- You can check the state of the VM again by typing vm-support -x. You should not see the VM listed at this point. Be sure to delete the .tgz file that is created when you are done to avoid filling up your host disk.
- You can leave tech support mode by typing ‘exit’ and pressing Alt-F2 to return to the normal console mode.
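As a rough sketch of the command sequence in this method, the shell function below builds the vm-support call with or without the optional -w directory. The function name and the DRY_RUN switch are my own illustrative additions, not part of vm-support; only the -x, -X and -w parameters come from the steps above. The prompts (screenshot, NMI, ABORT) still have to be answered by hand.

```shell
#!/bin/sh
# With DRY_RUN=1 (the default here) only print the command; otherwise run it.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# $1 = VMID taken from the output of `vm-support -x`
# $2 = optional working directory for the .tgz file (for example a VMFS volume path)
kill_vm_with_debug() {
    vmid="$1"
    workdir="$2"
    if [ -z "$vmid" ]; then
        echo "usage: kill_vm_with_debug <VMID> [working directory]" >&2
        return 1
    fi
    if [ -n "$workdir" ]; then
        run vm-support -X "$vmid" -w "$workdir"
    else
        run vm-support -X "$vmid"
    fi
}

# Typical order in tech support mode (the VMID is a placeholder):
#   vm-support -x                 # list VMIDs of running VMs
#   DRY_RUN=0
#   kill_vm_with_debug <VMID>     # answer yes to the ABORT prompt
#   vm-support -x                 # the VM should no longer be listed
#   ls -l /var/tmp/*.tgz          # find the debug bundle, delete it when done
```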
Method 3: Find the VM’s process identifier and forcibly terminate it
This method also relies on using the tech support mode console that is used in method 2 to run the commands.
- On the ESXi console, press Alt-F1.
- Type the word unsupported (text will not be displayed while typing) and press Enter. A password prompt will appear. Enter the root password for the ESXi host and you will be at a # prompt in the root partition.
- The process status (ps) command shows the currently running processes on a server, and the grep command finds the specified text in the output of the ps command. Type ps -g | grep <virtualmachinename>, which will return the WID (first column), CID (second column) and process group ID (PGID) (fourth column) of the running processes of the VM. You will have several entries returned; the number in the fourth column of the entries is the PGID of the VM.
- The kill command sends a signal to terminate a process using its ID number. The ‘-9’ parameter forces the process to quit immediately and, unlike the more graceful ‘-15’ parameter, cannot be ignored by the process. Type kill -9 <PGID>, which will forcibly terminate the process for the specified VM.
- You can check the state of the VM again by typing vm-support -x; you should no longer see the VM listed.
- You can leave tech support mode by typing ‘exit’ and pressing Alt-F2 to return to the normal console mode.
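To avoid misreading the PGID by eye, the lookup and the kill can be wrapped in two small shell functions. This is a minimal sketch: the function names and the DRY_RUN switch are hypothetical, and it assumes awk and sort are available in the tech support mode shell, which I have not verified for every ESXi build. The only facts taken from the steps above are that the PGID is the fourth column of the ps -g output and that kill -9 is used to terminate it.

```shell
#!/bin/sh
# Filter `ps -g` output on stdin: keep lines that mention the VM name ($1)
# and print each distinct fourth-column value (the PGID, per the steps above).
pgid_from_ps() {
    grep -- "$1" | awk '{ print $4 }' | sort -u
}

# Print (DRY_RUN=1, the default here) or run the forced kill for one PGID ($1).
force_kill_pgid() {
    case "$1" in
        ''|*[!0-9]*)
            echo "refusing: '$1' is not a numeric ID" >&2
            return 1
            ;;
    esac
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "kill -9 $1"
    else
        kill -9 "$1"
    fi
}

# Usage in tech support mode (the VM name is a placeholder):
#   ps -g | pgid_from_ps myvm     # should print a single number
#   DRY_RUN=0
#   force_kill_pgid <PGID>
```

If pgid_from_ps prints more than one number, the name you searched for probably matches more than one VM; narrow the search text before killing anything.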
All three of these methods work identically on ESXi hosts in both VMware Infrastructure 3 and vSphere.