At the risk of sounding like a commercial, Hyperic HQ is my leading choice among agent-based systems management tools for handling both VMware and non-VMware systems. Personally I tend to prefer non-agent-based systems, but the Hyperic tools work, and work especially well for VMware environments. I like them because I’m an open-source nut. I was first attracted to open source because of the price; I found I had a zero-dollar-a-day addiction to the LAMPP stack, MySQL, and Dia. Like all GPL junkies, I kept looking for more, and after a few years I found Nagios, then Groundwork, then Hyperic while doing some research for a presentation at Data Center Decisions 2006. I’ve been hooked on them ever since. I particularly like Hyperic’s rewards program for contributors who find bugs, fix bugs, and make the software better.
From 50,000 feet, Hyperic’s monitoring architecture looks like this:
You may be wondering why I use anything other than VirtualCenter to monitor my VMware environment. I do use VC, but I prefer not to juggle multiple tools. I’ve been in a Fortune 100 company where there were so many 32″ LCD screens on the wall that you didn’t really know what was happening, because you were getting so many different results from so many different tools. It was about as useful as having nothing at all except a user’s phone call to tell you something was down. I have physical and virtual systems that I need to monitor, and until the day comes that my company goes 100% virtualized, I need one tool to monitor them all (please feel free to insert your own LotR joke here).
I’ll bypass the non-VMware material and get to the relevant point – using Hyperic to monitor VMware products – after this brief warning:
Reading the install manual is generally a must. There are several caveats to getting Hyperic fully functional, notably around graphing and charting, and deprecated libraries that may need to be installed. Or you can skip all that by downloading the pre-made Virtual Appliance. If you go that route, install the VMware Tools, or else time drift will cause problems with reporting.
For this run I’m using the prebuilt virtual machine. If you need to install your own server, you need the following:
- 1 GHz or higher Pentium 4, or equivalent (2 x 2.4GHz Pentium Xeon or equivalent recommended)
- 1 GB RAM (4 or more GB recommended)
- 1-5 GB Free Disk Space
On Linux systems, you’ll also need an X server running (or at least the libraries).
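If you are building your own server, the minimums above can be checked with a short script before you start. This is only a sketch: the function name `check_hq_minimums` is made up for illustration, and it assumes you pass in the CPU speed in MHz, the RAM in MB, and the free disk space in MB yourself.

```shell
#!/bin/sh
# Hypothetical helper (not part of Hyperic HQ): compares values you supply
# against the minimums listed above (1 GHz CPU, 1 GB RAM, 1 GB free disk).
# Arguments: CPU speed in MHz, RAM in MB, free disk space in MB.
check_hq_minimums() {
    cpu_mhz=$1
    ram_mb=$2
    disk_mb=$3
    status=0

    if [ "$cpu_mhz" -lt 1000 ]; then
        echo "CPU: ${cpu_mhz} MHz is below the 1 GHz minimum"
        status=1
    fi
    if [ "$ram_mb" -lt 1024 ]; then
        echo "RAM: ${ram_mb} MB is below the 1 GB minimum"
        status=1
    fi
    if [ "$disk_mb" -lt 1024 ]; then
        echo "Disk: ${disk_mb} MB free is below the 1 GB minimum"
        status=1
    fi
    if [ "$status" -eq 0 ]; then
        echo "Minimum requirements met"
    fi
    return "$status"
}

# Example with hand-entered values: 2.4 GHz CPU, 4 GB RAM, 5 GB free disk.
check_hq_minimums 2400 4096 5000
```

Note that a Linux box sold as "1 GB" usually reports slightly less than 1024 MB of usable memory, so treat a near miss on the RAM check as a prompt to look closer rather than a hard failure.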
To install, run setup.sh -full and answer the prompts. Overall, it’s a straightforward installation. On a Linux system, start the server with hq-server.sh start. Later in the series, I’ll go into using databases other than the default. You can use Oracle or Postgres, but not MySQL. I’m a big MySQL fan, so I would like to see support for it added later. EnterpriseDB, being a Postgres-based database engine, is supported.
Now, on to the agent part of the installation… it requires touching the guests, and this is easily forgotten when you’re of the mindset that you can manage so much through VC. Some preparatory work is needed for the agent to operate properly on the ESX host. First among these is the creation of a user account (hqadmin is the default used by the agent) on the local machine. This account needs to have the admin-level role in ESX.
To install the agent:
(where x = the version number of the agent you’re installing)
You will get some prompts, most of them self-explanatory, about what sort of install you want to perform. I recommend saying yes to secure communications and using port 7443 instead of 7080 as the default port. When you are prompted for the user name, use the account you created earlier.
Configuring ESX3 to report to the HQ server requires some modification of the firewall. It’s easily accomplished with a couple of commands:
esxcfg-firewall --openPort 7443,tcp,out,HypericHQAgent
esxcfg-firewall --openPort 2144,tcp,in,HypericHQAgent
Note that if you selected the default port (7080) when you set up the agent, rather than the SSL port of 7443, you will have to use that port number. Again, I recommend using 7443 for secure communications.
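To keep the outbound port in step with whatever you chose during the agent setup, the two firewall commands can be generated from a single value. The sketch below only prints the commands rather than running them, so you can review them first; `hq_firewall_cmds` is a hypothetical name, and the default of 7443 simply reflects my recommendation above.

```shell
#!/bin/sh
# Hypothetical helper: prints (does not run) the two esxcfg-firewall commands
# for the agent. The optional first argument is the port the agent uses to
# reach the HQ server; it defaults to 7443 (SSL). Pass 7080 if you kept the
# non-SSL default during the agent setup.
hq_firewall_cmds() {
    server_port=${1:-7443}

    # Reject anything that is not a plain number.
    case "$server_port" in
        ''|*[!0-9]*)
            echo "port must be numeric: $server_port" >&2
            return 1
            ;;
    esac

    echo "esxcfg-firewall --openPort ${server_port},tcp,out,HypericHQAgent"
    echo "esxcfg-firewall --openPort 2144,tcp,in,HypericHQAgent"
}

# Print the commands for the SSL port.
hq_firewall_cmds 7443
```

Once the output looks right, paste the printed lines into the ESX service console as root.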
Once the host has the agent installed, you can install agents on the guests (virtual machines) in the same fashion. When these agents are installed, their descriptor in the Hyperic management console will indicate which host they belong to.
The VMware-specific monitoring information covers a lot of VM- and host-specific functions on ESX hosts. The following, taken straight from Hyperic’s documentation, lists them:
VMware Monitoring Specification
- General Server Metrics (CPU used, Total Memory Used, etc.)
- Memory Available for VMs
- Memory Used by VMs
VMware ESX 2.x and 3.x VM NIC Metrics
- Packets Transmitted
- Packets Transmitted per Minute
- Packets Received
- Packets Received per Minute
- Bytes Transmitted
- Bytes Transmitted per Minute
- Bytes Received
- Bytes Received per Minute
VMware ESX 2.x and 3.x VM Disk Metrics
- Reads per Minute
- Writes per Minute
- Bytes Read
- Bytes Read per Minute
- Bytes Written
- Bytes Written per Minute
VMware ESX 2.x and 3.x VM Metrics
- Process Virtual Memory Size
- Process Resident Memory Size
- Process Page Faults
- Process Page Faults per Minute
- Process Cpu System Time
- Process Cpu System Time per Minute
- Process Cpu User Time
- Process Cpu User Time per Minute
- Process Uptime
- Process Cpu Total Time
- Process Cpu Total Time per Minute
- Process Cpu Usage
- VM Cpu Wait
- VM Cpu Wait per Minute
- VM Cpu Used
- VM Cpu Used per Minute
- VM Cpu Sys
- VM Memory Shares
- VM Memory Minimum
- VM Memory Maximum
- VM Memory Size
- VM Memory Ctl
- VM Memory Swapped
- VM Memory Shared
- VM Memory Active
- VM Memory Overhead
- VM Uptime
Most of these have a default report interval of ten minutes, though some of the more critical and/or volatile metrics report every five minutes. Most of the ESX host reporting and all of the VM Disk and NIC reporting are on ten-minute timers.
This opens up some unique operational opportunities in managing virtual desktops as well as servers: you can proactively monitor individual workstations the way most enterprises already monitor servers, and keep system faults from turning into productivity-impacting problems and helpdesk tickets.
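For a rough feel of what those intervals mean in data volume, here is a back-of-the-envelope calculation of how many samples a single metric produces per day. It is plain arithmetic, not anything taken from Hyperic, and `samples_per_day` is a made-up name.

```shell
#!/bin/sh
# Hypothetical helper: number of samples one metric produces per day for a
# given report interval in minutes (a day has 1440 minutes).
samples_per_day() {
    interval_min=$1
    if [ "$interval_min" -le 0 ]; then
        echo "interval must be a positive number of minutes" >&2
        return 1
    fi
    echo $(( 1440 / interval_min ))
}

samples_per_day 10   # ten-minute metrics
samples_per_day 5    # five-minute metrics
```

A ten-minute metric yields 144 samples a day and a five-minute metric 288, which is worth keeping in mind when you multiply by the number of metrics and VMs being watched.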
That should be enough for now… more in later posts in this series, complete with some screenshots.