The Virtualization Room

Aug 12 2008   11:46AM GMT

Major ESX bug plagues thousands of VMware customers

Eric Siebert

A bug in the latest versions of both VMware ESX and ESXi (3.5 Update 2) has affected many of VMware’s customers, and VMware is asking its users to wait 36 hours for a patch.

As the date changed to August 12, 2008, customers found that they could no longer start virtual machines on their ESX hosts or vMotion them to other hosts.

A post about the bug on the VMware Technology Network (VMTN) community drew responses from many customers who were experiencing the same problem and had spent hours trying to figure out what was wrong. The problem was not immediately obvious to most because the error displayed said only that a general system error had occurred; the actual error, which could be found by going through the virtual machine log files, was that the product had expired. Many users contacted support, who eventually figured out they had a major issue on their hands.
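For readers facing a similarly generic error, the log search described above can be sketched in a few lines of Python. This is only an illustration: the function name, the search term "expired" and the sample log text are all made up for this example and are not taken from real ESX log output.

```python
def find_expiry_lines(log_text, needle="expired"):
    """Return (line_number, line) pairs for every line that mentions needle.

    The default needle is an assumption; the article only says the logs
    reported that the product had expired.
    """
    hits = []
    for number, line in enumerate(log_text.splitlines(), start=1):
        # Case-insensitive match so "Expired" and "expired" both count.
        if needle.lower() in line.lower():
            hits.append((number, line.strip()))
    return hits


# Made-up log excerpt for illustration only, not real ESX output.
SAMPLE_LOG = """\
power on requested
this product has expired
a general system error occurred
"""

if __name__ == "__main__":
    for number, line in find_expiry_lines(SAMPLE_LOG):
        print(f"line {number}: {line}")
```

The point of the sketch is simply that the message shown to the user and the message in the log can differ, so searching the log for the underlying cause is worth doing early.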

Currently, the only workaround is to set the host clock back and restart the virtual machines; however, this workaround is not acceptable for many customers who rely on accurate time for their systems and applications as well as to satisfy compliance regulations. Virtual machines that are already running are not affected by this bug unless they are rebooted or powered off and back on.

The bug appears to be code left over from the beta version of ESX, which was designed to make the product stop working on a specific date after the beta had ended. This is commonly done by software vendors and is known as “time bombing”: software stops working past a certain date and users are forced to use the latest gold version instead of continuing to use the beta version.
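To make the idea concrete, here is a minimal sketch of what a time bomb can look like. It is not VMware's code; the names, the exact cutoff and the error text are assumptions chosen for illustration. The article only establishes that hosts began failing as the date changed to August 12, 2008, and that the failure showed up when virtual machines were started.

```python
from datetime import datetime, timezone

# Hypothetical cutoff for illustration; the real check and its exact
# timestamp are not described in the article.
EXPIRY = datetime(2008, 8, 12, tzinfo=timezone.utc)


class ProductExpired(Exception):
    """Raised when the build's time bomb has gone off."""


def check_time_bomb(now, expiry=EXPIRY):
    """Refuse to continue once the clock reaches the expiry date."""
    if now >= expiry:
        raise ProductExpired("This product has expired.")


def power_on(vm_name, now):
    """Hypothetical power-on path.

    The expiry check runs only here, which is why a VM that is already
    running keeps running until it is powered off and back on.
    """
    check_time_bomb(now)
    return f"{vm_name} powered on"


if __name__ == "__main__":
    before = datetime(2008, 8, 11, 23, 59, tzinfo=timezone.utc)
    after = datetime(2008, 8, 12, 0, 0, tzinfo=timezone.utc)
    print(power_on("vm1", before))
    try:
        power_on("vm1", after)
    except ProductExpired as err:
        print(f"power on failed: {err}")
```

Because the check compares against the host clock, passing an earlier time makes it succeed again, which mirrors the set-the-clock-back workaround. It also shows why this kind of check has to be removed or disabled before a gold release: nothing in normal testing trips it until the date arrives.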

VMware has published a knowledgebase article on this issue and promises to release a fix within 36 hours. For most customers this is not enough; waiting 36 hours is much too long for a problem of this magnitude. They are looking for an immediate fix that they can apply to their affected hosts. Additionally, there is concern about how the fix will be delivered. Presumably it will be released as a new build of ESX, which will require ESX hosts to be taken offline while it is installed and the hosts are rebooted.

Many customers posting to the VMTN thread have expressed anger and frustration at VMware for this. To make matters worse and further frustrate users, VMware’s knowledgebase went offline shortly after the document was published, presumably because it could not handle the extraordinary number of requests.

It is hard to believe a company the size of VMware could allow this to happen. Something like this could not have been picked up in beta testing, and it is not so much a bug as negligence on VMware’s part in not removing or disabling this code before it was released as the gold version. Most software companies have strict processes for developing, testing and performing quality assurance before releasing a new build. How something like this could happen is anyone’s guess right now, but it appears that either those processes do not exist or they were simply not followed.

In the meantime, customers continue to wait for VMware to release a fix for this. Because of the severity and the effect on so many customers, there will most likely be some type of fallout at VMware over this. VMware needs to assure customers that it is taking this very seriously and is committed to doing everything possible to ensure that this never happens again. With Hyper-V now a viable alternative, VMware can’t afford major mistakes like this.

4  Comments on this Post

  • Eric Siebert
    Shocking. VMware competitors could do no better with industrial espionage. Unless they just did.
  • Rick Vanover
    I am very curious how the resolution will work for existing installations of ESX 3.5 U2 / VC 2.5 U2. Meaning, will they be able to correct it so that DRS and HA can be enabled immediately, or will a shutdown of the VMs be required?
  • Eric Siebert
    Patch Available: http://kb.vmware.com/selfservice/dynamickc.do?externalId=1006721&sliceId=1&command=show&forward=nonthreadedKC&kcId=1006721
  • Eric Siebert
    Hi Rick, VMware explored various ways of deploying the patch without shutting down VMs on the host but was never able to come up with one. In order to install the patch you need to shut down the VMs or vMotion them to another host, put the host in maintenance mode and install the express patch that was released. There is a workaround of sorts to get vMotion to work by changing the clock on one host, which I mentioned in this thread: http://itknowledgeexchange.techtarget.com/virtualization-pro/vmware-releases-emergency-patch-for-esx-35-update-2-bug/