Infrastructure 2.0 Blog

Oct 6 2010   3:26PM GMT

The Latest Victim of Human Error and Network Device Misconfiguration: Microsoft



Posted by: Guest Author
Networking

Today’s guest post is by Matt Gowarty.

Last week, Robert McMillan of IDG News wrote about a recent incident in which a human error during a change left Microsoft exposed to unwanted consequences. According to Microsoft, “We have completed our investigation and found that two misconfigured network hardware devices in a testing lab were compromised due to human error. Those devices have been removed.” The change or changes that caused the misconfigurations opened up vulnerabilities that led to spam being sent through the Microsoft equipment. That is quite embarrassing for a company that has been stepping up security over the past several years amid all of the negative publicity tied to hacks of the popular Internet Explorer Web browser.

The take-away, to this blogger, is loud and clear: companies, even those of Microsoft’s ilk, are incredibly vulnerable if they are behind the times and still rely on spreadsheets and change logs to manage change within the IT department. (Admittedly, not all companies will be the target of malicious attacks on the scale that Microsoft faces.)

Given the “human error” comment in Microsoft’s statement, it appears this vulnerability was an unintended consequence of an authorized network administrator making a mistake. The statement does not say whether the modification occurred during an approved change window or was an unplanned change made by an authorized user.

The simple truth: with hundreds or thousands of changes occurring every month, it is extremely difficult to keep up with each individual change and, more importantly, to manage those changes and ensure they follow best practices and stay within compliance mandates or gold standard templates.
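To make the "gold standard template" idea concrete, here is a minimal sketch in Python. It assumes a template is nothing more than a set of configuration lines that must be present and a set that must never appear. The names `GoldTemplate` and `check_compliance`, the hostname, and the sample configuration lines are hypothetical illustrations, not any vendor's tooling.

```python
from dataclasses import dataclass, field


@dataclass
class GoldTemplate:
    """Hypothetical template: lines a config must contain and must not contain."""
    required: set[str] = field(default_factory=set)
    forbidden: set[str] = field(default_factory=set)


def check_compliance(config_text: str, template: GoldTemplate) -> dict[str, list[str]]:
    """Return required lines that are missing and forbidden lines that are present."""
    # Normalize the config: strip surrounding whitespace and skip blank lines.
    lines = {line.strip() for line in config_text.splitlines() if line.strip()}
    return {
        "missing": sorted(template.required - lines),
        "forbidden_present": sorted(template.forbidden & lines),
    }


# Illustrative template and device config (assumed values, not from the incident).
template = GoldTemplate(
    required={"service password-encryption", "no ip http server"},
    forbidden={"snmp-server community public RO"},
)

running_config = """
hostname lab-switch-01
service password-encryption
snmp-server community public RO
"""

report = check_compliance(running_config, template)
print(report)
```

Run against every device on a schedule, a check like this flags drift from the template no matter how the change got there. Exact line matching is a deliberate simplification; real configurations usually need context-aware parsing.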

A huge mistake organizations make today is assuming that a change process is all they need to ensure safe and correct changes throughout the organization. A normal change process would include steps like:

  • A device is determined to need a change for any number of reasons
  • A change request is submitted
  • A person or panel reviews each request and either edits or accepts it
  • The planned change is placed in a ticketing system and assigned a time during an approved maintenance window
  • The change is implemented and documented
  • The ticket is closed
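The steps above can be sketched as a small state machine. This is only an illustration, assuming each ticket moves strictly forward through the stages in the list; the names `State`, `TRANSITIONS`, and `ChangeTicket` are hypothetical and do not come from any particular ticketing system.

```python
from enum import Enum


class State(Enum):
    IDENTIFIED = "identified"    # a device is determined to need a change
    REQUESTED = "requested"      # a change request is submitted
    APPROVED = "approved"        # reviewer or panel accepts (possibly after edits)
    SCHEDULED = "scheduled"      # assigned to an approved maintenance window
    IMPLEMENTED = "implemented"  # change made and documented
    CLOSED = "closed"            # ticket closed


# Allowed transitions, mirroring the order of the steps in the list.
TRANSITIONS = {
    State.IDENTIFIED: {State.REQUESTED},
    State.REQUESTED: {State.APPROVED},
    State.APPROVED: {State.SCHEDULED},
    State.SCHEDULED: {State.IMPLEMENTED},
    State.IMPLEMENTED: {State.CLOSED},
    State.CLOSED: set(),
}


class ChangeTicket:
    def __init__(self, device: str):
        self.device = device
        self.state = State.IDENTIFIED
        self.history = [State.IDENTIFIED]

    def advance(self, new_state: State) -> None:
        """Move to new_state, refusing any step that skips the process."""
        if new_state not in TRANSITIONS[self.state]:
            raise ValueError(
                f"{self.device}: cannot go from {self.state.value} to {new_state.value}"
            )
        self.state = new_state
        self.history.append(new_state)


ticket = ChangeTicket("lab-router-01")
ticket.advance(State.REQUESTED)
try:
    ticket.advance(State.IMPLEMENTED)  # skips review and scheduling
except ValueError as err:
    print(err)
```

A workflow like this only governs the ticket. Nothing in it stops someone from logging in to the device and making the change anyway, which is exactly the gap the rest of this post is about.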

While a change management process is critical for reducing risk, it rests on three major assumptions, each with a huge potential impact on the organization and the network infrastructure. And we all know what our mothers taught us about assuming.

  1. No change is ever made that doesn’t follow the process
  2. The review person or panel has the expertise to catch every potential suboptimal configuration (such as how a change in one device could potentially impact a network neighbor along a service path)
  3. The actual change will be implemented correctly, i.e., with no human error such as a “fat-fingered” entry or an incorrect copy and paste
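The first and third assumptions can be checked mechanically. The sketch below is a minimal illustration, assuming you already have a record of approved maintenance windows and some way to observe when a device's configuration changed (for example, periodic polling; that part is out of scope here). All class names, function names, device names, and dates are hypothetical.

```python
import difflib
from dataclasses import dataclass
from datetime import datetime


@dataclass
class ApprovedChange:
    device: str
    window_start: datetime
    window_end: datetime


@dataclass
class ObservedChange:
    device: str
    timestamp: datetime


def find_out_of_process(observed, approved):
    """Assumption 1: return observed changes not covered by any approved window."""
    return [
        o for o in observed
        if not any(
            a.device == o.device and a.window_start <= o.timestamp <= a.window_end
            for a in approved
        )
    ]


def diff_planned_vs_actual(planned: str, actual: str) -> list[str]:
    """Assumption 3: return lines that differ between the planned and actual config."""
    diff = difflib.ndiff(planned.splitlines(), actual.splitlines())
    # Keep only removed ("- ") and added ("+ ") lines; drop unchanged and hint lines.
    return [line for line in diff if line.startswith(("- ", "+ "))]


approved = [
    ApprovedChange("lab-router-1", datetime(2010, 10, 2, 1, 0), datetime(2010, 10, 2, 3, 0)),
]
observed = [
    ObservedChange("lab-router-1", datetime(2010, 10, 2, 2, 15)),   # inside the window
    ObservedChange("lab-router-2", datetime(2010, 10, 2, 14, 30)),  # no ticket at all
]

for change in find_out_of_process(observed, approved):
    print("out of process:", change.device, change.timestamp)

# A fat-fingered extra zero shows up as one removed and one added line.
for line in diff_planned_vs_actual("vlan 10", "vlan 100"):
    print(line)
```

The second assumption, reviewer expertise, is harder to automate. Rule-based checks against a template can help, but they only catch the problems someone thought to write a rule for.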

In the Microsoft example above, the vulnerability could have been caused by a failure of any one of these assumptions, or by any combination of two or more. That is assuming the issue was an inadvertent mistake. If the change was instead intended to cause harm, the risk and the difficulty of finding it suddenly grow, because there would be no documentation and the modified configuration would be deliberately hidden.

After-the-fact forensics is never the ideal driver for instilling a change management process and policy. In our next post, we’ll detail the best practices of change management that need to be implemented before the breach.
