Today’s guest post from Matt Gowarty.
Yesterday, I discussed Microsoft’s very public and damaging experience of “human error” in a routine network change. But the real challenge for Microsoft and virtually every other organization is the sheer number and complexity of the changes and configurations that must be managed across the entire network. If every organization had unlimited staff, resources and time, it could be done manually, but as we all know, no organization has that luxury. So organizations must enhance their existing change processes with more automation and intelligence to reduce the risk of the kind of vulnerability that just hammered Microsoft.
There are several best practices to help organizations manage change better and reduce the risk of human error. The best practices include:
- Have a change process and follow it
- Use a “trust but verify” model: you trust that everyone follows the process, but verify by tracking every change on the network, both planned and unplanned
- Implement your best practices/standards/compliance policies, not just during installation but through ongoing management that detects violations
- Proactively monitor change and configuration 24×7 to catch problems and hard-to-find issues
As I talk to organizations across the world, virtually every IT organization has implemented the first best practice and has a change process in place. But day-to-day use of the next three drops radically. Many organizations assume that no changes occur outside of maintenance windows or that everything is documented perfectly, but in reality, we all know there are outliers all of the time. Again, virtually every organization has best practices or gold standards, but only thinks about them during the initial install. Staff are too busy to check device by device for violations, and “configuration creep” is bound to cause inconsistency. And finally, proactive management has the lowest adoption today: most IT organizations are so busy doing everything else that they wait for a problem to occur and then go into troubleshooting and firefighting mode.
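To make the “gold standard” idea concrete, here is a minimal Python sketch of an ongoing compliance check that could run against every saved device configuration instead of only at install time. The rule lists, device names and function names are all hypothetical and chosen for illustration; a real policy would be far larger and vendor-specific.

```python
import re

# Hypothetical gold standard (illustrative only): lines every device
# configuration must contain, and patterns that must never appear.
REQUIRED_LINES = ["service password-encryption", "no ip http server"]
FORBIDDEN_PATTERNS = [r"^snmp-server community public\b",
                      r"^transport input telnet$"]


def find_violations(config_text):
    """Return a list of gold-standard violations for one device config."""
    lines = [line.strip() for line in config_text.splitlines()]
    violations = []
    for required in REQUIRED_LINES:
        if required not in lines:
            violations.append(f"missing required line: {required}")
    for pattern in FORBIDDEN_PATTERNS:
        for line in lines:
            if re.search(pattern, line):
                violations.append(f"forbidden line present: {line}")
    return violations


def audit(configs):
    """Map device name to its violations, listing only non-compliant devices."""
    report = {}
    for device, text in configs.items():
        violations = find_violations(text)
        if violations:
            report[device] = violations
    return report
```

Run on a schedule against the latest configuration snapshots, a check along these lines would surface configuration creep without anyone having to inspect devices one by one.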
The good news is these challenges can be addressed through automation, intelligence and control solutions. NetMRI is a leading solution for helping organizations take more control and automate network configuration, change and compliance management across the entire network. Automation, control and intelligence can help with aspects such as:
- Limiting human error by automating changes
- Detecting planned and unplanned modifications by identifying every change
- Ensuring changes do not violate compliance standards or internal best practices
- Leveraging intelligence to find suboptimal configurations before end users are impacted or vulnerabilities are exploited
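As a rough illustration of the second point, the sketch below compares two configuration snapshots and labels a detected change as planned or unplanned depending on whether it falls inside an approved change window. This is not how NetMRI works internally; the function names and the shape of the approved-change records are assumptions made for the example, and only Python's standard `difflib` and `datetime` modules are used.

```python
import difflib
from datetime import datetime


def changed_lines(old_config, new_config):
    """Return lines removed ('- ') or added ('+ ') between two snapshots."""
    diff = difflib.ndiff(old_config.splitlines(), new_config.splitlines())
    return [line for line in diff if line.startswith(("+ ", "- "))]


def classify_change(device, detected_at, approved_changes):
    """Label a change 'planned' if it falls in an approved window for the device.

    approved_changes is a hypothetical list of dicts with the keys
    'device', 'start' and 'end' (datetime objects).
    """
    for change in approved_changes:
        if (change["device"] == device
                and change["start"] <= detected_at <= change["end"]):
            return "planned"
    return "unplanned"


if __name__ == "__main__":
    old = "hostname edge1\nntp server 10.0.0.1\n"
    new = "hostname edge1\nntp server 10.0.0.2\n"
    approved = [{"device": "edge1",
                 "start": datetime(2024, 1, 10, 1, 0),
                 "end": datetime(2024, 1, 10, 3, 0)}]
    for line in changed_lines(old, new):
        print(line)
    # A change seen at 14:00 is outside the 01:00-03:00 window.
    print(classify_change("edge1", datetime(2024, 1, 10, 14, 0), approved))
```

Anything labeled unplanned is the “verify” half of trust but verify: a change that happened outside the process and deserves a closer look.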
While it’s impossible to eliminate every unintended consequence that could damage an enterprise, the potential risk and exposure can be greatly reduced by managing change, configuration and compliance better. Organizations spend huge sums on redundancy and backup plans, but over and over again, the vast majority of issues are caused by change. Organizations should step up and start to invest the time, people and resources to eliminate the number one cause of issues today.
If the Microsoft staff had seen an unplanned change right away, received an alert about a suboptimal configuration, or identified that a security best practice had been violated, could this problem have been fixed well before a hacker found a way in? No one knows for sure, but a betting man would say three potential risks could have been eliminated in a matter of minutes.