Working with clients in the range of about $800,000 to $5 million per year, I’ve seen that companies eventually need to make a change. I started out as more of a Seattle IT Cowboy than a Seattle IT Consultant. I thought IT was about fixing broken computers. As time went by, though, the companies grew and there was too much work for a lone cowboy. It became necessary to become a team player. What does an IT team look like? To answer that, I want to start with the IT support tactical roles and how they should interact. I like to use the term “healthy conflict.” This article covers the operational roles and the conflict between incident and problem management.
The IT tactical roles support the IT strategic roles by leveraging the “healthy conflict” strategy.
Incident – Failures on the network start with the incident management role. Incident management first determines whether the failure is a known or an unknown error. The majority of errors are known errors, and the incident management team handles them without much difficulty.
Problem – Problem management focuses on unknown errors. These errors require a much more in-depth understanding of the problem. Sometimes that means collecting the log file entries written while the error occurred. Sometimes it even means re-creating the error in a test environment of equivalent systems.
The most common “healthy conflict” in IT is between the incident management team and the problem management team.
The first priority of the incident team is to get the system up and running as quickly as possible, so that less productivity is lost. While the incident is being solved, a user, a set of users or even an entire company is down. While the system is down, the company is losing the productivity of both its employees and its business systems. The risk, though, is that without understanding the problem, the same failure may (and probably will) recur again and again until the root cause is understood.
For the problem manager, the priority is different. Instead of bringing the system up quickly, the problem management team’s first priority is root cause analysis. An incident manager can reboot the server, and if the server then functions the way it’s expected to, the incident manager’s job is done and the ticket is marked resolved. The server may fail again in an hour, a week or next year. The incident manager doesn’t care. Once the ticket is resolved, the incident manager moves on to the next incident. The problem manager’s job is to look at the “resolved” ticket and determine why the incident happened. If the problem has a known cause, the ticket is then marked closed. If the cause is unknown, the ticket goes into the problem manager’s queue for root cause analysis.
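For readers who think in code, the ticket flow described above can be sketched as a small state machine. This is only an illustration under my own assumptions: the names Status, Ticket, resolve and review are hypothetical, the sample tickets are made up, and a real ticketing system will have its own states and fields.

```python
from collections import deque
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Status(Enum):
    OPEN = "open"
    RESOLVED = "resolved"        # service restored by the incident manager
    CLOSED = "closed"            # cause is known, nothing left to investigate
    ROOT_CAUSE_ANALYSIS = "rca"  # waiting in the problem manager's queue


@dataclass
class Ticket:
    summary: str
    cause: Optional[str] = None  # None means the cause is unknown
    status: Status = Status.OPEN


def resolve(ticket: Ticket) -> Ticket:
    """Incident manager: the system is back up, so mark the ticket resolved."""
    ticket.status = Status.RESOLVED
    return ticket


def review(ticket: Ticket, problem_queue: deque) -> Ticket:
    """Problem manager: close known causes, queue unknown ones for analysis."""
    if ticket.status is not Status.RESOLVED:
        raise ValueError("only resolved tickets are reviewed")
    if ticket.cause is not None:
        ticket.status = Status.CLOSED
    else:
        ticket.status = Status.ROOT_CAUSE_ANALYSIS
        problem_queue.append(ticket)
    return ticket


# Hypothetical sample tickets, for illustration only.
problem_queue = deque()
known = review(resolve(Ticket("Printer offline", cause="paper jam")), problem_queue)
unknown = review(resolve(Ticket("Server rebooted to restore service")), problem_queue)
```

The point of keeping resolve and review as separate steps is the same as keeping the two roles separate: restoring service never decides, on its own, whether a ticket is finished.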
The root cause analysis process is a long forensic journey looking for hidden clues. The root cause could be a failing piece of hardware, a mistake in the software program, a misconfigured setting or even a mistake in the business process. The analysis may involve searching through log files, reviewing software code, and interviewing the incident manager and even the non-technical employees who discovered the problem.
What I’ve found, working the last 21 years as a consultant, is that this healthy conflict is essential for a well-run IT department. When these teams are separate people with opposing objectives, more quality work gets done. Without this conflict, more problems arise as one team or the other begins to dominate the leadership of the organization.