If a problem is the unknown cause of one or more incidents,
how can I design a repeatable model for something that is unknown?
The purpose of Problem Management is to manage problems
throughout their lifecycle. Problem Management seeks not only to minimize
the adverse effect of incidents by providing workarounds, but also to
eliminate outages and prevent them from recurring.
In Incident Management, ITIL defines an Incident Model as a
predefined set of procedures based on the type of incident. So then what is a “Problem Model”?
Problem Models
Not all problems are the same. There are many different types of problems
and each type will require unique roles and responsibilities, varied skill sets
and different timelines and policies based on the complexity of the problem. When considering how to design problem models,
consider the workflow required once the problem is identified.
Approach to Defining Problem Models
One approach is to classify the types of problems that occur
within your organization. Because
Problem Management requires both reactive and proactive activities, most
service providers could start with those two categories when considering models
for Problem Management.
Another approach is to ask how problems are
identified. Depending on where in the
lifecycle (Strategy, Design, Transition, or Operation) the problem is
identified, a different Problem Model may be required. Here are a few examples:
Lifecycle Problems
Example 1: A design problem was identified. Many incidents occurred because dynamic changes made throughout the app/dev environment and the production environment were not coordinated with the transition and service support teams early in the lifecycle.
Example 2: A glitch was found in software during testing of a major change. A decision is made to move forward with the change because of the business impact that a delay would cause. How will this known error be communicated and documented in problem management? Who, what, where, when, and how will problem management reduce the impact of incidents and prevent this from recurring? A problem model, along with procedures and policies, is required.
The people, skill sets, methods, and techniques used to
resolve these types of problems will require a different set of procedures, or
model, than operational problems such as:
Problem Models Involving Infrastructure Outages or Vendors
Example 3: The problem was identified
as a bad switch. The switch was swapped
out, but problem management would need a procedure with clearly defined roles
and responsibilities for how to prevent this in the future. While investigating, the problem management team
uncovered that a vendor had confirmed malfunctioning ports on switches that
were released into production.
Also, an operational break/fix model or a problem model for defects
will differ from an all-hands-on-deck Major Problem Model.
Therefore, you will have one problem management process but
will need many problem management models to process the varied types of
problems that run through that process.
Begin with a few generic models that will evolve over time. Unite and
coordinate these with other models as required for integration with incident
and change management processes.
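
For readers who think in code, the idea of one process with many models can be sketched as a simple lookup. This is only an illustrative sketch: the model names, roles, and steps below are hypothetical placeholders, not definitions taken from ITIL, and the two classification keys (reactive or proactive, and lifecycle stage) are simply the two approaches described above.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Dict, Tuple


class Trigger(Enum):
    REACTIVE = "reactive"
    PROACTIVE = "proactive"


class Stage(Enum):
    STRATEGY = "strategy"
    DESIGN = "design"
    TRANSITION = "transition"
    OPERATION = "operation"


@dataclass(frozen=True)
class ProblemModel:
    """A predefined set of roles and steps for one type of problem."""
    name: str
    roles: Tuple[str, ...]
    steps: Tuple[str, ...]


# Hypothetical generic models to start with; expected to evolve over time.
GENERIC_REACTIVE = ProblemModel(
    "generic-reactive",
    roles=("problem manager", "support team"),
    steps=("log problem", "investigate cause", "document workaround", "resolve"),
)
GENERIC_PROACTIVE = ProblemModel(
    "generic-proactive",
    roles=("problem manager", "service owner"),
    steps=("review trends", "log problem", "investigate cause", "resolve"),
)

# Hypothetical specific models, loosely mirroring the examples above.
KNOWN_ERROR_AT_TRANSITION = ProblemModel(
    "known-error-at-transition",
    roles=("problem manager", "change manager", "test lead"),
    steps=("record known error", "communicate to service desk", "plan fix"),
)
INFRASTRUCTURE_OR_VENDOR = ProblemModel(
    "infrastructure-or-vendor",
    roles=("problem manager", "infrastructure team", "vendor contact"),
    steps=("log problem", "engage vendor", "investigate cause", "resolve"),
)
MAJOR_PROBLEM = ProblemModel(
    "major-problem",
    roles=("problem manager", "senior management", "all technical teams"),
    steps=("convene team", "investigate cause", "resolve", "major problem review"),
)

# One process, many models: the process performs the lookup,
# and the selected model supplies the procedure to follow.
SPECIFIC_MODELS: Dict[Tuple[Trigger, Stage], ProblemModel] = {
    (Trigger.REACTIVE, Stage.TRANSITION): KNOWN_ERROR_AT_TRANSITION,
    (Trigger.REACTIVE, Stage.OPERATION): INFRASTRUCTURE_OR_VENDOR,
}
GENERIC_MODELS: Dict[Trigger, ProblemModel] = {
    Trigger.REACTIVE: GENERIC_REACTIVE,
    Trigger.PROACTIVE: GENERIC_PROACTIVE,
}


def select_model(trigger: Trigger, stage: Stage, major: bool = False) -> ProblemModel:
    """Pick the most specific model available, else fall back to a generic one."""
    if major:
        return MAJOR_PROBLEM
    return SPECIFIC_MODELS.get((trigger, stage), GENERIC_MODELS[trigger])


if __name__ == "__main__":
    model = select_model(Trigger.REACTIVE, Stage.OPERATION)
    print(model.name, model.roles, model.steps)
```

Starting with only the two generic entries and adding specific ones as patterns emerge matches the advice to begin with a few generic models and let them evolve.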
Efficient problem models will result in higher availability
of services, increased productivity (less chaos), reduced expenditure, and
a lower cost of firefighting and resolving incidents. All of that is
nominal in comparison to the customer and business confidence that
the service provider gains.