

Showing posts from May, 2014

Incidents and Problems

An incident is an unplanned interruption to an IT service or a reduction in the quality of an IT service, and Incident Management, which handles it, is strictly a reactive process. A problem, on the other hand, represents a different perspective on an incident: it is the underlying root cause, which might also be the cause of multiple other incidents. Incidents, however, do not always grow up to become problems. While Incident Management activities focus on restoring services to normal operations as quickly as possible, Problem Management activities determine the root cause, find the most effective and efficient permanent resolution and ultimately prevent the incident from happening again.

Problem Management can be both reactive and proactive. Proactive Problem Management identifies weaknesses in the environment before actual incidents occur. These can then be exploited as improvement opportunities. Reactive Problem Management addresses problems that were identified from one or more incidents. The pol

Problem, Incident and Change Management Integration

“Problem Management seeks to minimize the adverse impact of incidents and problems on the business that are caused by underlying errors within the IT infrastructure and to proactively prevent the recurrence of incidents related to those errors. In order to achieve this, Problem Management seeks to get to the root cause of incidents, document and communicate known errors and to initiate actions to improve or correct the situation.”

Given that this statement comes directly from the ITIL Best Management Practices text, it’s a wonder more organizations don’t have well-integrated Problem, Incident and Change processes. I never want to say that there is a single silver-bullet solution for a given problem, and I’m not suggesting that here. However, having a solid CMS (Configuration Management System) is a good step in the right direction. Of course, before we even think of tools we must have rules. Thinking holistically we can create an integrated set of best p

Problem Management for Newbies (Part 2 of 2)

In part one of “Problem Management for Newbies” we looked at reactive Problem Management and how Problem Management can serve as a pillar of support to Incident Management. Problem Management works to prevent future incidents and problems, eliminate recurring ones and minimize their impact. There will always be a need for reactive Problem Management. IT support can never guarantee that there will be no outages, and it will always need clearly defined roles, skilled staff and governance for the resolution of incidents and problems when they occur. The added value to the business comes from proactive Problem Management!

Proactive Problem Management

Proactive Problem Management gleans management information from the service desk function and from others across the organization. By viewing and analyzing reports on frequency of incidents, types of incidents, noting the times that incidents and problems occur and most importantly understanding the bu

Problem Management for Newbies! Part 1 of 2

Getting Started with Problem Management

To understand the process of Problem Management, one must first understand that a problem is distinctly different from an incident. A problem is tracked and recorded separately, and Problem Management requires a very different skill set and has a different objective than Incident Management. Problem records are unique entities and are reported upon separately. A repeatable, lean Problem Management process could very well be the glue that helps IT service providers integrate and automate much of the work and effort required to prevent and eliminate incidents and to minimize their impact on your business and end-user customers. While an incident is an unplanned interruption that creates an impact to one or more business services, the problem is actually the cause of one or more incidents.

Example: “I can’t access the ERP system”, “The web portal will not come up!” and “I can’t log in” are all examples of incidents. The c