Event Management: Reactive to Proactive

Many students have asked me how to move from the role of firefighter, resolving incidents, to the role of preventing them from occurring in the first place. Much of this comes down to good design and a strong proactive problem management process, but a solid event management process is also an excellent offensive weapon for preventing business-impacting incidents in your environment.

Event Management is the process that gives IT the ability to detect events, make sense of them and determine the appropriate control action. It is the basis for operational monitoring and control, and it gives us a way to compare actual performance against what was designed and written into SLAs. What makes event management so useful is that we can apply it to any aspect of our environment: the delivery of a service, the status of an individual CI, environmental conditions, or software license usage.
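As a rough illustration of comparing actual performance against an SLA, here is a minimal Python sketch. The `SlaTarget` class, the metric name and the threshold values are all hypothetical, chosen only for the example; a real monitoring tool would supply its own.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SlaTarget:
    """Hypothetical SLA target for a metric where lower values are better."""
    metric: str
    warning_at: float  # value at which we are approaching the agreed limit
    breach_at: float   # value at which the agreed limit is breached


def compare_to_sla(target: SlaTarget, measured: float) -> str:
    """Compare a measured value with the designed/agreed target."""
    if measured >= target.breach_at:
        return "breach"
    if measured >= target.warning_at:
        return "approaching"
    return "within target"


# Assumed example: response time agreed at no more than 2.0 seconds,
# with an early warning at 1.5 seconds.
response_time = SlaTarget(metric="response_time_s", warning_at=1.5, breach_at=2.0)
print(compare_to_sla(response_time, 1.7))
```

The early "approaching" result is what gives the appropriate team time to act before the service level is actually missed.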

In conjunction with the other Service Management processes, and with both passive and active monitoring tools, Event Management can indicate a change in the status of a CI, allowing an early response from the appropriate person or team. This enhances our ability to act proactively to prevent exceptions or incidents and to ensure that we can deliver the desired business outcomes without interruption. Event Management also provides the foundation for automated operations, increasing effectiveness and efficiency by freeing more expensive human resources for more complex tasks, such as finding ways to create a competitive advantage for the business.

Event management does not begin the day we go live with a new or changed service. In the Design stage of the SM lifecycle we identify the events we want to detect and define the notifications for them. Is it regular operation? Is it something unusual, but not exceptional? Or could it be some type of exception? In Transition we build and test these notifications and the tools we will use to generate them, and we define roles and responsibilities. In Operation we carry out the following activities:
  • Event detection & filtering
  • Determine significance: Informational, Warning or Exception
  • Correlation: Determine the response based on a set of predefined rules
  • Triggers: Mechanism used to initiate a response
  • Response selection: log the event, run an automated response, or raise an alert for human intervention; where needed, open an RFC, an incident or a problem
  • Review actions: confirm events were handled correctly, track trends or counts
  • Close event
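
The activities above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the metric names, thresholds and responses are hypothetical, and a real event management tool would define its own rules.

```python
from collections import Counter
from dataclasses import dataclass
from enum import Enum


class Significance(Enum):
    INFORMATIONAL = "informational"
    WARNING = "warning"
    EXCEPTION = "exception"


@dataclass(frozen=True)
class Event:
    ci: str       # configuration item that raised the event
    metric: str
    value: float


# Hypothetical thresholds per metric: (warning_at, exception_at)
THRESHOLDS = {"disk_used_pct": (80.0, 95.0), "cpu_pct": (85.0, 98.0)}

# Hypothetical correlation rules mapping significance to a response
RESPONSES = {
    Significance.INFORMATIONAL: "log only",
    Significance.WARNING: "alert operations team",
    Significance.EXCEPTION: "open incident",
}


def is_relevant(event: Event) -> bool:
    """Filtering: ignore events for metrics we chose not to monitor."""
    return event.metric in THRESHOLDS


def classify(event: Event) -> Significance:
    """Determine significance from the predefined thresholds."""
    warning_at, exception_at = THRESHOLDS[event.metric]
    if event.value >= exception_at:
        return Significance.EXCEPTION
    if event.value >= warning_at:
        return Significance.WARNING
    return Significance.INFORMATIONAL


def handle(event: Event, counts: Counter):
    """Filter, classify, count for later review, and select a response."""
    if not is_relevant(event):
        return None
    significance = classify(event)
    counts[(event.ci, significance)] += 1  # supports trend and count reviews
    return RESPONSES[significance]


counts = Counter()
print(handle(Event("srv01", "disk_used_pct", 85.0), counts))
```

Keeping the thresholds and responses in plain tables, separate from the logic, mirrors the idea that these rules are defined in Design and tested in Transition, long before the first event arrives in Operation.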

By implementing these activities we can begin to proactively monitor availability, reliability, capacity and overall performance, and move our organizations from a position of reaction to one of prevention.
