The Best of Service Operation, Part 4

Event Management Activities
Originally Published on November 9, 2010

In an earlier blog I was asked how to move from a reactive organization to a proactive one. My answer was through the use of Event Management, along with good design and proactive Problem Management. In this installment I would like to speak to the activities within Event Management and the role they play in our ability to deliver a consistent level of service over a stable infrastructure.

By definition, an event is any detectable or discernible occurrence that has significance for the management of the IT infrastructure or the delivery of IT services. Event Management is the process that monitors all events that occur throughout the IT infrastructure to allow for normal operation and to detect and evaluate the impact any deviation might have on the IT infrastructure or the delivery of IT services. Event Management has several activities that we engage in when implementing this process:
  • Event Notification: In the design stage (through engaging all stakeholders) we identify the events we want to detect for each CI, and we define and document meaningful notification data and the associated roles and responsibilities. CIs can communicate status information in two ways: a management tool polls the device, or the CI generates a notification when certain conditions are met. Notification types include regular operation, unusual but not exceptional operation, and an exception.
  • Event Detection: Events can be detected by an agent on the same system or transmitted to an event management tool.
  • Event Filtering: This is where first-level correlation is performed. An agent on the CI determines the significance of the event and whether it is informational, a warning or an exception. If no action is required, the event is simply logged.
  • Event Correlation: If the event is significant, an appropriate response is determined. This is done by a correlation engine, which is part of a management tool. The engine compares the event with a specific set of criteria in a prescribed way and then determines a response based on a set of predefined rules.
  • Triggers: If correlation recognizes an event, some response will be required, and that response is initiated by a trigger. Each trigger is designed specifically for the task it initiates. Examples: Incident triggers, Problem triggers and Change triggers.
  • Response selection: At this point in the process a number of response options are available, and they can be chosen in any combination: the event can be logged, an auto response can be initiated, or an alert can be sent to prompt some type of human intervention.
  • Review actions: Check that significant events or exceptions have been handled appropriately. Track trends or counts of event types. Reviews should not duplicate any actions taken if an incident, problem or change has been initiated.
  • Close event: Informational events are logged and passed to other processes. Events that generate activity in other processes (incident, problem, change) are closed by those processes.
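The filtering, correlation and trigger steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not how any particular event management tool works: the `Event` fields, the `THRESHOLDS` table, the `RULES` mapping and the trigger names are all hypothetical.

```python
from dataclasses import dataclass
from enum import Enum


class Significance(Enum):
    INFORMATIONAL = "informational"
    WARNING = "warning"
    EXCEPTION = "exception"


@dataclass
class Event:
    ci: str        # configuration item that raised the event
    metric: str    # what was measured, e.g. "disk_used_pct"
    value: float   # the measured value


# Hypothetical thresholds per metric: (warning_at, exception_at).
THRESHOLDS = {"disk_used_pct": (80.0, 95.0)}

# Hypothetical correlation rules: which trigger fires for which significance.
RULES = {
    Significance.WARNING: "alert_operator",
    Significance.EXCEPTION: "incident_trigger",
}


def filter_event(event: Event) -> Significance:
    """Event Filtering: classify the event's significance."""
    no_limit = (float("inf"), float("inf"))
    warning_at, exception_at = THRESHOLDS.get(event.metric, no_limit)
    if event.value >= exception_at:
        return Significance.EXCEPTION
    if event.value >= warning_at:
        return Significance.WARNING
    return Significance.INFORMATIONAL


def handle(event: Event, log: list) -> str:
    """Log every event, then select a response from the predefined rules."""
    significance = filter_event(event)
    log.append((event, significance))
    # Informational events have no rule, so they are logged and nothing more.
    return RULES.get(significance, "log_only")
```

With these assumed thresholds, a disk reading of 50% is only logged, 85% alerts an operator, and 97% fires the incident trigger. Keeping the rules in a table rather than in code is one way to let the rule set be reviewed and tuned without touching the engine.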
By creating the correct level of filtering, we create an early entry point for the other service management processes. Through the use of both passive and active monitoring, we can automate many aspects of the management of our environment, leaving human resources to focus on the more complex issues facing our organizations.
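As a rough illustration of the two monitoring styles, the sketch below contrasts active monitoring (the tool interrogates the CI) with passive monitoring (the tool waits for the CI to report). The `poll` function, the `NotificationInbox` class and the status strings are hypothetical names chosen for this example.

```python
from typing import Callable, List, Optional, Tuple


def poll(read_status: Callable[[], str], expected: str = "up") -> Optional[str]:
    """Active monitoring: ask the CI for its status and report any deviation."""
    status = read_status()
    if status == expected:
        return None  # normal operation, nothing to report
    return f"status is {status!r}, expected {expected!r}"


class NotificationInbox:
    """Passive monitoring: collect notifications that CIs send on their own."""

    def __init__(self) -> None:
        self.received: List[Tuple[str, str]] = []

    def notify(self, ci: str, message: str) -> None:
        # Called by (or on behalf of) the CI when a condition is met.
        self.received.append((ci, message))
```

In practice the two are usually combined: polling catches a CI that has gone silent, while notifications report conditions as soon as the CI detects them.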
