Component Failure Impact Analysis (CFIA) is a proactive availability management technique developed by IBM in the 1970s. It lets us predict the impact on our services if an individual component fails, and it exposes our vulnerability to single points of failure.
Doing a CFIA is a pretty simple exercise. Here are the steps:
- Take certain key Configuration Items (CIs) in the infrastructure and identify the services they support by researching the Configuration Management System (CMS). If you do not have a CMS, look for paper diagrams, network configurations, any other available documentation, and general knowledge.
- Create a table or spreadsheet, on paper or electronically. List the CIs in the first column and the services in the top row. For every CI, place an “X” in the column below the service if that CI’s failure would cause an outage. Mark an “A” instead when the CI has an immediate backup (hot start), or a “B” when it has a backup that needs a warm start. Leave the cell blank if the CI’s failure would not affect the service.
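The table described in the steps above can also be built in a few lines of code. The following is only a minimal sketch in Python: the CI names, the service names, and the helper functions `build_cfia_matrix` and `format_matrix` are hypothetical and invented for illustration. In practice the dependency data would come from your CMS or documentation.

```python
# Hypothetical services and CIs, for illustration only.
SERVICES = ["Email", "Payroll", "Web Shop"]

# Each CI maps the services it supports to a mark:
#   "X" = failure of the CI causes an outage of the service
#   "A" = the CI has an immediate backup (hot start)
#   "B" = the CI has a backup that needs a warm start
# A service that is not listed is unaffected by the CI.
DEPENDENCIES = {
    "Core Switch": {"Email": "X", "Payroll": "X", "Web Shop": "X"},
    "Mail Server": {"Email": "A"},
    "DB Server": {"Payroll": "B", "Web Shop": "X"},
    "Web Server": {"Web Shop": "A"},
}


def build_cfia_matrix(dependencies, services):
    """Return one row per CI: [ci_name, mark_for_service_1, ...]."""
    rows = []
    for ci, marks in dependencies.items():
        # A blank cell means the CI does not affect that service.
        rows.append([ci] + [marks.get(service, "") for service in services])
    return rows


def format_matrix(rows, services):
    """Render the matrix as aligned plain text with a header row."""
    table = [["CI"] + services] + rows
    widths = [max(len(row[i]) for row in table) for i in range(len(table[0]))]
    lines = []
    for row in table:
        cells = [cell.ljust(width) for cell, width in zip(row, widths)]
        lines.append(" | ".join(cells).rstrip())
    return "\n".join(lines)


matrix = build_cfia_matrix(DEPENDENCIES, SERVICES)
print(format_matrix(matrix, SERVICES))
```

A row filled with “X” marks, such as the hypothetical core switch here, is the kind of CI that deserves attention first.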
The next step is to answer the following questions about the potential failure of the configuration item:
- Is this CI a single point of failure?
- What is the business impact if this service fails? How many people will be affected?
- What is the cost of unavailability for this service?
- How likely is a failure to occur?
- Are we willing to accept the risk of this service failing?
- What can we do to reduce our vulnerability?
- Should we look at CI redundancy options? What would this cost?
- Is this cost justified?
- Could preventive measures help avoid the problem?
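Several of these questions can be roughed out with numbers. The sketch below is one possible way to do that in Python and is not part of the CFIA technique itself. It assumes that a CI marked “X” is a single point of failure, that the expected yearly loss is failures per year × outage hours × cost per hour, and that redundancy is justified when that loss exceeds its yearly cost. The `CiRisk` class, its fields, and every figure are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class CiRisk:
    name: str
    mark: str                 # "X", "A", or "B" from the CFIA table
    failures_per_year: float  # assumed estimate of likelihood
    outage_hours: float       # assumed length of each outage
    cost_per_hour: float      # assumed cost of unavailability
    redundancy_cost: float    # assumed yearly cost of adding a backup


def is_single_point_of_failure(ci):
    # Assumption: an "X" means no backup exists for this CI.
    return ci.mark == "X"


def expected_annual_loss(ci):
    # Simplified model: likelihood x duration x hourly cost.
    return ci.failures_per_year * ci.outage_hours * ci.cost_per_hour


def redundancy_justified(ci):
    # Assumption: redundancy pays off when the expected loss exceeds its cost.
    return is_single_point_of_failure(ci) and expected_annual_loss(ci) > ci.redundancy_cost


# Made-up figures, for illustration only.
cis = [
    CiRisk("Core Switch", "X", 0.5, 8.0, 10_000.0, 15_000.0),
    CiRisk("DB Server", "X", 0.25, 4.0, 2_000.0, 12_000.0),
    CiRisk("Mail Server", "A", 1.0, 0.0, 1_000.0, 0.0),
]

for ci in cis:
    print(ci.name, is_single_point_of_failure(ci),
          expected_annual_loss(ci), redundancy_justified(ci))
```

With these made-up figures, only the core switch would justify the cost of redundancy. Real decisions should also weigh factors the model ignores, such as the number of people affected and the damage to reputation.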