
Posts

Showing posts with the label Availability Management

Agile / DevOps: (_____) as CODE #DevOps

Infrastructure as Code – a common term among developers, architects, and operational staff. The practice has evolved in response to demand for quality and efficiency in the industry. Over the last decade many organizations have come to realize that the essence of Infrastructure as Code is to treat the configuration of systems the same way that software source code is treated. Frequent code integration, automated builds, and integrated testing have resulted in stronger IT performance and therefore business value.

Security as Code – An increase in security breaches across all industries has brought forward a similar concept: “Security as Code”. This concept would include the use of repeatable algorithms to integrate security checks with each code check-in. This expands the scope of traditional “Continuous Integration” and automation. Organizations realize that security can no longer be an afterthought and must be addressed at the front of the value s

Visible Ops

Anyone who has worked in Information Technology knows that there are, and always will be, improvement opportunities available to our organizations. This is especially true in light of the pace of change taking place in all market spaces and the level of customer expectations that accompanies that change. If you have worked in IT for a number of years, you may remember when change was not welcomed. Well, the good old days weren’t always that good, and tomorrow ain’t as bad as it seems (Billy Joel). The challenge is in getting started.

If…
·        the processes currently in use are not as efficient and effective as you would like
·        your environment isn’t as stable and reliable as it should be
·        changes to your environment generally result in an outage and prolonged, repeated firefighting

then… I recommend that you read The Visible Ops Handbook by Gene Kim, Kevin Behr and Geor

The Best of Service Design, Part 3

The Importance of Availability Management
Originally Published on July 5, 2011

The Availability Management process ensures that the availability of systems and services matches the evolving, agreed needs of the business. The role of IT is now integral to the success of the business. The availability and reliability of IT services can directly influence customer satisfaction and the reputation of the business.

The proactive activities of Availability Management involve the planning, design and improvement of availability. These activities are principally carried out within design and planning roles. They consist of producing recommendations, plans and documents on design guidelines and criteria for new and changed services, and the continual improvement of service and reduction of risk in existing services wherever it can be cost-justified. There are several guiding principles that should underpin the Availability Management process and its focus:

Proactive Availability Management Techniques - CFIA

Component Failure Impact Analysis (CFIA) is a proactive availability management technique that was developed by IBM in the 1970s. This technique allows us to predict the impact on our services if any individual component fails. It points out our vulnerabilities to single points of failure. Doing a CFIA is a pretty simple exercise. Here are the steps:

·        Take certain key Configuration Items (CIs) in the infrastructure and identify the services that they support by researching the Configuration Management System (CMS). If you do not have a CMS, look for paper diagrams, network configurations, any available documentation and general knowledge.
·        Create a paper or electronic table or spreadsheet. List the CIs in the first column, and the Services in the top row.
·        For every CI, place an “X” in the column below the service if that CI's failure would cause an outage.
·        Mark an “A” when the CI has an immediate backup (hot start) or a “B” when the CI needs a warm start.

The basic
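
The CFIA table described in the steps above can also be kept as a small data structure rather than a spreadsheet. The following is a minimal sketch in Python, not a prescribed tool: the CI names, service names, and the helper functions `single_points_of_failure` and `render` are all hypothetical and chosen only for illustration. It uses the same marks as the steps ("X", "A", "B") and treats an empty cell as "this service does not depend on this CI", which is an assumption.

```python
from typing import Dict, List

# Marks follow the scheme in the steps above:
#   "X" = the CI's failure would cause a service outage
#   "A" = the CI has an immediate backup (hot start)
#   "B" = the CI needs a warm start
#   ""  = assumed to mean the service does not depend on this CI
Matrix = Dict[str, Dict[str, str]]

# Hypothetical CIs (rows) and services (columns), for illustration only.
CFIA: Matrix = {
    "router-1":     {"Email": "X", "Payroll": "X"},
    "db-server-1":  {"Email": "",  "Payroll": "A"},
    "app-server-1": {"Email": "B", "Payroll": "X"},
}


def single_points_of_failure(matrix: Matrix) -> Dict[str, List[str]]:
    """Return, for each CI marked "X", the services its failure would take down."""
    spofs: Dict[str, List[str]] = {}
    for ci, row in matrix.items():
        affected = sorted(svc for svc, mark in row.items() if mark == "X")
        if affected:
            spofs[ci] = affected
    return spofs


def render(matrix: Matrix) -> str:
    """Lay the matrix out as text: CIs in the first column, services in the top row."""
    services = sorted({svc for row in matrix.values() for svc in row})
    width = max(len(ci) for ci in matrix)
    lines = [" " * width + "  " + "  ".join(f"{s:<8}" for s in services)]
    for ci, row in matrix.items():
        cells = "  ".join(f"{row.get(s, ''):<8}" for s in services)
        lines.append(f"{ci:<{width}}  {cells}")
    return "\n".join(lines)


if __name__ == "__main__":
    print(render(CFIA))
    print(single_points_of_failure(CFIA))
```

Reading across a row shows how many services a single CI can take down; reading down a column shows which CIs a given service is exposed to. CIs that appear in the result of `single_points_of_failure` are the natural first candidates for adding a backup.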

Component Failure Impact Analysis

Availability Management balances business availability requirements against the associated costs. So, should we consider availability requirements before the service has been designed and implemented, or after? The Availability Management process should begin in the Service Strategy stage of the lifecycle and continue through each subsequent stage. Availability Management ensures that the design approach takes two distinct but related perspectives. Designing for availability focuses on all aspects of the technical design of the IT service. Designing for recovery ensures that in the event of a service failure, the business can resume normal operations as quickly as possible. One of the techniques that can be invaluable to both perspectives is the Component Failure Impact Analysis (CFIA). The CFIA can be used to predict and evaluate the impact a component failure can have on its related IT service. This activity identifies areas of weakness or fragility within