How to Hire Site Reliability Engineers (SREs): 5 Top Qualities

Guest Host Post by Jayne Groll, previously posted on The Enterprisers Project, May 13, 2021

The Site Reliability Engineer (SRE) role continues to gain momentum in enterprise IT. Hiring managers, consider this advice on how to spot a strong candidate.


Site Reliability Engineering (SRE) continues to gain momentum among IT organizations. According to the Upskilling 2021: Enterprise DevOps Skills Report, 47 percent of survey respondents (up from 28 percent in 2020) say SRE is a must-have process and framework skill. As the demand for strong SRE skills rises, so does SRE hiring.

However, a challenge for business and hiring managers is determining which skills, traits, and competencies make a strong site reliability engineer. I asked several DevOps Institute Ambassadors and SRE subject matter experts to weigh in on what makes a great SRE. Here’s what they had to say:

1. "Great SREs have a passion for high-quality automation. They have a lot of ideas about automation of toilsome production tasks that can improve reliability and save a lot of time for operations. They are good communicators and like to spend time with developers to understand how new products and services can be deployed and operated in high-scale, high-reliability environments." - Marc Hornbeek, CEO and principal consultant at Engineering DevOps Consulting and author of Engineering DevOps

2. "A great SRE ensures SLOs (Service Level Objectives) are set at correct boundaries of service; they define alerts to detect SLI (Service Level Indicator) thresholds. They enable developers on CI/CD automation, quality thresholds, and deployment automation using infrastructure as code. They enable developers to understand how their applications are performing in production by building observability. They thoroughly understand deployment and fail-safe strategies. They influence the building of fault-tolerant, autoscaling, cost-efficient, high-performing design and architecture.

"An SRE should ensure the consumption of platform standards and consistency of tooling. SREs handle on-call events and do post-mortems. They ensure error budgets are followed, they ensure self-regulation of velocity and stability, and they ensure excess Ops work overflows to the Dev team." - Shivagami Gugan, CTO at CX Tech Unicorn

3. Prize communication. "A great SRE must have a mix of developer and operations skills. Ideally, this is not just an ops person and not just a development person. The person must transition between ops and dev very smoothly. A great SRE knows how to communicate well, either writing documentation or talking with their colleagues (especially when working remotely)." - Andre Almar, Co-founder and technical trainer at DevOps Bootcamp

4. Look for longer-term support experience. "When Google pioneered the SRE approach, they were adamant that all SREs be skilled developers. So, spotting a good SRE is very similar to how one would identify/screen for a good developer. In our company, we use HackerRank to test the proficiency of the devs we hire. Culturally though, the best SREs are developers who have spent time actually maintaining the products that they have built. Many organizations and service providers still adopt short-term project-oriented team structures, so developers end up being shuffled from one product to another instead of sticking with the same product and learning how to support/improve/stabilize it over time." - Lisa Chan, Head of software engineering & DevOps at PETRONAS

5. Look for a person that demonstrates empathy. "Typically, the greatest concentration is on the technical skills, and yes, these are important and to be considered when looking at the toolset to be employed. However, knowledge in the use of tools is something that can be easily trained. Furthermore, any enterprise implementing good SRE is also considering that tools can be easily swapped out, so the need to know and have experience in specific technologies is really not as fundamental as other areas that can’t be trained.

"To spot a great SRE, it is key to find someone who has empathy. The greatest barrier to the implementation of any way of working is culture, and for Agile, DevOps and SRE, it is about an open culture. The greatest enemy to having a flowing and open culture is a closed mind. A candidate who considers their own role as primary and all others as secondary is possibly not the best fit. Therefore, and this is good advice for candidates as well, have a holistic perspective on the role you are in and a balanced perspective on how you fit with and impact the other roles around you. Beyond holistics, it is also about having respect for what others do and the challenges they may face. In all, empathy!" - Stephen Walters, Solution architect at xMatters, Inc.
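For hiring managers less familiar with the vocabulary in item 2, the relationship between an SLI, an SLO, and an error budget can be sketched in a few lines of Python. This is only an illustration and does not come from any of the contributors quoted above. The function names, the 99.9 percent availability target, and the request counts are all assumptions chosen for the example.

```python
# Illustrative sketch only: names and numbers are hypothetical.

def availability_sli(total_requests: int, failed_requests: int) -> float:
    """SLI: the measured fraction of requests that succeeded."""
    if total_requests <= 0:
        raise ValueError("total_requests must be positive")
    return (total_requests - failed_requests) / total_requests


def error_budget(slo_target: float, total_requests: int) -> float:
    """Number of requests that may fail in the window without breaching the SLO."""
    return (1.0 - slo_target) * total_requests


def budget_remaining(slo_target: float, total_requests: int, failed_requests: int) -> float:
    """Fraction of the error budget still unspent (negative if overspent)."""
    budget = error_budget(slo_target, total_requests)
    if budget == 0:
        # A 100 percent target leaves no budget at all.
        return 0.0 if failed_requests == 0 else float("-inf")
    return 1.0 - failed_requests / budget


def should_slow_releases(slo_target: float, total_requests: int, failed_requests: int) -> bool:
    """One possible policy: pause feature releases once the budget is spent."""
    return budget_remaining(slo_target, total_requests, failed_requests) <= 0


if __name__ == "__main__":
    slo = 0.999            # assumed target: 99.9 percent of requests succeed
    total = 1_000_000      # assumed requests in the measurement window
    failed = 250           # assumed failed requests in the same window
    print("SLI:", availability_sli(total, failed))
    print("Budget remaining:", budget_remaining(slo, total, failed))
    print("Slow releases?", should_slow_releases(slo, total, failed))
```

A candidate who can explain this trade-off in their own words, including what the team should do once the budget is spent, is showing the "self-regulation of velocity and stability" described in item 2.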
