Overview
The Site Reliability Engineering (SRE) Fundamentals™ certification imparts, tests and validates knowledge of SRE vocabulary, principles and practices.
It helps engineers understand the foundations of Site Reliability Engineering.
Exam Requirements
- Attend a face-to-face or virtual course taught by a Certified Site Reliability Engineering (SRE) Fundamentals™ trainer
- Complete 16 hours of live online or in-person training with a Certified Site Reliability Engineering (SRE) Fundamentals™ trainer
- After successfully completing the course, accept the License Agreement to take the 45-question Site Reliability Engineering (SRE) Fundamentals™ test
- To pass the test, correctly answer at least 32 of the 45 questions within the 60-minute time limit
- Maintain your Site Reliability Engineering (SRE) Fundamentals™ certification by renewing it annually
Modules
Module 1: Introducing SRE
- DevOps
- SRE
- SRE Terminology
- Toil
- Types of Toil
Module 2: Service Level Objectives
- Service Level Objectives
- SLO Data Components and Metrics
- Measuring and evaluating Service Level Objectives (SLOs)
- Steps for measuring and evaluating SLOs
- Service Level Objective Challenges
- SLO Best Practices
Module 3: Service Level Indicators
- Service Level Indicators
- SLIs vs. SLOs vs. SLAs
- Identifying SLIs
- Defining Programmatic SLIs
Module 4: Error Budgets
- What is an error budget?
- Why do you need an error budget?
- Benefits of error budgeting
- Error Budget Policies
- Positive Error Budget
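As a rough illustration of the error budget topics in Module 4, the sketch below uses the common convention that the budget is the share of a time window an availability SLO allows to be lost. The function names, the 30-day window and the sample figures are assumptions made for this example and are not taken from the course material.

```python
def error_budget_minutes(slo_target: float, window_days: int = 30) -> float:
    """Allowed downtime, in minutes, for an availability SLO over a window.

    slo_target is a fraction, e.g. 0.999 for "99.9% available".
    """
    if not 0.0 < slo_target < 1.0:
        raise ValueError("slo_target must be strictly between 0 and 1")
    total_minutes = window_days * 24 * 60
    return (1.0 - slo_target) * total_minutes


def budget_remaining(slo_target: float, window_days: int, downtime_minutes: float) -> float:
    """Budget left after observed downtime; a positive value means budget remains."""
    return error_budget_minutes(slo_target, window_days) - downtime_minutes


if __name__ == "__main__":
    # Hypothetical service: 99.9% SLO, 30-day window, 10 minutes of downtime so far.
    print(round(error_budget_minutes(0.999, 30), 1))
    print(round(budget_remaining(0.999, 30, 10.0), 1))
```

A positive remainder corresponds to the "positive error budget" case, and a negative one is where an error budget policy would typically apply.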
Module 5: Reduce Toil
- What is operations toil?
- Why Toil Matters
- Why Toil Must Be Reduced
- How to Calculate Toil
- Strategies for reducing operations toil
Module 6: Chaos Engineering
- Chaos Engineering
- Need for Chaos Engineering
- Benefits of Chaos Engineering
- Chaos Engineering and Testing
- Chaos Engineering and DevOps
- How Chaos Engineering works
- Chaos Engineering Experiments
- What is Chaos Monkey?
Module 7: Managing Risk
- Risk Management
- Unplanned Downtime
- Identify Risk in Services
Module Quizzes
Use Cases
Study Material
- Student Study Book
- Mock Exam Paper(s)
- Module-wise Quizzes
- Use Cases
- Case Studies
Duration
16 hours
Target Audience
- Anyone starting or leading a move towards increased reliability
- Anyone interested in modern IT leadership and organizational change approaches
- Business Managers
- Business Stakeholders
- Change Agents
- Consultants
- DevOps Practitioners
- IT Directors
- IT Managers
- IT Team Leaders
- Product Owners
- Scrum Masters
- Software Engineers
- Site Reliability Engineers
- System Integrators
- Tool Providers