Site Reliability Engineering (SRE): The Big Picture
SRE is how Google runs production systems, promoting high availability with high velocity and removing operational toil. It achieves the same goals as DevOps without the culture shift, so it’s a better option for many digital transformations.
SRE is a set of practices for keeping production systems stable while still delivering new features at speed. In this course, Site Reliability Engineering (SRE): The Big Picture, you’ll get a thorough overview of how SRE works and why it’s a good choice for many organizations. First, you’ll learn the differences between SRE, DevOps, and traditional operations. Next, you’ll discover how engineering practices help to reduce toil and free up time for high-value tasks. Finally, you’ll learn how SRE approaches monitoring and alerting, and how it manages incidents. When you’re finished with this course, you’ll be able to evaluate SRE and decide whether it’s a good fit for your organization.
Author Name: Elton Stoneman
Author Description:
Elton is a 10-time Microsoft MVP, author, trainer and speaker. He spent most of his career as a consultant working in Microsoft technologies, architecting and delivering complex solutions for industry leaders. He has delivered APIs on Azure serving millions of clients daily, Big Data solutions processing billions of events weekly, and cutting-edge solutions powered by containers. Elton’s experience with .NET goes from .NET 1.0 running on Windows Server, right up to .NET Core running on Linux.
Table of Contents
- Course Overview (2 mins)
- Introducing Site Reliability Engineering (27 mins)
- Automation and Eliminating Toil (30 mins)
- Service Levels, Monitoring, and Alerting (28 mins)
- Incident Management: On-call and Postmortems (22 mins)