Site Reliability Engineering on AWS

Learn to incorporate the principles of SRE into practice.

Duration

2 Days

Level

Intermediate Level

Design and Tailor this course

As per your team's needs

The Site Reliability Engineering training course introduces a discipline whose main goal is to create ultra-scalable and highly reliable software systems. Site Reliability Engineering (SRE) incorporates aspects of software engineering and applies them to infrastructure and operations problems. SRE was created and implemented at Google in the early 2000s to make its sites run more smoothly, efficiently, and reliably.

This course begins by describing SRE and how it applies software engineering practices to infrastructure and operations problems. Next, it gives a high-level overview of the history of SRE, the differences between SRE and DevOps, and SRE roles and responsibilities. From there, students move into budgeting, planning, and monitoring. The course concludes with students working through practical examples and learning best practices.

Developers and developer teams looking to incorporate the principles of SRE into practice.

  • Reliability in Modern Applications
  • The Impact of Failure and Determining Your Reliability Objectives
  • Accepting Failure and Making It Part of the Design Process
  • SRE Is a Mindset
  • AWS Global, Regional, and Zonal Architecture Design
  • Amazon’s Global Storage Services – S3
  • Running Resilient Databases on AWS – RDS and DynamoDB
  • Fault-Tolerant Computation on AWS – Lambda and EC2
  • Core Resilience Principles for AWS – Load Balancing and Auto Scaling
  • Optimizing and Migrating the Code
  • Creating Containers with CodeBuild
  • The Architecture of Microservices
  • Using Kubernetes and ECS in AWS
  • Deploying ECS and RDS
  • The Problem with What We’ve Just Built
  • Overview of Failure Mode Analysis
  • Multi-Regional Support
  • Microservices Design
  • Authentication and Authorization
  • Code Deployment with CodePipeline
  • Application Telemetry and Tracing
  • Application Analytics
  • Aurora and its Advantages Over MySQL
  • Running/Scaling Applications on EKS
  • Deploying App Mesh
  • Review: AWS Global Architecture and What We Have Just Built
  • Global Tools: Route 53, CloudFront
  • Going Global: What Does This Mean for Users/Developers
  • Operational Changes Required for a Global Application
  • Course Summary