With a renewed focus on reliability in engineering to reduce outages, we have made many changes to the handbook, production documentation, and our processes. Although we announced them through multiple channels (EWIR, Slack, email, meetings), it is likely that not everyone has seen and internalized all of the important changes.
We want to gather all the crucial changes in one place, explain why we made them, summarize them, and link to where you can find more information.
This material is available as a learning pathway on GitLab's Level Up.
Amplifying SaaS Reliability Focus
Reliability & Security Standup
Importance of reliability to the business
Improving SUS - slides 9 through 14 in particular
MR to change quality and reliability
MR around things that don't scale
Google SRE Book: Blameless culture
Limiting the impact of far-reaching work
Course on backwards compatibility
Stage group dashboard documentation
Add your comments in this feedback issue.