|Workflow||How may we be of service?||GitLab.com Status|| |
|Issue Trackers||Infrastructure: Milestones, OnCall||Production: Incidents, Changes, Deltas||Delivery|
|Slack Channels||#sre-lounge, #database||#alerts, #production||#g_delivery|
|Operations||Runbooks (please contribute!)||On-call: Handover Document, Reports|
An operational environment is a complex and interconnected mesh of components working in unison to deliver a set of services. In a prior iteration of the teams, we purposely avoided organizing teams into siloed functional groups, aligning them instead along the environment's lifecycle and taking into account the two variables that drive change in the environment: time and space.
Our long-term objective is to become a world-class SRE organization. To reach that goal, we first adopted a focal arrangement in which the organizational formula is derived from the focus and purpose of the groups arranged along the time and space variables, with each group containing the functional resources necessary to manage the environment, including systems and database specialties.
The first iteration of this model comprised two groups, Site Availability and Site Reliability. The second iteration added a third group, Delivery, which specializes in the biggest source of change in the environment, releases, and whose purpose is to make CI/CD at GitLab a reality.
We are now in our third iteration: the organization has grown and our availability has improved, so it is time to remove some of the duct tape we put in place early on and focus on moving towards a proper reliability organization across the board. Thus, Infrastructure is now composed of four teams: three focused on Reliability, and one, Delivery, which will continue to specialize in release deployments to move us to CI/CD. We dropped "Site" from the team names because their focus goes well beyond GitLab.com and comprises all user-facing services that power our infrastructure. Additionally, these teams are composed of both DBREs and SREs.
|Gerardo "Gerir" Lopez-Fernandez||Director of Engineering, Infrastructure|
|Marin Jankovski||Engineering Manager, Delivery|
|Andrew Newdigate||Distinguished Engineer, Infrastructure|
|Dave Smith||Engineering Manager, Reliability Engineering, CI/CD & Enablement|
|Jose Cores Finotto||Engineering Manager, Reliability Engineering, Dev & Ops|
|Anthony Sandoval||Engineering Manager, Reliability Engineering, Secure & Defend|