|Workflow||How may we be of service?||GitLab.com Status|| |
|Issue Trackers||Infrastructure: Milestones, OnCall||Production: Incidents, Changes, Deltas||Delivery|
|Slack Channels||#infrastructure-lounge, #database||#alerts, #production||#g_delivery|
|Operations||Runbooks (please contribute!)||On-call: Handover Document, Reports|
|Production||SRE Onboarding||Readiness Guide||Database Reliability||On-call Handover|
The Infrastructure Department is the primary responsible party for the availability, reliability, performance, and scalability of all user-facing services (most notably GitLab.com, the largest production GitLab installation on the planet). Other departments and teams contribute greatly to these attributes of our service as well. In these cases, it is the responsibility of the Infrastructure Department to close the feedback loop with monitoring and metrics to drive accountability.
We are a blend of operations gearheads and software crafters who apply sound engineering principles, operational discipline, and mature automation to make GitLab.com ready for mission-critical customer workloads. We strive for excellence every day by living and breathing GitLab's values as our guiding operating principles in every decision we make and every action we take.
Blueprints are intended to scope and flesh out our initial thinking about specific problems and issues we are facing (topical) and to outline overall Infrastructure priorities and focus for a given quarter (quarterly). Blueprints are sketches whose purpose is to foster and frame discussion around Infrastructure topics, most of which will yield designs and OKRs, which qualify and quantify objectives and key results.
Design plays a significant role in how we produce technical solutions to meet the challenges we face in making GitLab.com ready for mission-critical workloads.
The Infrastructure Department comprises four teams.
For details on the Department's structure, see the Infrastructure Teams Handbook section.
Every SRE is aligned with an engineering team. Each SRE can help the teams at each stage of the process: planning, discovery, implementation, and further iteration. The area an SRE is responsible for is part of their title, e.g. "SRE, Plan, Monitor." You can see which area of the product each SRE is aligned with in the team org chart.
Multiple SREs are aligned with each area of the product. This area is listed on the team page under their title as an expertise, e.g. "Plan expert." This way, a team of SREs is available to help when one of them is out of the office or busy with another incident or team.
|Monitor||Ahmad Sherif||Amarbayar Amarsanaa||John Skarbek|
|Secure||John Skarbek||Craig Barrett||Alejandro Rodriguez|
|Configure||Craig Barrett||John Northrup||Devin Sylva|
|Verify (CI) / Release (CD)||John Northrup||Devin Sylva||Alex Hanselka|
|Serverless||Andrew Newdigate||John Jarvis||John Northrup|
|Distribution and Package||John Jarvis||Alex Hanselka||Craig Barrett|
|Create||Alex Hanselka||John Jarvis||Amarbayar Amarsanaa|
|Plan||Craig Barrett||Amarbayar Amarsanaa||John Skarbek|
|Manage||Devin Sylva||Ahmad Sherif||John Northrup|
|Gitaly||Andrew Newdigate||Alejandro Rodriguez||Ahmad Sherif|
|Gitter||Andrew Newdigate||Ahmad Sherif||Alejandro Rodriguez|
|Geo||Alex Hanselka||John Skarbek||John Jarvis|