
Scalability Team

Scalability Team logo: inspired by the album cover of Unknown Pleasures, the debut studio album by English rock band Joy Division, except the waveforms are Tanukis.

Issue Trackers: https://gitlab.com/gitlab-com/gl-infra/scalability
Slack Channels: #g_scalability (Primary Team Channel), #infrastructure-lounge (Infrastructure Group Channel), #incident-management (Incident Management), #alerts-general (SLO alerting), #mech_symp_alerts (Mechanical Sympathy Alerts)

Mission

The Scalability team is responsible for GitLab and GitLab.com at scale. It works on the highest-priority scalability items in the application in close coordination with Reliability Engineering teams, and provides feedback to other Engineering teams so they can become better at scalability as well.

Vision

As its name implies, the Scalability team enhances the availability, reliability, and performance of GitLab by observing the application's capability to operate at GitLab.com scale. The Scalability team analyzes application performance on GitLab.com, recognizes bottlenecks in service availability, proposes short-term improvements, and develops long-term plans that help drive the decisions of other Engineering teams.

Short term goals include:

Work prioritization process

The diagram below describes how work is prioritized in the Scalability team:

workflow

The process contains six cyclical stages:

  1. Observe - What is causing SLA and SLO degradations on GitLab.com?
  2. Analysis - Why is availability being reduced? Do we have all the information, and are our metrics sufficient?
  3. Proposed Improvements - An issue with a fix (partial and temporary, or full and permanent) is created, including proposals for the estimated SLA improvements for the affected services.
  4. Triage - Prioritize changes based on a pre-defined set of rules, which include ownership of the change.
  5. Development & Deployment - The owner defined in the previous stage develops the change and ensures that it has no unexpected effects.
  6. Assessment - The implemented change is assessed through a retrospective that compares the expected and observed state. The retrospective is documented in an issue that is marked as related to the original issue driving the change.

Team work processes

The work process will be defined when the team is (partially) staffed and working on the first task, to ensure that the process fits the project and the team structure.

Team counterparts

The Scalability team will work with all engineering teams across all departments as a representative of GitLab.com, one of the largest GitLab installations, to ensure that GitLab continues to scale in a safe and sustainable way.

The Memory team is a natural counterpart to the Scalability team, but their missions complement each other rather than overlap:

Simply put:

Team Members

The following people are members of the Scalability Team:

Person                                    Role
New Vacancy - Marin Jankovski (Interim)   Engineering Manager, Scalability
Andrew Newdigate                          Distinguished Engineer, Infrastructure
Bob Van Landuyt                           Senior Backend Engineer, Scalability

How do I engage with the Scalability Team?

  1. Start with an issue in the Scalability team tracker: https://gitlab.com/gitlab-com/gl-infra/scalability/issues/new.
  2. You are welcome to follow this up with a Slack message in #g_scalability.
  3. Please don't add any workflow labels to the issue. The team will triage the issue and apply these.
  4. We use our Workflow board to track the workflow of issues.

Celebrating our wins

We celebrate our wins! Whenever a change driven by the Scalability Team shows a clear positive impact on the scalability of GitLab.com, whether through key metrics, saturation reduction, reduced Mean Time to Detection (MTTD), improved Mean Time Between Failures (MTBF), or similar, we post a message as a comment on this snippet in our tracker: https://gitlab.com/gitlab-com/gl-infra/scalability/snippets/1900609.