Quality Department Performance Indicators

Executive Summary

| KPI | Health | Reason | Next Steps |
| --- | --- | --- | --- |
| Hiring Actual vs Plan | Okay | Engineering is on plan, but we are lending some of our recruiters to Sales for this quarter, and we just put in place a new "one star minimum" rule that might decrease offer volume. | Health: Monitor health closely. Maturity: Get this into Periscope. |
| Successful vs Failed CE/EE Review App deployments per month | Unknown | We have only manual data collection week to week. The rate is fluctuating between 70% and 90%. | Define an automated mechanism to collect data in Periscope. Define a threshold. |
| Successful vs Failed CE/EE `master` pipelines per month | Unknown | We haven't started measuring it yet. | Define an automated mechanism to collect data in Periscope. Define a threshold. |
| Average CE/EE pipeline duration per month | Unknown | We haven't started measuring it yet. | Define an automated mechanism to collect data in Periscope. Define a threshold. |
| P1/P2 open bugs past target SLO | Problem | We have prioritized efforts. We will likely not trend downwards until the backlog of older bugs is closed. | Improve the chart by changing the time series to bug age, not creation month. Define a threshold and migrate into Periscope. |
| New issue first triage SLO | Unknown | We haven't started measuring it yet. We have made progress on fanning out first triage to Engineers in the Quality Department. | Define an automated mechanism to collect data in Periscope. Define a threshold. Fan out triaging to all of Engineering, not just the Quality Department. |

    Key Performance Indicators

    Hiring Actual vs Plan

    Are we able to hire high-quality workers to build our product vision in a timely manner? Hiring information comes from BambooHR, where employees are in the division `Engineering`.

    URL(s)

    Health: Okay

    Engineering is on plan, but we are lending some of our recruiters to Sales for this quarter, and we just put in place a new "one star minimum" rule that might decrease offer volume.

    Maturity: Level 2 of 3

    We have charts driven off of `team.yml`.

    Next Steps

    Successful vs Failed CE/EE Review App deployments per month

    Measures the stability of our test tooling to enable engineering efficiency.

    URL(s)

    Health: Unknown

    We have only manual data collection week to week. The rate is fluctuating between 70% and 90%.

    Maturity: Level 1 of 3

    We have started work to capture this metric in Periscope. We are currently waiting on an additional data import from GitLab.com.

    Next Steps

    Successful vs Failed CE/EE `master` pipelines per month

    Measures the stability of our `master` pipelines to accelerate cycle time of merge requests and continuous deployments.

    URL(s)

    Health: Unknown

    We haven’t started measuring it yet.

    Maturity: Level 1 of 3

    We have started work to capture this metric in Periscope. We are currently waiting on an additional data import from GitLab.com.

    Next Steps

    Average CE/EE pipeline duration per month

    Measures the average duration of our pipelines to accelerate cycle time of merge requests.

    URL(s)

    Health: Unknown

    We haven’t started measuring it yet.

    Maturity: Level 1 of 3

    We have started work to capture this metric in Periscope. We are currently waiting on an additional data import from GitLab.com.

    Next Steps

    P1/P2 open bugs past target SLO

    Measures the number of bugs past the priority SLO timeline.

    URL(s)

    Health: Problem

    We have prioritized efforts. We will likely not trend downwards until the backlog of older bugs is closed.

    Maturity: Level 2 of 3

    Automated data collection is achieved, with a time series based on the month the bug was created.

    Next Steps

    New issue first triage SLO

    Measures our speed to triage new issues. We currently receive ~400 new issues every week in CE/EE, and we need to go through all of them to identify valid issues and high-severity bugs.

    URL(s)

    Health: Unknown

    We haven’t started measuring it yet. We have made progress on fanning out first triage to Engineers in the Quality Department.

    Maturity: Level 1 of 3

    We have started work to capture this metric in Periscope. We are currently waiting on an additional data import from GitLab.com.

    Next Steps

    Regular Performance Indicators

    Diversity

    Diversity & Inclusion is one of our core values, and a general challenge for the tech industry. GitLab is in a privileged position to positively impact diversity in tech because our remote lifestyle should be more friendly to people who may have left the tech industry, or studied a technical field but never entered industry. This means we can add to the diversity of our industry, and not just play a zero-sum recruiting game with our competitors.

    URL(s)

    Health: Attention

    Engineering is now at the tech benchmark for gender diversity (~16%), but our potential is greater and we can do better. 20% should be our floor in technical roles. Other types of diversity are unknown.

    Maturity: Level 2 of 3

    The content is shared only in a closed metrics review and lacks granularity. It is not visualized or presented as a time series.

    Next Steps

    Handbook Update Frequency

    The handbook is essential to working remotely, to keeping up our transparency, and to recruiting successfully. Our processes are constantly evolving, and we need a way to make sure the handbook is being updated at a regular cadence.

    URL(s)

    Health: Unknown

    Unknown, but my sense is we are not doing enough. For instance, we have not been able to fully update the handbook after the development department re-org: dev backend and ops backend are still present, although many of the new teams do have their own pages already.

    Maturity: Level 2 of 3

    We currently just have contribution graphs, which are a poor proxy for this.

    Next Steps

    Team Member Retention

    People are a priority and attrition comes at a great human cost to the individual and team. Additionally, recruiting (backfilling attrition) is a ludicrously expensive process, so we prefer to keep the people we have :)

    URL(s)

    Health: Okay

    I seem to recall our attrition is now below 10%, which is great compared to the tech benchmark of 22% and the remote benchmark of 16%, but the fact that I can’t just look at a simple graph makes me nervous...

    Maturity: Level 2 of 3

    There is manually curated data in a spreadsheet from PO.

    Next Steps

    Average CE/EE `master` end-to-end test suite execution duration per month

    Measures the average duration of our full QA/end-to-end test suite in the `master` branch to accelerate cycle time of merge requests and continuous deployments.

    URL(s)

    Health: Unknown

    We haven’t started measuring it yet.

    Maturity: Level 1 of 3

    Have an idea or plan.

    Next Steps

    Ratio of quarantine vs total end-to-end tests in `master` per month

    Measures the stability and effectiveness of our QA/end-to-end tests running in the `master` branch.

    URL(s)

    Health: Unknown

    We haven’t started measuring it yet.

    Maturity: Level 1 of 3

    Have an idea or plan.

    Next Steps

    Monthly new bugs per stage group

    Tells us the hit rate of defects for each stage group on a monthly basis.

    URL(s)

    Health: Problem

    At a high level, defect trends are increasing for mature areas of the product, though partly due to double counting from duplicate issues. We need to raise more awareness and encourage teams to look at this metric and evaluate themselves regularly.

    Maturity: Level 3 of 3

    Data collection is achieved; we need to migrate this into GitLab Insights (the productized version) and configure a dashboard for each stage group. No threshold is defined yet.

    Next Steps

    Mean time to resolve S1-S2 functional defects

    Tells us the monthly average time to resolve high-severity defects.

    URL(s)

    Health: Problem

    We have prioritized efforts and the trend is stabilizing. We have equipped engineering groups with better weekly triage reports. We will likely not trend downwards until the backlog of older bugs is closed.

    Maturity: Level 2 of 3

    Automated data collection achieved.

    Next Steps

    Ratio of bugs triaged with Severity (and priority)

    Measures our ability to differentiate high-severity defects from the pool so we can prioritize fixing them above trivial bugs.

    URL(s)

    Health: Attention

    The number of untriaged bugs has been stable, but we can still do better. We need to establish an actionable threshold for each team.

    Maturity: Level 2 of 3

    Data collection is achieved; we need to present it as a time series. No threshold is defined yet.

    Next Steps

    Ratio of closed (not merged) vs merged MRs over time

    Measures the amount of throwaway work vs merged work.

    URL(s)

    Health: Attention

    Experimental charts suggest we don’t have much throwaway work. We are not giving a passing grade because we don’t have a threshold to grade against.

    Maturity: Level 3 of 3

    Automation is possible; experimental work (~4 hrs) was done in the Quality Dashboard and is to be added to native GitLab Insights. No threshold has been established.

    Next Steps

    Other PI Pages

    Legends

    Maturity

    | Level | Meaning |
    | --- | --- |
    | Level 3 of 3 | Measurable, in time series, with an identified target for the metric, automated data extraction, and a dashboard in Periscope available to the whole company (if not the whole world). |
    | Level 2 of 3 | About two-thirds done, e.g. missing one of: automated data collection, a defined threshold, or a Periscope dashboard. |
    | Level 1 of 3 | About one-third done, e.g. having only one of: automated data collection, a defined threshold, or a Periscope dashboard. |
    | Level 0 of 3 | We only have an idea or a plan. |

    Health

    | Level | Meaning |
    | --- | --- |
    | Okay | The KPI is at an acceptable level compared to the threshold. |
    | Attention | This is a blip, or we’re going to watch it, or we just need to enact a proven intervention. |
    | Problem | We’ll prioritize our efforts here. |
    | Unknown | Unknown. |

    How to work with pages like this

    Data

    The heart of pages like this is a data file called `/data/performance_indicators.yml`, which is in YAML format. Almost everything you need to do will involve edits to this file. Here are some tips:
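    As a first orientation, here is a minimal sketch of what one element of the array might look like. Only the `orgs` property is confirmed by this page (see the Pages section below); every other field name is an illustrative assumption about the schema, not the actual format.

    ```yaml
    # Hypothetical entry in /data/performance_indicators.yml.
    # The file holds a YAML array; each element has an `orgs` property
    # that scopes the indicator to particular PI pages. All other field
    # names below are assumptions for illustration only.
    - name: Successful vs Failed CE/EE master pipelines per month  # assumed field
      description: >  # assumed field
        Measures the stability of our master pipelines to accelerate cycle
        time of merge requests and continuous deployments.
      orgs:
        - quality  # assumed org value
      health: unknown  # assumed field
      maturity: 1  # assumed field
    ```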

    Pages

    Pages like `/handbook/engineering/performance-indicators/` are rendered by an ERB template.

    These ERB templates call the helper function `performance_indicators()`, which is defined in `/helpers/custom_helpers.rb`. This helper function calls several partial templates to do its work.

    This function takes a required argument named `org` in string format that limits the scope of the page to a portion of the data file. Valid values for this `org` argument are listed in the `orgs` property of each element in the array in `/data/performance_indicators.yml`.
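
    As a sketch, a PI page template might invoke the helper like this. The `"quality"` value and the positional form of the `org` argument are assumptions for illustration; check the function definition in `/helpers/custom_helpers.rb` for the actual signature, and the data file for valid org values.

    ```erb
    <%# Hypothetical template for a PI page such as
        /handbook/engineering/performance-indicators/.
        "quality" is an assumed example org value; valid values come
        from the orgs properties in /data/performance_indicators.yml. %>
    <%= performance_indicators("quality") %>
    ```

    The helper then narrows the data file down to entries whose `orgs` include the given value and renders them through the partial templates.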