
Engineering Function Performance Indicators


Executive Summary

| KPI | Health | Reason | Next Steps |
| --- | --- | --- | --- |
| Hiring Actual vs Plan | Okay | Engineering is on plan, but we are lending some of our recruiters to Sales for this quarter, and we just put in place a new "one star minimum" rule that might decrease offer volume. | Health: Monitor health closely. Maturity: Get this into Periscope. |
| Average MRs/Dev/Month | Attention | The metric has not fully recovered to ~10. We are seeing positive trends, but need to address the negative correlation between growth and this metric. | Health: Compare to the future trend of hiring/onboarding rates. Maturity: First pass of Periscope updates made; need to fix ETL bug. |
| Mean time to merge (MTTM) | Problem | Rough charting tells us that the 85th percentile over the last 7 months takes upwards of 10 days to merge. We are investigating engineering cycle time further and increasing maintainers. Working group established. | First pass of Periscope updates made; need to fix ETL bug. Set an agreed-upon threshold and measure. |
| GitLab.com Availability | Attention | We're above the SLO threshold, but we also know the data needs to be better. | Maturity: Commit to and implement SLIs/SLOs for GitLab.com. Take generalized metrics and produce and track the resulting uptime metric. |
| Infrastructure Cost per GitLab.com Monthly Active Users | Okay | Met savings goals last quarter in working group. | This is one of the first priorities for the Operations Analyst, Infrastructure role. Manually make a rough calculation in the working group. |
| GitLab.com Performance | Attention | We are experiencing occasional slowness of both the frontend and Git operations. | TBD |
| Infrastructure cost vs plan | Okay | Met savings goals last quarter in working group. | This is one of the first priorities for the Operations Analyst, Infrastructure role. Manually make a rough calculation in the working group. |
| MTTM (Mean-Time-To-Mitigation) for S1-S2-S3 security vulnerabilities | Okay | Currently, our MTTM metrics show that we are effective. | Link to issues in gitlab-ce with `security` and S1, S2, or S3 labels. We are able to chart this data effectively at this point with the Splunk instance. We are effective at consistently maintaining MTTM for S1/S2/S3 vulnerabilities at 30/60/90 days or less. |

Key Performance Indicators

    Hiring Actual vs Plan

Are we able to hire high-quality people to build our product vision in a timely manner? Hiring information comes from BambooHR, where employees are in the division `Engineering`.

    URL(s)

    Health: Okay

Engineering is on plan, but we are lending some of our recruiters to Sales for this quarter, and we just put in place a new "one star minimum" rule that might decrease offer volume.

    Maturity: Level 2 of 3

    We have charts driven off of team.yml

    Next Steps

    Average MRs/Dev/Month

Average MRs per developer per month is a monthly evaluation of how many MRs, on average, each author merges. It's important because it measures productivity. The Senior Director of Development is the DRI on what projects are included.

    URL(s)

    Health: Attention

The metric has not fully recovered to ~10. We are seeing positive trends, but need to address the negative correlation between growth and this metric.

    Maturity: Level 2 of 3

We currently have automation and a dashboard, but thresholds need to be established. We need to move the metric to Periscope.

    Next Steps

    Mean time to merge (MTTM)

To be aligned with CycleTime from Development. The monthly mean time to merge MRs tells us, on average, how long it takes from submitting code to having it merged. The Senior Director of Development is the DRI on what projects are included.

    URL(s)

    Health: Problem

Rough charting tells us that the 85th percentile over the last 7 months takes upwards of 10 days to merge. We are investigating engineering cycle time further and increasing the number of maintainers. A working group has been established.

    Maturity: Level 2 of 3

    In Periscope. We need to determine thresholds for success/failure.

    Next Steps

    GitLab.com Availability

    Percentage of time during which GitLab.com is fully operational and providing service to users within SLO parameters.
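As a rough illustration of the definition above, availability can be computed as the share of a period during which the service was fully operational. The figures in this Ruby sketch are made up for illustration, not real uptime data:

```ruby
# Availability = (total time - downtime) / total time, as a percentage.
# Inputs here are illustrative, not actual GitLab.com uptime data.
def availability(total_minutes:, downtime_minutes:)
  ((total_minutes - downtime_minutes) / total_minutes.to_f * 100).round(3)
end

# e.g. 43 minutes of downtime in a 30-day month (43,200 minutes):
availability(total_minutes: 30 * 24 * 60, downtime_minutes: 43)
# => 99.9
```

For comparison, a 99.95% SLO over a 30-day month would allow roughly 21.6 minutes of downtime.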

    URL(s)

    Health: Attention

We’re above the SLO threshold, but we also know the data needs to be better.

    Maturity: Level 2 of 3

    We have a good understanding of the metric and are currently collecting it via Pingdom, but we need to implement it as a proper SLO.

    Next Steps

    Infrastructure Cost per GitLab.com Monthly Active Users

This metric reflects the dollar cost necessary to support one user on GitLab.com. It is an important metric because it allows us to estimate Infrastructure costs as our user base grows. Infrastructure cost comes from Netsuite; it is all expenses with the department name of `Infrastructure`, excluding account 6999 (Allocation). This cost is divided by [MAU](/handbook/product/metrics/#monthly-active-user-mau).
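The calculation itself is straightforward division; this Ruby sketch uses made-up figures (the real inputs come from Netsuite and the MAU metric):

```ruby
# Cost per MAU = infrastructure expenses / monthly active users.
# Figures below are illustrative, not actual GitLab financials.
def cost_per_mau(infra_cost_usd:, mau:)
  (infra_cost_usd / mau.to_f).round(2)
end

cost_per_mau(infra_cost_usd: 500_000, mau: 2_000_000)  # => 0.25
```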

    URL(s)

    Health: Okay

    Met savings goals last quarter in working group.

    Maturity: Level 2 of 3

    We don’t have a good way to accurately track this cost yet.

    Next Steps

    GitLab.com Performance

This metric needs to reflect the performance of GitLab as experienced by users. It should capture both frontend and backend performance. Even though the Infrastructure department will be responsible for this metric, it will need other departments such as Development, Quality, PM, and UX to positively effect change.

    URL(s)

    Health: Attention

We are experiencing occasional slowness of both the frontend and Git operations.

    Maturity: Level 1 of 3

    We need better tooling, better data capture, and to wire up this metric to the prioritization of performance issues.

    Next Steps

    Infrastructure cost vs plan

Tracks our actual infrastructure costs against our planned infrastructure costs for GitLab.com. We need this metric to manage our financial position.

    URL(s)

    Health: Okay

    Met savings goals last quarter in working group.

    Maturity: Level 1 of 3

    We don’t have a good way to accurately track this cost yet.

    Next Steps

    MTTM (Mean-Time-To-Mitigation) for S1-S2-S3 security vulnerabilities

The MTTM metric is an indicator of our efficiency in mitigating security vulnerabilities, whether they are reported externally (through the HackerOne bug bounty program or other means, such as security@gitlab.com emails) or internally. It is the average number of days to close issues in the GitLab CE project (project_id = '13083') that have the label `security` and S1, S2, or S3; it excludes issues with variations of the security label (e.g. `security review`) and S4 issues. Issues that are not yet closed are excluded from this analysis, which means historical data can change as older issues are closed and enter the analysis. The average-age-in-days threshold is set at the daily level.
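The calculation described above can be sketched in Ruby as follows. The issue data here is illustrative (the real source is the GitLab CE project), and the hash keys are assumptions for the example:

```ruby
require 'date'

# MTTM per severity = average days from creation to close, for closed
# issues labeled `security` (exact label, not e.g. `security review`)
# plus the given severity label. Open issues are excluded.
def mttm(issues, severity)
  closed = issues.select do |i|
    i[:closed_at] && i[:labels].include?('security') && i[:labels].include?(severity)
  end
  return nil if closed.empty?

  days = closed.map { |i| (i[:closed_at] - i[:created_at]).to_i }
  (days.sum.to_f / days.size).round(1)
end

issues = [
  { labels: %w[security S1], created_at: Date.new(2019, 1, 1), closed_at: Date.new(2019, 1, 21) },
  { labels: %w[security S1], created_at: Date.new(2019, 1, 1), closed_at: Date.new(2019, 1, 11) },
  { labels: %w[security S2], created_at: Date.new(2019, 1, 1), closed_at: nil }, # still open: excluded
]

mttm(issues, 'S1')  # => 15.0
```

Note how the still-open S2 issue drops out entirely, which is why historical averages shift as older issues close.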

    URL(s)

    Health: Okay

    Currently, our MTTM metrics show that we are effective.

    Maturity: Level 3 of 3

    We already have a Splunk instance that is ingesting all gitlab-com and gitlab-org issues and can visualize this data in dashboards. Currently, we are working with the Data Team to get this data into Periscope.

    Next Steps

    Regular Performance Indicators

    Engineering Pulse Survey

A pulse survey is a brief, frequently sent survey that provides near-real-time information about satisfaction in time series. It usually consists of just 1-2 NPS-style questions like “How likely are you to recommend GitLab to your friends as a place to work?”, with an optional comment box for qualitative answers. This is unlike an annual or bi-annual engagement survey, which has much more content but also has lower submission rates and can only be sent infrequently.

    URL(s)

    Health: Okay

    We have data from Q1 showing engagement and have received positive feedback on the format.

    Maturity: Level 2 of 3

    We have an MVC with graphs automated into Periscope. The level of manual work is minimal but can still be improved. Need to expand beyond 2 small teams.

    Next Steps

    Non-headcount budget vs plan

We need to spend our investors’ money wisely. We also need to run a responsible business to be successful, and to one day go on the public market.

    URL(s)

    Health: Unknown

Currently, Finance tells me when there is a problem; I’m not self-service.

    Maturity: Level 2 of 3

Right now it’s in a spreadsheet, and I get updates from Finance.

    Next Steps

    Average Location Factor

    We remain efficient financially if we are hiring globally, working asynchronously, and hiring great people in low-cost regions where we pay market rates. We track an average location factor by function and department so managers can make tradeoffs and hire in an expensive region when they really need specific talent unavailable elsewhere, and offset it with great people who happen to be in low cost areas.
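As a simple illustration, the average location factor for a group is the mean of its members' individual location factors. The values in this Ruby sketch are made up (real data is driven off team.yml):

```ruby
# Average location factor = mean of individual location factors.
# Sample factors below are illustrative, not real team data.
def average_location_factor(factors)
  (factors.sum / factors.size.to_f).round(2)
end

# One hire in an expensive region (0.70) offset by lower-cost regions:
average_location_factor([0.45, 0.55, 0.70, 0.62])  # => 0.58
```

This is how a manager can trade off one expensive hire against hires in lower-cost regions while keeping the department average at target.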

    URL(s)

    Health: Attention

    We are at our target of 0.58 exactly overall, but trending upward.

    Maturity: Level 3 of 3

    We have charts driven off of team.yml in the handbook. We have targets, but they are not visualized yet.

    Next Steps

    Diversity

    Diversity is one of our core values, and a general challenge for the tech industry. GitLab is in a privileged position to positively impact diversity in tech because our remote lifestyle should be more friendly to people who may have left the tech industry, or studied a technical field but never entered industry. This means we can add to the diversity of our industry, and not just play a zero-sum recruiting game with our competitors.

    URL(s)

    Health: Attention

    Engineering is now at the tech benchmark for gender diversity (~16%), but our potential is greater and we can do better. 20% should be our floor in technical roles. Other types of diversity are unknown.

    Maturity: Level 2 of 3

    The content is shared only in a closed metrics review, and does not have granularity. It’s not visualized, or in time series.

    Next Steps

    Handbook Update Frequency

    The handbook is essential to working remote successfully, to keeping up our transparency, and to recruiting successfully. Our processes are constantly evolving and we need a way to make sure the handbook is being updated at a regular cadence.

    URL(s)

    Health: Unknown

Unknown, but my sense is we are not doing enough. For instance, we have not been able to fully update the handbook after the Development department re-org (Dev Backend and Ops Backend are still present, although many of the new teams do have their own pages already).

    Maturity: Level 2 of 3

    We currently just have contribution graphs, which are a poor proxy for this.

    Next Steps

    Team Member Retention

    People are a priority and attrition comes at a great human cost to the individual and team. Additionally, recruiting (backfilling attrition) is a ludicrously expensive process, so we prefer to keep the people we have :)

    URL(s)

    Health: Okay

I seem to recall our attrition is now below 10%, which is great compared to the tech benchmark of 22% and the remote benchmark of 16%, but the fact that I can’t just look at a simple graph makes me nervous...

    Maturity: Level 2 of 3

There is manually curated data in a spreadsheet from PO.

    Next Steps

    Other PI Pages

    Legends

    Maturity

    Level Meaning
    Level 3 of 3 Measurable, time series, identified target for the metric, automated data extraction, dashboard in Periscope available to the whole company (if not the whole world)
    Level 2 of 3 About two-thirds done. E.g. Missing one of: automated data collection, defined threshold, or periscope dashboard.
    Level 1 of 3 About one-third done. E.g. Has one of: automated data collection, defined threshold, or periscope dashboard.
    Level 0 of 3 We only have an idea or a plan.

    Health

    Level Meaning
    Okay The KPI is at an acceptable level compared to the threshold
    Attention This is a blip, or we’re going to watch it, or we just need to enact a proven intervention
    Problem We'll prioritize our efforts here
    Unknown Unknown

    How to work with pages like this

    Data

The heart of pages like this is a data file called /data/performance_indicators.yml, which is in YAML format. Almost everything you need to do will involve edits to this file. Here are some tips:

    Pages

Pages like /handbook/engineering/performance-indicators/ are rendered by an ERB template.

These ERB templates call the helper function performance_indicators() that is defined in /helpers/custom_helpers.rb. This helper function calls in several partial templates to do its work.

    This function takes a required argument named org in string format that limits the scope of the page to a portion of the data file. Possible valid values for this org argument are listed in the orgs property of each element in the array in /data/performance_indicators.yml.
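The scoping by `org` might look roughly like the following Ruby sketch. The `orgs` property is described above; the `name` key and the sample entries are assumptions for illustration, not the actual schema of /data/performance_indicators.yml:

```ruby
require 'yaml'

# Hypothetical sketch of how performance_indicators() could limit the
# data file to one org: keep only entries whose `orgs` list contains it.
def performance_indicators(data, org)
  data.select { |pi| Array(pi['orgs']).include?(org) }
end

# Illustrative stand-in for /data/performance_indicators.yml:
data = YAML.safe_load(<<~YML)
  - name: Hiring Actual vs Plan
    orgs: [Engineering]
  - name: Net Retention
    orgs: [Sales]
YML

performance_indicators(data, 'Engineering').map { |pi| pi['name'] }
# => ["Hiring Actual vs Plan"]
```

The real helper also renders partial templates; this sketch only shows the data-scoping step.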