The Verify:Runner Group is responsible for developing and maintaining GitLab Runner, features and capabilities for Runner SaaS, and delivering a lovable user experience for managing a build agent fleet at scale. Our mission is to provide market-leading features to simplify CI/CD workflows running anywhere and on any computing platform and support our thriving and active community contributors.
This team maps to the Verify DevOps stage.
The vision and strategy for the runner product categories are covered on the following direction pages.
Our UX vision, more information on how UX and Development collaborate, and other UX-related information are documented in the UX Strategy page. Our Jobs to be Done are documented in Verify:Runner JTBD and provide a high-level view of the main objectives. Our User Stories are documented in Runner Group - User Stories; they guide our solutions as we create design deliverables and ultimately map back to the JTBDs.
In the OPS section, we continuously define, measure, analyze, and iterate on Performance Indicators (PIs). One of the goals of the PI process is to ensure that, as a product team, we stay focused on strategic and operational improvements that move leading indicators, the precursors of future success.
The following people are permanent members of the Verify:Runner group:
To find out who our stable counterparts are, look at the runner product category.
(Sisense↗) We also track our backlog of issues, including past due security and infradev issues, and total open SUS-impacting issues and bugs.
(Sisense↗) MR Type labels help us report what we're working on to industry analysts in a way that's consistent across the engineering department. The dashboard below shows the trend of MR Types over time and a list of merged MRs.
As a team we maintain several projects. The https://gitlab.com/gitlab-com/runner-maintainers group is added to each project with maintainer permission. We also try to align tools and versions used across them.
We spend a lot of time working in Go, which is the language GitLab Runner is written in. Familiarity with Docker and Kubernetes is also useful on our team.
The GitLab Runner product development team maintains the following roadmap issue boards aligned to delivery milestones.
These roadmap views are the tactical implementation of the GitLab Runner Product Vision and Strategy. They depict the actual work that the runner development team is actively working on for the current milestone or planning to deliver in a near-term milestone.
Note: Even though issues are assigned to a milestone, which is important for roadmap planning and general direction setting, engineering commits to delivery only on completion of the iteration plan for that milestone.
The GitLab Runner team works in monthly iterations. While the feature freeze for the core runner project is on the 7th of each month, in preparation for the release on the 22nd, iteration planning dates for the upcoming milestone are aligned with GitLab's product development timeline.
At a minimum 14 days before the start of a milestone, the runner PM reviews and, as needed, re-prioritizes the features, bugs, technical debt, and UX debt to be included for developer assignment in the iteration. Note: this is, and must be, a conversation between the PM, the EM, and members of the team.
The commitments for the iteration plan are directly related to the capacity of the team for the upcoming iteration. Therefore, to finalize the iteration plan for a milestone, we re-evaluate our sources of input and work through the following label workflow:

1. A scoped label `~candidate::x.y` is added to each candidate issue. For example, `~candidate::15.0`.
1. Each candidate issue is reviewed to determine whether the `~workflow::ready for development` label can be applied or the issue needs further refinement. The priority is the next-in-queue milestone.
1. Issues that need more work get the `~workflow::refinement` label.
1. Once the plan is finalized, the `~deliverable` label is applied to all issues in the plan and the scoped label `~candidate::x.y` is removed from each issue.

The iteration plan is scoped using the following issue limits per category:

| Runner Category | Sub-category | Issue limit |
|---|---|---|
| Runner Core | Security Issues | 3 |
| Runner Core | Bugs | 3 |
| Runner Core | Tech Debt | 3 |
| Runner Core | Features | 3 |
| Runner SaaS | Features | 3 |
| Runner Fleet | Features | 3 |
As a developer on the runner team, you will contribute to the various runner projects. Since the GitLab Runner project reviewers and maintainers review all code contributions (from runner team members and community contributors alike), we must try to be as efficient as possible when submitting merge requests for review.
Developers are responsible for creating merge requests that incrementally improve the product. Every developer should aspire to have their merge requests "sail through" the review process.
Each merge request created should include an explanation of why it is needed. Simply saying `Closes Issue #1234` is incomplete: it puts the work of understanding the context on the reviewer, who then needs to go through the issue and piece together the background they need to understand this particular change. Maintainers are encouraged to immediately send back MRs that don't make it clear why the change is needed.
When a merge request is sent to a reviewer or maintainer, it should be for the sake of performing a review that leads to a merge. If a merge request can't be merged at the end of the review, it should not make it to a maintainer. For example, if it is blocked by another MR that needs to be merged first, or by a yet-to-be-decided constant value, then it cannot be merged and shouldn't be clogging up the maintainer's review queue.
As the author of a merge request, it is your responsibility to get your code merged. If it's not happening due to delays in review, it's up to you to reach out to the reviewer, consider offering to pair-review it on a call, or reassign it to a reviewer with more capacity.
When an MR is ready for reviewer or maintainer review, it is the author's responsibility to find a reviewer with capacity who agrees to take it on. All of this active work falls on the MR author as part of their responsibility for getting their code merged. Oftentimes our maintainers, being the friendly teammates that they are, will actively solicit MRs that need reviewing, especially when approaching a release date, but this shouldn't be expected.
Reviewers and maintainers have a limit on how many MRs they can have assigned for review at any given time. If no one has capacity to review your MR, consider doing anything possible to help address that situation; for example, offer a pair review with the reviewer to speed things up. Otherwise you may need to wait for the backlog of reviews to be resolved by the reviewers before asking for more reviews. The current number of assigned reviews for each member of the team is viewable on the spreadsheet dashboard.

Reviews, especially maintainer reviews, are the biggest bottleneck for our team delivering improvements to the product. Because of this, we must be strict with expectations of incoming MRs and reasonable with the expectations we place on how much work the maintainers can handle. Because an issue in GitLab can have a one-to-many relationship with MRs, MRs, not issues, are our representation of work in flight. To keep a reasonable amount of work on the shoulders of anyone doing reviews, we have a WIP limit of 5 MRs being reviewed by any person at any time. This can be tracked on the spreadsheet dashboard, which alerts the team Slack channel when anyone crosses the WIP limit in either direction.
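As a minimal illustration of how such a check could work, the sketch below counts the open MRs currently assigned to a reviewer through the GitLab REST API. This is not the team's actual dashboard implementation; the reviewer usernames are hypothetical placeholders, while `scope`, `state`, and `reviewer_username` are documented query parameters of the merge requests list endpoint.

```go
package main

import (
	"encoding/json"
	"fmt"
	"net/http"
	"net/url"
	"os"
)

// wipLimit mirrors the review WIP limit described above.
const wipLimit = 5

// openReviewCount returns how many open MRs are assigned to the given
// reviewer, using the GitLab merge requests list endpoint.
func openReviewCount(username, token string) (int, error) {
	q := url.Values{}
	q.Set("scope", "all")
	q.Set("state", "opened")
	q.Set("reviewer_username", username)
	q.Set("per_page", "100")

	req, err := http.NewRequest("GET", "https://gitlab.com/api/v4/merge_requests?"+q.Encode(), nil)
	if err != nil {
		return 0, err
	}
	req.Header.Set("PRIVATE-TOKEN", token)

	resp, err := http.DefaultClient.Do(req)
	if err != nil {
		return 0, err
	}
	defer resp.Body.Close()

	// We only need the count, so a minimal struct is enough.
	var mrs []struct {
		IID   int    `json:"iid"`
		Title string `json:"title"`
	}
	if err := json.NewDecoder(resp.Body).Decode(&mrs); err != nil {
		return 0, err
	}
	return len(mrs), nil
}

func main() {
	token := os.Getenv("GITLAB_TOKEN")
	// Hypothetical reviewer handles, for illustration only.
	for _, user := range []string{"reviewer-a", "reviewer-b"} {
		n, err := openReviewCount(user, token)
		if err != nil {
			fmt.Fprintf(os.Stderr, "%s: %v\n", user, err)
			continue
		}
		status := "within WIP limit"
		if n > wipLimit {
			status = "over WIP limit"
		}
		fmt.Printf("%s: %d assigned reviews (%s)\n", user, n, status)
	}
}
```

Pagination is ignored here because we only care whether the count exceeds the limit of 5; a real implementation would also need to handle paging and rate limiting.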
Because reviewing is a bottleneck for our team, it is important to be efficient. When reviewing an MR, consider high-level questions before diving deeply into it: is the MR description adequate for you to understand the context without reading through hundreds of issue comments? Does the general approach to the solution make architectural sense? Is it clear how to test it and what behavioural difference you should see when the code is exercised?
If you are a reviewer or maintainer who has reached your limit of assigned review MRs, consider asking your peers for assistance by reassigning some to them. Additionally, consider pair-reviewing with the authors on a video call to speed up the review cycle, especially if you have multiple MRs to review from a single author.
Non-team-member MRs count towards the WIP limit. At GitLab anyone can contribute, and codebases do not equal "teams" or "groups" (even if they happen to share a name). Therefore we should anticipate the occasional MR from a non-team member. Since other teams may not be familiar with our WIP limits, we will need to accommodate them as best we can, and the reviewers may need help re-balancing their workload. We should not treat these MRs as a valid reason to go above the WIP limits.
These limits are intended to help with the workload on the reviewers and maintainers. If you are feeling pressured to rush through reviews, talk to your EM. Quality is always more important than speed of review.
When a new member joins the team, make sure they have:

- `editor` access to the `group-verify` project in GCP
- `maintainer` access to the `gitlab-com/runner-group` group on GitLab.com
- an entry in `team.yml` listing them as a reviewer of `gitlab-org/gitlab-runner` and `gitlab-org/ci-cd/custom-executor-drivers/autoscaler`
- invitations to the `Runner Group Daily Standup` and `Runner Weekly Retro` meetings
- access to the `Verify` 1Password vault (requires creating an access request)

When a new developer joins Runner, their responsibilities include maintaining the runner project and all satellite repositories we own from their first day. This means that the developer will get Maintainer access to our repositories and will be added to the `runner-maintainers` group so they appear in the merge request approval group.
This allows the onboarding developer to grow organically over time in their responsibilities, which might include (non-exhaustive) code reviews, incident response, operations and releases. We should still follow the traditional two-stage review process for merges in most cases (incident response and operations being exceptions if the situation warrants it).
Although maintainer access is provided from day one for practical purposes, we follow the same process outlined here. Any engineer inside the organization is welcome to become a maintainer of a project owned by the Runner team. The first step is to become a trainee maintainer.
To start the maintainer training process, please create an issue in the Runner project's issue tracker using the Release Maintainer Trainee template.
In general, technical debt, backstage work, and other classifications of development work that don't directly contribute to a user's experience with the runner are handled the same way as features or bugs and are covered by the Kanban-style process above. The one exception is that each engineer on the team can have only one technical debt issue in flight at a time. This means that once they start working on a technical debt issue, they cannot start another one until the first is merged. In the event that an engineer has more than one technical debt item in flight, they should choose which one to keep working on and move the others to the "in development" or "ready for review" columns, depending on their status. The intent of this limitation is to constrain the number of technical debt issues in review at any given time, to help ensure we always have most of our capacity available to review and iterate on features or bugs.
The team has a monthly retrospective meeting on the first Tuesday of the month. The agenda can be found here (internal link).
At GitLab, our release post policy specifies that deprecation notices need to be added to the release post at least two cycles before the release in which the feature is removed or made officially obsolete. There are typically several deprecations or removals that the runner team needs to manage across the main runner project and the other projects this team maintains. As such, the runner development team uses the following process to manage deprecations and removals. This process should start no later than one month after the launch of a major release.
Daily status updates should state whether the issue is still `On Track` to be completed within the week, update the health status if it is not, note what will be worked on today, and include the current `On Track` (or other health) status in your message.

When an engineer is actively working on an issue (workflow of `~workflow::"In dev"` or further right, on the current milestone), they will periodically leave status updates as top-level comments in the issue. The status comment should include the updated health status, any blockers, notes on what was done, whether review has started, and anything else the engineer feels is beneficial. If there are multiple people working on the issue, also include whether this is a front end or back end update. An update for each MR associated with the issue should be included in the update comment. Engineers should also update the health status of the issue at this time.
This update need not adhere to a particular format. Some ideas for formats:
Health status: (On track|Needs attention|At risk)
Notes: (Share what needs to be shared, especially when the issue needs attention or is at risk)

Health status: (On track|Needs attention|At risk)
What's left to be done:
What's blocking: (probably empty when on track)

## Update <date>
Health status: (On track|Needs attention|At risk)
What's left to be done:
#### MRs
1. !MyMR1
1. !MyMR2
1. !MyMR3
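For illustration, a filled-in update following the last format might look like the example below; the date and MR references are hypothetical.

```
## Update 2024-05-14
Health status: On track
What's left to be done: address review feedback on !1235, then maintainer review.
#### MRs
1. !1234 - merged
1. !1235 - in reviewer review
```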
There are several benefits to this approach:
Some notes/suggestions:
Each quarter we have an error budget for how many regressions our releases can cause. Following the point system, we start each quarter with 100 points. Each regression has a priority/severity, and each priority has a certain number of points associated with it; the higher the priority, the more points:

- `~priority::1`: 70
- `~priority::2`: 30
- `~priority::3`: 15
- `~priority::4`: 5

At the beginning of each quarter the error budget is reset to 100.
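As a hypothetical worked example: a quarter with one `~priority::2` regression and two `~priority::4` regressions consumes 30 + 5 + 5 = 40 points, leaving a final error budget of 100 - 40 = 60.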
Each regression should have a retrospective item with corrective actions to prevent any similar regression from happening again. These issues should be labeled with `~"corrective action"`.
While Google's original paper advocates for stopping all feature work and focusing the team only on reliability improvements for the quarter, we have taken a more minimal first step: exceeding our error budget will only be a callout for our Product Manager to immediately prioritise corrective actions identified during the retrospective. Other feature work by team members not assigned to those corrective actions can proceed normally.
Below is the history of each quarter; it should be filled in using the following template:
- Regressions:
- Issue
- Retrospective Issue
- Corrective Action Issues
- Error Budget: -X
- Final Error budget: 30
This was inspired by Google's error budget concept. It's used to balance reliability and new features.
In order to provide visibility into our review queue and priorities, in line with our Community Contribution SLO, we use the `~"Review::P*"` set of labels.

| Types of issues & users | Priority Label |
|---|---|
| All users contributing to refined issues | `~"Review::P1"` |
| Paying users from non-refined issues | `~"Review::P2"` |
| Non-paying users from non-refined issues | `~"Review::P3"` |
When a community contribution is abandoned by the original author, it's up to us to get it across the finish line. The list below outlines the process:

- Apply the `~"coach will finish"` label to the merge request, and close the merge request.
- When a team member picks the work back up in a new merge request, add the `~"Community contribution"` label to it to make it clear that it is/was being worked on by a community member.

Issues worked on by the Runner group carry the group label `~group::runner`. Issues that contribute to the Verify stage of the DevOps toolchain have the `~devops::verify` label.
GitLab.com: @gitlab-com/runner-group
Slack: #g_runner
Our code review follows the general process: choose a reviewer (usually not a maintainer), then send the MR to a maintainer for the final review.
Current maintainers are members of the `runner-maintainers` group.

Current reviewers are members of the `runner-group` group.
The gitlab-runner codebase, which is the primary codebase the runner group works in, follows a feature freeze on the 7th of the month. This is documented with the codebase here, and it is important to be aware of because it differs drastically from most of the rest of the GitLab release timing.
As part of the pre-sales and post-sales engagement, your customer may have in-depth questions regarding topics such as GitLab Runner configuration, autoscaling options, how concurrency works, distributing the CI jobs workload, monitoring runners, and so on. The goal of the process below is to enable the runner team to be as efficient as possible in providing the level of support that our sales team and customers require.
See dedicated page.