For an understanding of what this team is going to be working on take a look at the product vision.
The Verify:Runner Group is focused on all functionality related to the Runner.
This team maps to the Verify DevOps stage.
For an understanding of what our category direction is, take a look at the direction page.
When an engineer is actively working on an issue (workflow of ~workflow::"In dev" or further right on the current milestone), they will periodically leave status updates as top-level comments in the issue. The status comment should include the updated health status, any blockers, notes on what was done, whether review has started, and anything else the engineer feels is beneficial. If multiple people are working on the issue, also include whether this is a front end or back end update. The comment should include an update for each MR associated with the issue. Engineers should also update the health status of the issue at this time.
This update need not adhere to a particular format. Some ideas for formats:
- Health status: (On track|Needs attention|At risk)
  Notes: (Share what needs to be shared, especially when the issue needs attention or is at risk)
- Health status: (On track|Needs attention|At risk)
  What's left to be done:
  What's blocking: (probably empty when on track)
- ## Update <date>
  Health status: (On track|Needs attention|At risk)
  What's left to be done:
  #### MRs
  1. !MyMR1
  1. !MyMR2
  1. !MyMR3
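As an illustration, an update following the last format above might look like this (the date, notes, and MR references are placeholders, reusing the `!MyMR*` names from the template):

```markdown
## Update 2023-05-02

Health status: On track

What's left to be done:
- Wire the new option through to the executor
- Documentation update

#### MRs

1. !MyMR1 - in maintainer review
1. !MyMR2 - addressing reviewer feedback
1. !MyMR3 - draft, blocked on !MyMR2
```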
There are several benefits to this approach:
We measure the value we contribute by using Performance Indicators (PIs), which we define and use to track progress. As the GitLab Runner is the engine that enables GitLab continuous integration (CI), it is essential to analyze monthly active user metrics for CI when evaluating R&D investment prioritization decisions for the Runner. We have also defined additional Runner-specific performance indicators to inform those decisions.
As defined on the growth page, AARRR stands for Acquisition, Activation, Retention, Revenue, and Referral. This framework represents the customer journey, and the various means a product manager may apply a North Star metric (performance indicator) to drive a desired behavior in the funnel. As discussed above, since Runner is the engine that drives GitLab CI, at this time, we will leverage the same usage funnel definitions for Verify:Pipeline Execution.
When we launch the macOS Runners as a product, we will add a section here that defines the AARRR metrics for that offer. The rationale is that we are planning a separate landing page for the macOS Runners; as such, we can measure page views and click-throughs on calls to action.
Example AARRR metrics that we have in mind for that offer:
The following people are permanent members of the Verify:Runner group:
| Person | Role |
| ------ | ---- |
| Adrien Kohlbecker | Senior Backend Engineer, Verify:Runner |
| Arran Walker | Senior Backend Engineer, Verify:Runner |
| Darren Eastman | Senior Product Manager, Verify:Runner |
| Elliot Rushton | Backend Engineering Manager, Verify:Runner |
| Georgi Georgiev | Senior Backend Engineer, Verify:Runner |
| Miguel Rincon | Senior Frontend Engineer, Verify:Runner |
| Pedro Pombeiro | Senior Backend Engineer, Verify:Runner |
| Romuald Atchadé | Backend Engineer, Verify:Runner |
| Tomasz Maczukin | Senior Backend Engineer, Verify:Runner |
| Zeff Morgan | Senior Software Engineer in Test, Verify:Runner |
To find out who our stable counterparts are, look at the runner product category.
As a team we maintain several projects. The https://gitlab.com/gitlab-com/runner-maintainers group is added to each project with maintainer permission. We also try to align tools and versions used across them.
We spend a lot of time working in Go which is the language that GitLab Runner is written in. Familiarity with Docker and Kubernetes is also useful on our team.
Developers are responsible for creating merge requests that incrementally improve the product. Every developer should aspire to have their merge requests "sail through" the review process.
Each merge request created should include an explanation of why it is needed. Simply saying
"Closes Issue #1234" is incomplete: it puts the work of understanding the context on the reviewer, who then needs to go through the issue and put together the context they need to understand this particular change. Maintainers are encouraged to immediately send back MRs that don't make it clear why the change is needed.
When a merge request is sent to a reviewer or maintainer, it should be for the sake of performing a review that leads to a merge. If a merge request can't be merged at the end of the review, it should not make it to a maintainer. For example, if it is blocked by another MR that needs to be merged before it, or by a yet-to-be-decided constant value, then it cannot be merged and shouldn't be clogging up the maintainer's review queue.
As the author of a merge request, it is your responsibility to get your code merged. If it's not happening due to delays in review, it's up to you to reach out to the reviewer, consider offering to pair-review it on a call, or reassign it to a reviewer with more capacity.
When an MR is ready for review, or maintainer review, it is the author's responsibility to find a reviewer with capacity who agrees to take it on. All of this active work falls on the MR author as part of their responsibility for getting their code merged. Oftentimes our maintainers, being the friendly teammates that they are, will actively solicit MRs that need reviewing - especially when approaching a release date - but this shouldn't be expected.
Reviewers and maintainers have a limit on how many MRs they can have assigned for review at any given time. If no one has capacity to review your MR, consider doing anything possible to help address that situation. For example, offer to do a pair review with the reviewer to speed up a review. Otherwise you may need to wait for the backlog of reviews to be resolved by the reviewers before asking for more reviews. The current number of assigned reviews for each member of the team is viewable on the spreadsheet dashboard.
Reviews, especially maintainer reviews, are the biggest bottleneck for our team delivering improvements to the product. Because of this we must be strict with expectations of incoming MRs and reasonable with the expectations we have of how much work the maintainers can handle. Because an issue in GitLab can have a one-to-many relationship with MRs, MRs - not issues - are our representation of work in flight. To help maintain a reasonable amount of work on the shoulders of anyone doing reviews, we have a WIP limit of 5 MRs being reviewed by any person at any time. This can be tracked on the spreadsheet dashboard and will alert in the team Slack channel when anyone crosses the WIP limit in either direction.
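As a sketch of how the WIP-limit alert could be computed (the reviewer names and counts below are hypothetical; in practice the counts come from the spreadsheet dashboard or the GitLab API):

```python
# Hypothetical sketch of the WIP-limit check described above.
# Reviewer names and counts are made up for illustration.

WIP_LIMIT = 5  # max MRs assigned for review to any one person

def over_wip_limit(assigned_reviews, limit=WIP_LIMIT):
    """Return reviewers whose assigned-review count exceeds the limit, sorted by name."""
    return sorted(name for name, count in assigned_reviews.items() if count > limit)

assigned = {"reviewer_a": 6, "reviewer_b": 3, "reviewer_c": 5}
print(over_wip_limit(assigned))  # ['reviewer_a']
```

A real implementation would fetch each person's open review assignments from GitLab and post to the team Slack channel whenever this list changes.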
Because reviewing is a bottleneck for our team, it is important to be efficient. When reviewing an MR, consider high-level things before diving deeply into it: is the MR description adequate for you to understand the context without reading through hundreds of issue comments? Does the general approach to the solution make architectural sense? Is it clear to you how to test it out and what behavioural difference you should see when the code is exercised?
If you are a reviewer or maintainer who has reached your limit of assigned review MRs, consider asking for assistance from your peers by reassigning some to them. Additionally, consider pair-reviewing with the authors on a video call to speed up the review cycle - especially if you have multiple MRs to review from a single author.
Non-team member MRs count towards the WIP limit. At GitLab anyone can contribute, and codebases do not equal "teams" or "groups" (even if they happen to share a name). Therefore we should, from time to time, anticipate the occasional MR from a non-team member. Since other teams may not be familiar with our imposed WIP limits, we will need to accommodate them as best we can, and the reviewers may need help with rebalancing their workload. We should not accept these MRs as a valid reason to go above the WIP limits.
These limits are intended to help with the work load on the reviewers and maintainers. If you are feeling pressured to rush through reviews, talk to your EM. Quality is always more important than speed of review.
- `editor` access to the `group-verify` project in GCP
- `gitlab-com/runner-group` group on GitLab.com
- `team.yml` has the new member as a reviewer of
- `Runner Group Daily Standup` and `Runner Weekly Retro`
- `Verify` 1Password vault (requires creating an access request).
When a new developer joins Runner, their responsibility will include maintaining the runner project and all satellite repositories we own from their first day. This means that the developer will get Maintainer access to our repositories and will be added to the
runner-maintainers group so they appear in the merge request approval group.
This allows the onboarding developer to grow organically over time in their responsibilities, which might include (non-exhaustive) code reviews, incident response, operations and releases. We should still follow the traditional two-stage review process for merges in most cases (incident response and operations being exceptions if the situation warrants it).
Although maintainer access is provided from day one for practical purposes, we follow the same process outlined here. Any engineer inside the organization is welcome to become a maintainer of a project owned by the Runner team. The first step would be to become a trainee maintainer.
To start the maintainer training process, please create an issue in the Runner project's issue tracker using the Release Maintainer Trainee template.
We try to use the Kanban process for planning/development. Here is the single source of truth about how we work. Any future changes should be reflected here. If you feel like that process is not ideal in certain aspects or doesn't achieve a goal, you are more than welcome to open a merge request and suggest a change.
We use the following board for planning and milestone tracking.
~workflow::start: This is the backlog for the product manager, they are responsible for this column. Issues should be vertically stacked, the top one has the highest priority. At this stage the issue is not very fleshed out, and we still need to understand the problem. When an issue is in this column it means that the product manager is aware of the issue and will start working on it soon. This column is limited to 10.
~workflow::design: At this point, issues that impact the GitLab UI have a clear problem and are going through the design process, where the UX Definition of Done (DoD) is applied to any issues. In order to progress to
~workflow::planning breakdown, the UX DoD must be complete.
~workflow::planning breakdown: This is where the engineering team
will work with the product manager to make sure we have a good
proposal for the issue at hand. Each engineer should spend 1-2 hours each
week discussing the issue async with the community and product manager
to figure out a proposal. Particular attention should be given to thinking
about how the team can build the feature in an iterative way - what's the
smallest change we could make to get some version of the feature into the
hands of a customer in a single release?
Sometimes it's helpful to attach a milestone to the issue that is in this
column, so that an engineer can focus time doing PoC and research spikes for
that specific release. It's perfectly OK for an issue to be closed from this
column if we decided to split the issue into multiple ones. When both the
product manager and engineering team are happy with the proposal they should
move the issue to the next column,
~workflow::ready for development. The
people assigned to this issue are most likely engineers and are responsible
for getting this issue to the next stage.
~workflow::ready for development: At this stage, we should have a good idea of what the issue requires, and how much work it is. When an engineer is out of tasks to work on and has no merge requests to review or any issues that are
~workflow::In dev, they should pick the one on top, assign it to themselves, and move it to the next column. In this column, each issue should have a milestone attached to it to indicate to the customer in which release it will be done.
~workflow::In dev: Here is where the engineer starts working on the issue. If the issue is not clear on what needs to be achieved, they should discuss it with the team to see if it needs to go back to previous stages. This column is limited to 3 issues per engineer in the team. If they have more issues assigned to them we need to reevaluate the workload of the engineer because there is most likely a lot of context switching which is not effective.
~workflow::In review: When the issue has either 1 merge request or multiple merge requests ready for review it should be moved to the
~workflow::In review column. There is a limit of 20 issues that should be in review. If we are at the limit, engineers should not pick up new work but see if they can help out in the review process.
~workflow::blocked: There are multiple reasons why an issue can be blocked. If the issue has a milestone attached to it, the issue blocking it should have the same milestone or 1 earlier, and have a higher priority. We should also mark the issue as blocked using the related issue feature.
~workflow::In review. If not, pick a new issue from
~workflow::ready for development.
`/label` quick action to keep the same priority stack.
As discussed above, each issue inside of development should have a milestone
attached to it. If stack-ranked properly, the issues on top should be in the
current milestone, and then the upcoming milestone. The product manager will
take the top 5 issues in the ~workflow::ready for development column that are
intended for the upcoming milestone and use these issues for the kickoff.
Each week the EM and PM on the team get together and select 1-3 issues that are our next highest priority to be broken down. On a rough rotation, the EM will then assign those issues to individual engineers. Over the following 1-2 weeks the assigned engineers will be responsible for coming up with an implementation plan for the issue - exploring any unknowns, edge cases, or compatibility challenges that may come up and proposing how the issue can be broken down into smaller, more reasonably sized tasks. During this time frame the other engineers on the team will also be asked to become familiar with the issue. On the team call that follows, the assigned engineers will lead a discussion about the considerations and challenges they uncovered and the implementation plan they proposed. After the discussion the assigned engineer will capture any notes from the conversation and the implementation plan in the issue itself, move the issue into
~workflow::ready for development, and unassign themselves from the issue so that it can be picked up in the priority based approach described by the Kanban workflow above.
In general, technical debt, backstage work, or other classifications of development work that don't directly contribute to a user's experience with the runner are handled the same way as features or bugs and covered by the above Kanban style process. The one exception is that each engineer on the team can only have 1 technical debt issue in flight at a time. This means that if they start working on a technical debt type issue they cannot start another one until the first one is merged. In the event that an engineer has more than one technical debt item in flight, they should choose which one to keep working on and move the others to the "in development" or "ready for review" columns depending on their status. The intent of this limitation is to constrain the number of technical debt issues that are in review at any given time to help ensure we always have most of our capacity available to review and iterate on features or bugs.
The team has a monthly retrospective meeting on the first Tuesday of the month. The agenda can be found here (internal link).
At GitLab, our release post policy specifies that deprecation notices need to be added to the release post at least two cycles before the release when the feature is removed or officially obsolete. There are typically several deprecations or removals that the runner team needs to manage across the main runner project and the other projects that this team maintains. As such, the runner development team uses the following process to manage deprecations and removals. This process should start no later than one month after the launch of a major release.
Each quarter we have an error budget for how many regressions our releases can cause.
Following the point system, each quarter we have 100 points. Each type of regression has a priority/severity, and each priority has a certain number of points associated with it; the higher the priority, the more points:
At the beginning of every quarter the error budget is reset to 100.
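The bookkeeping can be sketched as follows. Note that the points-per-severity values below are made-up placeholders for illustration; the team's actual point table is the source of truth:

```python
# Illustrative sketch of the quarterly error budget bookkeeping.
# The points assigned to each severity here are hypothetical.

SEVERITY_POINTS = {
    "severity::1": 40,
    "severity::2": 20,
    "severity::3": 10,
    "severity::4": 5,
}
QUARTERLY_BUDGET = 100  # reset at the beginning of every quarter

def remaining_budget(regressions, budget=QUARTERLY_BUDGET):
    """Deduct the points for each regression's severity from the quarterly budget."""
    return budget - sum(SEVERITY_POINTS[severity] for severity in regressions)

# Two severity::2 regressions and one severity::3 in the quarter:
print(remaining_budget(["severity::2", "severity::2", "severity::3"]))  # 50
```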
Each regression should have a retrospective item with corrective actions
to prevent any similar regression from happening again. These issues
should be labeled with
While Google's original paper advocates for stopping all feature work and focusing the team only on reliability improvements for the quarter, we have taken a more minimal first step: exceeding our error budget will only be a callout for our Product Manager to immediately prioritise corrective actions identified during the retrospective. Other feature work by team members not assigned to those corrective actions can proceed normally.
Below is the history of each quarter; it should be filled in using the following template:
- Regressions:
  - Issue
  - Retrospective Issue
  - Corrective Action Issues
  - Error Budget: -X
- Final Error budget: 30
This was inspired by the Google Error budget. It's used to balance reliability and new features.
In order to provide visibility into our review queue and priorities in line with our Community Contribution SLO we will use the ~"Review::P*" set of labels.
| Types of issues & users | Priority Label |
| ----------------------- | -------------- |
| All users contributing to refined issues | ~"Review::P1" |
| Paying users from non-refined issues | ~"Review::P2" |
| Non-paying users from non-refined issues | ~"Review::P3" |
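A minimal sketch of how that triage rule could be encoded, assuming only the two attributes in the table above:

```python
# Sketch of the review-priority triage rule from the table above.

def review_priority(issue_is_refined: bool, author_is_paying: bool) -> str:
    """Map a community contribution to its ~"Review::P*" label."""
    if issue_is_refined:
        return "Review::P1"  # all users contributing to refined issues
    if author_is_paying:
        return "Review::P2"  # paying users, non-refined issues
    return "Review::P3"      # non-paying users, non-refined issues

print(review_priority(issue_is_refined=False, author_is_paying=True))  # Review::P2
```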
When a community contribution is abandoned by the original author, it's up to us to get it across the finish line. The list below outlines the process.
~"coach will finish"label to the merge request, and close the merge request.
~"Community contribution"label to it to make it clear that it is/was being worked on by a community member.
Issues worked on by the Runner group have a group label of
~group::runner. Issues that contribute to the verify stage of the devops toolchain have the
Our code review process follows the general process where you choose a reviewer (usually not a maintainer) and then send it over to a maintainer for the final review.
The gitlab-runner codebase, which is the primary codebase that the runner group works in, follows a 7th-of-the-month feature freeze. This is documented with the codebase here and is important to be aware of, as it drastically differs from most of the rest of the GitLab release timing.
See dedicated page.