By 2025, our vision for GitLab Runner is that the runner's setup and day-to-day operations at scale be an almost zero-friction experience.
Our mission is to enable organizations to run GitLab CI/CD jobs on any computing platform in an operationally efficient and highly secure way at any scale.
This team maps to the Verify DevOps stage.
The product strategy and roadmap for the runner product categories are covered on the following direction pages.
Our UX vision, more information around how UX and Development collaborate, and other UX-related information will be documented in the UX Strategy page. Our Jobs to be Done are documented in Verify:Runner JTBD and provide a high-level view of the main objectives. Our User Stories are documented in Runner Group - User Stories, which guide our solutions as we create design deliverables and ultimately map back to JTBDs.
In the OPS section, we continuously define, measure, analyze, and iterate on Performance Indicators (PIs). One of the PI process goals is to ensure that, as a product team, we are focused on strategic and operational improvements to improve leading indicators, precursors of future success.
The following people are permanent members of the Verify:Runner group:
Person | Role |
---|---|
Darren Eastman | Senior Product Manager, Verify:Runner |
For a more comprehensive list of counterparts, look at the runner product category.
(Sisense↗) We also track our backlog of issues, including past due security and infradev issues, and total open System Usability Scale (SUS) impacting issues and bugs.
(Sisense↗) MR Type labels help us report what we're working on to industry analysts in a way that's consistent across the engineering department. The dashboard below shows the trend of MR Types over time and a list of merged MRs.
(Sisense↗) Flaky tests are problematic for many reasons.
(Sisense↗) Slow tests are impacting the GitLab pipeline duration.
As a team we maintain several projects. The https://gitlab.com/gitlab-com/runner-maintainers group is added to each project with maintainer permission. We also try to align tools and versions used across them.
- [...] (ci* and r-saas-* folders)
- [...] (runner-mananger* roles)

The following projects depend on the public Runner APIs and should be taken into consideration in the scope of any changes or deprecations to the public API surface:
Project | API |
---|---|
GitLab Terraform Provider | REST API |
GitLab CLI | REST API |
We spend a lot of time working in Go, the language GitLab Runner is written in. We also contribute to the main GitLab app, working in Rails and Vue.js. Familiarity with Docker and Kubernetes is also useful on our team.
We work in monthly iterations. Iteration planning dates for the upcoming milestone are aligned with GitLab's product development timeline.
At a minimum, 30 days before the start of a milestone, the runner PM reviews and re-prioritizes as needed the features to be included in the iteration planning issue. The planning issue is a tool for asynchronous collaboration between the PM, EM and members of the team. We use cross-functional prioritization to guide the collaboration process.
The commitments for the iteration plan are directly related to the capacity of the team for the upcoming iteration. Therefore, to finalize the iteration plan (resource allocation) for a milestone, we evaluate and consider the following:
- [...] (Runner::P1)
- [...] ~candidate::x.y to each issue. For example ~candidate::16.0.
- [...] deliverable label to issues based on team capacity. The deliverable label signals a commitment for delivery and is tied directly to our team KPIs. Any issue not receiving the deliverable label will be treated as stretch and pulled in as team members have capacity.

As we have a lot of involvement with our stable counterparts and reliability team, we also add a section to our iteration plan to reflect any blocking or related issues.

- [...] blocking or related reliability issues to the iteration plan.

To indicate priority of issues during an iteration we may use labels ~"Runner::P1" ~"Runner::P2" ~"Runner::P3".
At a minimum we will always identify our top priorities using ~"Runner::P1".
- ~"Runner::P1" means "elevated priority". We aim to deliver all or most of these issues.
- ~"Runner::P2" means "normal priority".
- ~"Runner::P3" means "reduced priority".

~"Runner::P*" labels can and should differ from ~priority:* labels.
~priority:* labels imply the timeline for when issues will be addressed, while ~"Runner::P*" labels indicate priority for the scheduled iteration.

We follow the product development flow. Our team uses one issue as SSOT for design, backend, and frontend work.
Once a problem is validated, the issue enters the design phase where the product designer collaborates with the team to ideate solutions and explore different approaches before converging on a single solution that is feasible and has requirements that meet the business goals.
Sometimes we need to increase our confidence that the proposed solution meets the user's needs and expectations. This confidence can be obtained from additional research during the solution validation phase.
Following the design and validation phases, the problem should already be broken down into the quickest change possible to improve the user's outcome and be ready for a more detailed review by engineering before moving to the build track.
Once the PM intends to prioritize the issue for the next milestone, the ~"workflow::planning breakdown" label is applied and the EM will assign a developer to further break down and apply weights to that work so that the issue can be moved to ~"workflow::ready for development".
At the end of the iteration we release Runner and associated projects. The release process is documented here.
As a developer on the runner team, you will be contributing to the various runner projects. Since the GitLab Runner project reviewers and maintainers review all code contributions (runner team members and community contributions), we must try and be as efficient as possible when submitting merge requests for review.
We follow the merge request author responsibility guidelines.
We follow the code review guidelines.
To help authors find a reviewer with capacity to take on a review, we have a spreadsheet dashboard that shows the number of MRs any of the backend members of the Verify:Runner or Verify:Runner SaaS groups have assigned.
If you are a reviewer or maintainer who has reached your limit of assigned review MRs, consider asking for assistance from your peers by reassigning some to them. Additionally, consider pair-reviewing with the authors on a video call to speed up the review cycle - especially if you have multiple MRs to review from a single author.
Non-team member MRs count towards the WIP limit. At GitLab anyone can contribute, and codebases do not equal "teams" or "groups" (even if they happen to share a name). Therefore we should, from time to time, anticipate the occasional MR from a non-team member. Since other teams may not be familiar with our imposed WIP limits, we will need to accommodate them as best we can and the reviewers may need to help with re-balancing their workload. We should not accept these MRs as a valid reason to go above the WIP limits.
These limits are intended to help with the workload on the reviewers and maintainers. If you are feeling pressured to rush through reviews, talk to your EM. Quality is always more important than speed of review.
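As a rough illustration of how the review-capacity dashboard and the WIP limit interact, the sketch below picks the least-loaded reviewer who is still under the limit. The function name, the data shape, and the limit value of 3 are all assumptions made for this example; this page does not state the actual limit or how the spreadsheet computes its numbers.

```python
from collections import Counter
from typing import Iterable, Optional

# Hypothetical value: the actual WIP limit is not stated on this page.
REVIEW_WIP_LIMIT = 3


def pick_reviewer(
    reviewers: Iterable[str],
    assigned_reviews: Iterable[str],
    wip_limit: int = REVIEW_WIP_LIMIT,
) -> Optional[str]:
    """Return the reviewer with the fewest assigned MRs who is under the limit.

    `assigned_reviews` holds one reviewer name per open MR under review,
    including MRs from non-team members, which count towards the limit.
    Returns None when every reviewer is already at the limit.
    """
    load = Counter(assigned_reviews)
    candidates = [name for name in reviewers if load[name] < wip_limit]
    if not candidates:
        return None
    # Compare by (load, name) so ties are broken deterministically.
    return min(candidates, key=lambda name: (load[name], name))
```

A return value of None is the signal to talk to peers or the EM about re-balancing rather than to exceed the limit.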
- [...] editor access to the group-verify project in GCP
- [...] maintainer to the gitlab-com/runner-group group on GitLab.com
- [...] team.yml has the new member as a reviewer of gitlab-org/gitlab-runner and gitlab-org/ci-cd/custom-executor-drivers/autoscaler
- [...] Verify 1password vault (requires creating an access request).

When a new developer joins Runner, their responsibility will include maintaining the runner project and all satellite repositories we own from their first day. This means that the developer will get Maintainer access to our repositories and will be added to the runner-maintainers group so they appear in the merge request approval group.
This allows the onboarding developer to grow organically over time in their responsibilities, which might include (non-exhaustive) code reviews, incident response, operations and releases. We should still follow the traditional two-stage review process for merges in most cases (incident response and operations being exceptions if the situation warrants it).
Although maintainer access is provided from day one for practical purposes, we follow the same process outlined here. Any engineer inside of the organization is welcome to become a maintainer of a project owned by the Runner team.
In general, technical debt, backstage work, or other classifications of development work that don't directly contribute to a user's experience with the runner are handled the same way as features or bugs and covered by the above Kanban-style process. The one exception is that each engineer on the team can only have one technical debt issue in flight at a time. This means that if they start working on a technical debt issue they cannot start another one until the first one is merged. In the event that an engineer has more than one technical debt item in flight, they should choose which one to keep working on and move the others to the "in development" or "ready for review" columns depending on their status. The intent of this limitation is to constrain the number of technical debt issues that are in review at any given time to help ensure we always have most of our capacity available to review and iterate on features or bugs.
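The one-in-flight rule can be expressed as a simple check. The following is a minimal sketch under stated assumptions: the Issue shape, the "technical debt" label name, and the in_flight flag are hypothetical and are not taken from the team's actual board configuration.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, List

# Hypothetical label name: this page does not say how technical debt issues are labeled.
TECH_DEBT_LABEL = "technical debt"


@dataclass
class Issue:
    iid: int
    assignee: str
    labels: List[str]
    in_flight: bool  # True once work has started and the change is not yet merged


def over_tech_debt_limit(issues: List[Issue], limit: int = 1) -> Dict[str, List[int]]:
    """Map each engineer exceeding the limit to their in-flight technical debt issue IDs."""
    in_flight = defaultdict(list)
    for issue in issues:
        if issue.in_flight and TECH_DEBT_LABEL in issue.labels:
            in_flight[issue.assignee].append(issue.iid)
    # Keep only engineers with more in-flight items than the limit allows.
    return {eng: iids for eng, iids in in_flight.items() if len(iids) > limit}
```

An engineer who appears in the result would choose one issue to keep working on and move the others back, as described above.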
The team has a monthly retrospective meeting on the first Tuesday of the month. The agenda can be found here (internal link).
At GitLab, our release post policy specifies that deprecation notices need to be added to the release post at least two cycles before the release when the feature is removed or officially obsolete. There are typically several deprecations or removals that the runner team needs to manage across the main runner project and the other projects that this team maintains. As such, the runner development team uses the following process to manage deprecations and removals. This process should start no later than one month after the launch of a major release.
- [...] On Track within the week.
- [...] On Track status.
- [...] On Track today.
- [...] On Track status in your message.

When an engineer is actively working (workflow of ~workflow::"In dev" or further right on current milestone) on an issue they will periodically leave status updates as top-level comments in the issue. The status comment should include the updated health status, any blockers, notes on what was done, whether review has started, and anything else the engineer feels is beneficial. If multiple people are working on the issue, also include whether this is a frontend or backend update. The comment should include an update for each MR associated with the issue. Engineers should also update the health status of the issue at this time.
This update need not adhere to a particular format. Some ideas for formats:
Health status: (On track|Needs attention|At risk)
Notes: (Share what needs to be shared, especially when the issue needs attention or is at risk)

Health status: (On track|Needs attention|At risk)
What's left to be done:
What's blocking: (probably empty when on track)

## Update <date>
Health status: (On track|Needs attention|At risk)
What's left to be done:
#### MRs
1. !MyMR1
1. !MyMR2
1. !MyMR3
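For anyone who prefers to generate the comment rather than type it, the last format above (the one with the "## Update" heading) could be rendered with a small helper. This is only a sketch: the function name and parameters are made up for illustration, and nothing here is an official team tool.

```python
from datetime import date
from typing import List

# The three health statuses listed in the formats above.
HEALTH_STATUSES = ("On track", "Needs attention", "At risk")


def render_status_update(
    day: date,
    health: str,
    remaining: str,
    merge_requests: List[str],
) -> str:
    """Render a top-level status comment in the '## Update <date>' format."""
    if health not in HEALTH_STATUSES:
        raise ValueError(f"unknown health status: {health!r}")
    lines = [
        f"## Update {day.isoformat()}",
        f"Health status: {health}",
        f"What's left to be done: {remaining}",
        "#### MRs",
    ]
    # Every item uses "1." as in the template; Markdown renumbers them when rendered.
    lines += [f"1. {mr}" for mr in merge_requests]
    return "\n".join(lines)
```

The returned string can be pasted into the issue as a top-level comment. Updating the health status field on the issue remains a separate step.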
There are several benefits to this approach:
Some notes/suggestions:
Issues worked on by the Runner group have a group label of ~group::runner. Issues that contribute to the Verify stage of the DevOps toolchain have the ~devops::verify label.
GitLab.com: @gitlab-com/runner-group
Slack: #g_runner
Our code review process follows the general process where you choose a reviewer (usually not a maintainer) and then send it over to a maintainer for the final review.
Current maintainers are members of the runner-maintainers group.
Current reviewers are members of the runner-group group.
As part of the pre-sales and post-sales engagement, your customer may have in-depth questions regarding topics such as GitLab Runner configuration, autoscaling options, how concurrency works, distributing the CI jobs workload, monitoring runners, and so on. The goal of the process below is to enable the runner team to be as efficient as possible in providing the level of support that our sales team and customers require.
See dedicated page.