This document explains the workflow for anyone working with issues in GitLab Inc. For the workflow that applies to everyone, please see PROCESS.md.
Products at GitLab are built using the GitLab Flow.
We have specific rules around code review.
If you notice that pipelines for the `master` branch of GitLab or GitLab FOSS are failing (red) or broken (green as a false positive), returning the build to a passing state takes priority over everything else development related, since everything we do while tests are broken may break existing functionality or introduce new bugs and security issues.
When broken `master` is detected, an issue should be created and labelled ~"master:broken" with the highest priority and severity ~S1, following the process described below for developers or maintainers.
Once the issue is created the top priority is to get master back to a passing state. This can be done in one of three ways:
Fixing broken `master` takes priority over any ~S1 issue, and should be completed as soon as possible. In this scenario, when changing test behavior it is important to be diligent to ensure the intended behavior of the feature is preserved.
Issues labeled with ~"master:broken" must be triaged and mitigated quickly in order to unblock the rest of the team:

- If the `master` build was failing and the underlying problem was quarantined, reverted, or temporarily worked around, but the root cause still needs to be discovered: apply the ~"master:needs-investigation" label and remove ~"master:broken".
- If the `master` build had a flaky failure that cannot be reliably reproduced: apply the ~"master:flaky" label and remove ~"master:broken".
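The label transitions above can be expressed as a small helper. The sketch below is illustrative only, not an existing GitLab tool; it simply encodes the two triage outcomes as operations on a plain label list:

```python
# Illustrative sketch of the ~"master:broken" triage relabelling rules.
# This is not an existing GitLab tool; the outcome names are assumptions.

def triage_labels(labels: list[str], outcome: str) -> list[str]:
    """Swap "master:broken" for the label matching the triage outcome.

    outcome: "mitigated" -> problem worked around, root cause unknown
             "flaky"     -> failure cannot be reliably reproduced
    """
    replacement = {
        "mitigated": "master:needs-investigation",
        "flaky": "master:flaky",
    }[outcome]
    # Remove the broken label, then add the outcome label.
    updated = [label for label in labels if label != "master:broken"]
    updated.append(replacement)
    return updated

print(triage_labels(["master:broken", "S1"], "flaky"))  # → ['S1', 'master:flaky']
```

In practice the relabelling is done through the issue UI or quick actions; the helper only documents the rule.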
Everyone who is a member of the project should triage the issues with the ~"master:broken" label.
All tests (unit, integration, and E2E QA) that fail on `master` are treated as ~S1 issues. Any test failures or flakiness (either false positive or false negative) cause productivity impediments for all of engineering and our release processes.
If a change causes new test failures, the fix to the test should be made in the same Merge Request.
If the change causes new QA test failures, in addition to fixing the QA tests, the `review-qa-all` job must be run to validate the fix before the Merge Request can be merged.
The cost to fix test failures increases exponentially as time passes. Our aim should be to keep `master` free from failures, not to fix `master` only after it breaks.
Failing pipelines on `master` are reported to the `#development` channel in Slack, and this is how we normally notice them. Since investigating them is a high-priority task, multiple people look out for this!
To avoid both the bystander effect and duplication of effort, it's useful to
report actions you're taking in the
#development channel, as you're taking
them. For brevity, communicate by applying emoji to pipeline failure messages:
- `:boom:`: if you create an issue for the failure, apply this emoji, create the issue, and reply to the message with a link to the issue.
This scheme makes it obvious at a glance whether someone is already looking at a pipeline failure on `master`. In turn, this encourages people to jump in and help, while avoiding duplication of effort.
- Post the link to the issue in `#development` so that other developers are aware of the problem and can help.
- Communicate any updates in `#development` by mentioning the issue.
- Ask in `#frontend` if someone can assign the issue to themselves.
- `@mention` the relevant Engineering Managers in the issue and on Slack, so that resources can be assigned to fix it as quickly as possible.
When `master` is red, we need to try hard to avoid introducing new failures, since it's easy to lose confidence if it stays red for a long time. However, it's wasteful and impractical to completely stop development, so we need to compromise between the two priorities.
It's OK to merge a merge request with a failing pipeline if the following conditions are met:
Before merging, add a comment mentioning that the failure happens in `master`, and post a reference to the issue. For instance:
Failure in <JOB_URL> happens in `master` and is being worked on in #XYZ, merging.
Whether the pipeline is failing or not, if `master` is red, check how far behind the source branch is. If it's more than 100 commits behind, ask for it to be brought up to date before merging. This reduces the chance of introducing new failures, and also slows (but does not stop) the rate of change in `master`, helping us to make it green again.
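One way to measure how far behind a source branch is: `git rev-list --left-right --count master...BRANCH` prints two counts, commits only on `master` (behind) and commits only on the branch (ahead). A minimal sketch of the check, assuming that output format; the helper names are illustrative, not an existing tool:

```python
# Sketch: decide whether a source branch should be brought up to date
# before merging, using the 100-commit threshold described above.
# Assumes the output of `git rev-list --left-right --count master...BRANCH`,
# which prints "BEHIND<TAB>AHEAD".

def parse_rev_list_count(output: str) -> tuple[int, int]:
    """Parse the 'BEHIND\tAHEAD' output of git rev-list --left-right --count."""
    behind, ahead = output.split()
    return int(behind), int(ahead)

def needs_rebase(behind: int, threshold: int = 100) -> bool:
    """True when the branch is too far behind master to merge safely."""
    return behind > threshold

behind, ahead = parse_rev_list_count("142\t3")
print(needs_rebase(behind))  # → True: a branch 142 commits behind should be updated
```

Running the `git` command and feeding its output through the parser is left to the reviewer's own tooling.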
Security issues are managed and prioritized by the security team. If you are assigned to work on a security issue in a milestone, you need to follow these steps:
Make sure the merge request targeting `master` has been reviewed and is ready to merge.
If you find a security issue in GitLab, create a confidential issue mentioning the relevant security and engineering managers, and post about it in `#security`.
If you accidentally push security commits to GitLab.com, we recommend that you contact a release manager in `#releases`. It may be possible to execute a garbage collection (via the Housekeeping task in the repository settings) to remove the commits.
For more information on how the entire process works for security releases, see the documentation on security releases.
- Add the ~"workflow::In dev" label to the issue.
- Remove `Closes #issue_id` from the MR description, to prevent auto closing of the issue after merging.
- Apply ~"workflow::In review" when the work moves to review. If multiple people are working on the issue or multiple workflow labels might apply, consider breaking the issue up. Otherwise, default to the workflow label farthest away from completion.
- Apply ~"workflow::verification" to indicate all the development work for the issue has been done and it is waiting to be deployed and verified.
For larger issues or issues that contain many different moving parts, you'll likely be working in a team. This team will typically consist of a backend engineer, a frontend engineer, a Product Designer, and a product manager.
Avoid adding configuration values in the application settings or in
`gitlab.yml`. Only add configuration if it is absolutely necessary. If you
find yourself adding parameters to tune specific features, stop and consider
how this can be avoided. Are the values really necessary? Could constants be
used that work across the board? Could values be determined automatically?
See Convention over Configuration for more discussion.
Start working on things with the highest priority in the current milestone. The priority of items is defined under labels in the repository, but you are able to sort by priority.

After sorting by priority, choose something that you're able to tackle and that falls under your responsibility. That means that if you're a frontend developer, you work on something with the ~frontend label.
To filter very precisely, you could filter all issues for:
Use this link to quickly set the above parameters. You'll still need to filter by the label for your own team.
If you’re in doubt about what to work on, ask your lead. They will be able to tell you.
It is every developer's responsibility to triage and review code contributed by the rest of the community, and to work with contributors to get it ready for production.
Merge requests from the rest of the community should be labeled with the
Community Contribution label.
When evaluating a merge request from the community, please ensure that a relevant PM is aware of the pending MR by mentioning them.
This should be part of your daily routine. For instance, every morning you could triage new merge requests from the rest of the community that are not yet labeled ~"Community Contribution" and either review them or ask a relevant person to review them.
Make sure to follow our Code Review Guidelines.
Labels are described in our Contribution guide.
GitLab.com is a very large instance of GitLab Enterprise Edition. It runs release candidates for new releases, and sees a lot of issues because of the amount of traffic it gets. There are several internal tools available for developers at GitLab to get data about what's happening in the production system:
If you've built feature flags into your code, be sure to read about how to use the feature flag to test a feature on GitLab.com.
GitLab Inc has to be selective in working on particular issues. We have a limited capacity to work on new things. Therefore, we have to schedule issues carefully.
Product Managers are responsible for scheduling all issues in their respective product areas, including features, bugs, and tech debt. Product managers alone determine the prioritization, but others are encouraged to influence the PM's decisions. The UX Lead and Engineering Leads are responsible for allocating people and making sure things are done on time. Product Managers are not responsible for these activities; they are not project managers.
Direction issues are the big, prioritized new features for each release. They are limited to a small number per release so that we have plenty of capacity to work on other important issues, bug fixes, etc.
If you want to schedule an ~"Accepting merge requests" issue, please remove the label first.
Any scheduled issue should have a team label assigned, and at least one type label.
To request scheduling an issue, ask the responsible product manager.
We have many more requests for great features than we have capacity to work on.
There is a good chance we’ll not be able to work on something.
Make sure the appropriate labels (such as ~customer) are applied so every issue is given the priority it deserves.
Teams (Product, UX, Engineering) continually work on issues according to their respective workflows.
There is no specified process whereby a particular person should be working on a set of issues in a given time period.
However, there are specific deadlines that should inform team workflows and prioritization.
Suppose we are talking about milestone `m` that will be shipped in month `M` (on the 22nd). We have the following deadlines:
- `M-1, 4th` (at least 14 days before milestone `m` begins)
- `M-1, 13th` (at least 5 days before milestone `m` begins)
- `M-1, 17th` (at least 1 day before milestone `m` begins)
- `M-1, 18th` (or next business day, milestone `m` begins): Kick off! 📣
- `m` issues with docs have been merged into `master`.
- Feature flags for the `m` release. See feature flags.
- Auto deploy for the `m` release. See auto deploy transition.
- Remaining `m` issues are de-scoped, with milestone `m` being removed from them. These issues should be considered for the next release.
- Milestone `m` is marked as closed (see Milestone Cleanup).
- `M, 22nd`: Release Day 🚀
- `M, 23rd` (the day after the release): patch release work for `m` starts. This includes regular and security patch releases.
- Remaining `m` issues and merge requests are automatically moved to milestone `m+1`, with some exceptions (see Milestone Cleanup).
- `M+1, 4th` (within a few weeks of milestone `m` ending): retrospective of milestone `m`.
Refer to release post due dates for additional deadlines.
Note that deployments to GitLab.com are more frequent than monthly major/minor releases on the 22nd. See auto deploy transition guidance for details.
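The fixed day-of-month deadlines above can be computed mechanically from the release month. A minimal sketch, assuming only the rules stated above (kickoff on the 18th of `M-1`, release on the 22nd of `M`, retrospective on the 4th of `M+1`); the dictionary key names are illustrative, and "or next business day" weekend handling is intentionally omitted:

```python
from datetime import date

def milestone_deadlines(year: int, month: int) -> dict[str, date]:
    """Key dates for a milestone shipped on the 22nd of (year, month).

    Day-of-month rules follow the schedule above; weekend handling
    ("or next business day") is omitted for brevity.
    """
    def shift(months: int) -> tuple[int, int]:
        # Move `months` relative to the release month, wrapping the year.
        total = (year * 12 + (month - 1)) + months
        return total // 12, total % 12 + 1

    ym_prev, ym_next = shift(-1), shift(1)
    return {
        "first_deadline": date(*ym_prev, 4),   # M-1, 4th
        "second_deadline": date(*ym_prev, 13), # M-1, 13th
        "final_deadline": date(*ym_prev, 17),  # M-1, 17th
        "kickoff": date(*ym_prev, 18),         # M-1, 18th
        "release": date(year, month, 22),      # M, 22nd
        "retrospective": date(*ym_next, 4),    # M+1, 4th
    }

d = milestone_deadlines(2020, 1)  # release on 2020-01-22
print(d["kickoff"], d["release"])  # → 2019-12-18 2020-01-22
```

Note the year wraps correctly for January releases: the kickoff falls in December of the previous year.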
Team members use labels to track issues throughout development. This gives visibility to other developers, product managers, and designers, so that they can adjust their plans during a monthly iteration. An issue should follow these stages:
- ~"workflow::In dev": A developer indicates they are developing an issue by applying the ~"workflow::In dev" label.
- ~"workflow::In review": A developer indicates the issue is in code review and UX review by removing the ~"workflow::In dev" label and applying the ~"workflow::In review" label.
- ~"workflow::verification": A developer indicates that all the development work for the issue has been done and it is waiting to be deployed and verified.

When the issue has been verified and everything is working, it can be closed.
At the beginning of each release, we have a kickoff meeting, publicly livestreamed to YouTube. In the call, the Product Development team (PMs, Product Designers, and Engineers) communicate with the rest of the organization which issues are in scope for the upcoming release. The call is structured by product area with each PM leading their part of the call.
The notes are available in a publicly-accessible Google doc. Refer to the doc for details on viewing the livestream.
After each release, we have a retrospective meeting, publicly livestreamed to YouTube. We discuss what went well, what went wrong, and what we can improve for the next release.
The notes for the retrospective are kept in a publicly-accessible Google doc, which also describes the format of the call. In order to keep the call on time and to make sure we leave ample room to discuss how we can improve, the moderator may move the meeting forward according to the timing indicated in the notes.
The purpose of the retrospective is to help Engineering at GitLab learn and improve as much as possible from every monthly release. In line with our value of transparency, we livestream the meeting to YouTube and monitor chat for questions from viewers. Please check the retrospective notes for details on joining the livestream.
At the end of each retrospective the Engineering Productivity team is responsible for triaging improvement items identified from the retrospective. This is needed for a single owner to be aware of the bigger picture technical debt and backstage work. The actual work can be assigned out to other teams or engineers to execute.
Engineering Managers are responsible for capacity planning and scheduling for their respective teams with guidance from their counterpart Product Managers.
To ensure hygiene across Engineering, we will close out a milestone when it has expired.
When a milestone is closed, unfinished work (open issues and merge requests) associated with the expired milestone is automatically moved to the next milestone.
Incomplete issues with the ~Deliverable label are relabelled to indicate the missed deliverable. This is currently implemented as part of our automated triage operations. Additionally, issues with the ~Deliverable label which have a milestone beyond current +1 will have the ~Deliverable label removed.
The milestone cleanup is currently applied to the following projects:
Before the meeting starts, remind people who plan to speak to join the Google Hangout earlier, since there is a 50 user limit.
Several minutes before the scheduled meeting time, follow the livestreaming instructions to start a Google Hangout using the "Now" setting. Paste the Google Hangout invite link in the Google doc.
At the scheduled meeting time, start broadcasting live to YouTube. Begin the meeting.
When working in GitLab (and in particular, the GitLab.org group), use group labels and group milestones as much as you can. It is easier to plan issues and merge requests at the group level, and it exposes ideas across projects more naturally. If you have a project label, you can promote it to a group label. This will merge all project labels with the same name into the one group label. The same is true for promoting project milestones to group milestones.
We definitely don't want our technical debt to grow faster than our code base. To prevent this from happening, we should consider not only the impact of the technical debt but also its contagion: how big and how fast is this problem going to grow over time? Is it likely a bad piece of code will be copy-pasted for a future feature? In the end, the amount of resources available is always less than the amount of technical debt to address.
To help with prioritization and decision-making process here, we recommend thinking about contagion as an interest rate of the technical debt. There is a great comment from the internet about it:
You wouldn't pay off your $50k student loan before first paying off your $5k credit card and it's because of the high interest rate. The best debt to pay off first is one that has the highest loan payment to recurring payment reduction ratio, i.e. the one that reduces your overall debt payments the most, and that is usually the loan with the highest interest rate.
Security is our top priority. Our Security Team is raising the bar on security every day to protect users' data and make GitLab a safe place for everyone to contribute. There are many lines of code, and Security Teams need to scale, which means shifting security left in the Software Development LifeCycle (SDLC). Starting the security review process earlier in the software development lifecycle means we catch vulnerabilities earlier and mitigate them before the code is merged. By fixing the obvious security issues before every merge, we scale the security review process. Our workflow includes a check and validation by the reviewers of every merge request, enabling developers to act on identified vulnerabilities before merging. As part of that process, developers are also empowered to reach out to the Security Team to discuss the issue at that stage, rather than later on, when mitigating vulnerabilities becomes more expensive. After all, security is everyone's job. See also our Security Paradigm.
From time to time, there are occasions when the engineering team must act quickly in response to urgent issues. This section describes how the engineering team handles certain kinds of such issues.
Not everything is urgent. See below for a non-exhaustive list of things that are in scope and out of scope. As always, use your experience and judgment, and communicate with others.
To address high impact availability and performance issues in a timely manner, a weekly grooming session is held jointly by the Infrastructure, Development, and QE teams to triage issues for prioritization and planning with the Product team.
There are two issue boards being reviewed in this grooming exercise:

- A board of issues that are missing a Milestone or the label ~"workflow::ready for development".
- A board of issues that have a Milestone and the label ~"workflow::ready for development".