This document explains the workflow for anyone working with issues in GitLab Inc. For the workflow that applies to the wider community see the contributing guide.
master
Products at GitLab are built using the GitLab Flow.
We have specific rules around code review.
In line with our values of short toes, making two-way-door decisions and bias for action, anyone can propose to revert a merge request. When deciding whether an MR should be reverted, the following should be true:
`~severity::1` or `~severity::2`. See severity labels.
Reverting merge requests that add non-functional changes and don't remove any existing capabilities should be avoided in order to prevent designing by committee.
The intent of a revert is never to place blame on the original author. Additionally, it is helpful to inform the original author so they can participate as a DRI on any necessary follow-up actions.
The `pipeline:expedite` label, and `master:broken` or `master:foss-broken` label must be set on merge requests that fix `master` to skip some non-essential jobs in order to speed up the MR pipelines.
master
If you notice that pipelines for the `master` branch of GitLab or GitLab FOSS are failing, returning the build to a passing state takes priority over everything else development related, since everything we do while tests are broken may:
What is a broken `master`?
A broken `master` is an event where a pipeline in `master` is failing.
The cost to fix test failures increases exponentially as time passes because merged results pipelines are used. Auto-deploys, as well as monthly releases and security releases, depend on `gitlab-org/gitlab` `master` being green for tagging and merging of backports.
Our aim should be to keep `master` free from failures, not to fix `master` only after it breaks.
`master` service level objectives
There are two phases for fixing a broken `master` incident, each with a target SLO to clarify the urgency. The resolution phase is dependent on the completion of the triage phase.
Phase | Service level objective | DRI |
---|---|---|
Triage | 4 hours from the initial broken master incident creation until assignment | Engineering Productivity team |
Resolution | 4 hours from assignment to DRI until incident is resolved | Merge request author or team of merge request author or dev on-call engineer |
Additional details about the phases are listed below.
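As a minimal sketch of how the two targets in the table translate into concrete deadlines, the snippet below adds the 4-hour windows to the incident creation and assignment times. The function names and the example timestamps are hypothetical and only illustrate the arithmetic.

```python
from datetime import datetime, timedelta, timezone

# Both phases have a 4-hour target in the table above.
TRIAGE_SLO = timedelta(hours=4)
RESOLUTION_SLO = timedelta(hours=4)


def triage_deadline(incident_created_at: datetime) -> datetime:
    """Latest time by which the incident should be assigned to a DRI."""
    return incident_created_at + TRIAGE_SLO


def resolution_deadline(assigned_at: datetime) -> datetime:
    """Latest time by which the incident should be resolved."""
    return assigned_at + RESOLUTION_SLO


# Hypothetical timestamps, for illustration only.
created = datetime(2024, 1, 15, 9, 0, tzinfo=timezone.utc)
assigned = datetime(2024, 1, 15, 11, 30, tzinfo=timezone.utc)

print("Triage due by:    ", triage_deadline(created).isoformat())
print("Resolution due by:", resolution_deadline(assigned).isoformat())
```

Note that the resolution window starts at assignment, not at incident creation, so an early triage also brings the resolution deadline forward.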
`master` escalation
If a broken `master` is blocking your team (such as creating a security release) then you should:
`master` incident with a DRI assigned and check discussions there.
`#master-broken` channel. If there isn't a discussion, ask in `#master-broken` if there's anyone investigating the incident you are looking at.
`master` incident.
The Engineering Productivity team is the triage DRI for monitoring, identification, and communication of broken `master` incidents.
`master` branch.
`#master-broken` and will be reviewed by the team.
If the incident is a duplicate of an existing incident, use the following quick actions to close the duplicate incident:
/assign me
/duplicate #<original_issue_id>
/copy_metadata #<original_issue_id>
/assign me
`Acknowledged` (in the right-side menu).
`:ack:` emoji reaction should be applied by the triage DRI to signal the linked incident status has been changed to `Acknowledged` and the incident is actively being triaged.
`master` incidents for the same failure. If the broken `master` is related to a test failure, search the spec file in the issue search to see if there's a known `failure::flaky-test` issue.
`#development`, `#backend`, and `#frontend` using the Slack Workflow.
`#master-broken` channel and select `Broadcast Master Broken`, then click `Continue the broadcast`.
`#releases` channel and discuss whether it's appropriate to create another migration to roll back the first migration or turn the migration into a no-op by following Disabling a data migration steps.
`master` fails for a flaky reason, and it cannot be reliably reproduced (for example, by running the failing spec locally or retrying the failing job):
`New issue` button in top-right of the failing job page (that will automatically add a link to the job in the issue), and apply the `Broken Master - Flaky` description template.
Add the appropriate labels to the main incident:
# Add those labels
/label ~"master-broken::flaky-test"
/label ~"failure::flaky-test"
# Pick one of those labels
/label ~"flaky-test::dataset-specific"
/label ~"flaky-test::datetime-sensitive"
/label ~"flaky-test::ordering assertion"
/label ~"flaky-test::random input"
/label ~"flaky-test::transient bug"
/label ~"flaky-test::unclean environment"
/label ~"flaky-test::unreliable dom selector"
/label ~"flaky-test::unstable infrastructure"
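The labeling rule above (two fixed labels plus exactly one `flaky-test::*` category) can be sketched as a small helper that builds the quick actions. This is only an illustration: the function name is hypothetical, and the label names are copied from the list above.

```python
# Labels that are always applied to a flaky-test incident (from the list above).
REQUIRED_LABELS = ["master-broken::flaky-test", "failure::flaky-test"]

# Exactly one of these categories should be picked (from the list above).
FLAKY_CATEGORIES = {
    "dataset-specific",
    "datetime-sensitive",
    "ordering assertion",
    "random input",
    "transient bug",
    "unclean environment",
    "unreliable dom selector",
    "unstable infrastructure",
}


def flaky_quick_actions(category: str) -> str:
    """Build the quick actions for one flaky-test category (hypothetical helper)."""
    if category not in FLAKY_CATEGORIES:
        raise ValueError(f"unknown flaky-test category: {category!r}")
    labels = REQUIRED_LABELS + [f"flaky-test::{category}"]
    return "\n".join(f'/label ~"{label}"' for label in labels)


print(flaky_quick_actions("transient bug"))
```

Rejecting unknown categories up front keeps a typo from silently creating a new label.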
`artifacts/tmp/capybara` to the incident if one is available.
`geo` spec file is failing, specifically the `shard` spec, search for those keywords in the commit history).
`Merge branch` text to only see merge commits.
`History` or `Blame` button at the top of a file in the file explorer, e.g. at https://gitlab.com/gitlab-org/gitlab/-/blob/master/lib/backup.rb.
`#development` Slack channel.
`#development` Slack channel.
Please set the appropriate `~master-broken:*` label from the list below:
/label ~"master-broken::caching"
/label ~"master-broken::ci-config"
/label ~"master-broken::dependency-upgrade"
/label ~"master-broken::flaky-test"
/label ~"master-broken::fork-repo-test-gap"
/label ~"master-broken::pipeline-skipped-before-merge"
/label ~"master-broken::test-selection-gap"
/label ~"master-broken::need-merge-train"
/label ~"master-broken::infrastructure"
/label ~"master-broken::runner-disk-full"
/label ~"master-broken::gitaly"
/label ~"master-broken::external-dependency-unavailable"
/label ~"master-broken::failed-to-pull-image"
/label ~"master-broken::gitlab-com-overloaded"
/label ~"master-broken::undetermined"
`@username FYI` message.
Additionally, a message can be posted in `#backend_maintainers` or `#frontend_maintainers` to get a maintainer to take a look at the fix ASAP.
The merge request author of the change that broke `master` is the resolution DRI.
In the event the merge request author is not available, the team of the merge request author will assume the resolution DRI responsibilities.
If a DRI has not acknowledged or signaled working on a fix, any developer can assume the resolution DRI responsibilities by assigning themselves to the incident.
`master` incidents over new bug/feature work. Resolution options include:
`master`. If a revert is performed, create an issue to reinstate the merge request and assign it to the author of the reverted merge request.
`pipeline:expedite` label, and `master:broken` or `master:foss-broken` label must be set on merge requests that fix `master` to skip some non-essential jobs in order to speed up the MR pipelines.
`quarantined test` label to the `failure::flaky-test` issue you previously created during the identification phase.
`priority::1` `severity::1` issue.
`Pick into auto-deploy` label (along with the needed `severity::1` and `priority::1`) to make sure deployments are unblocked.
`master` incident affects any stable branches (e.g. https://gitlab.com/gitlab-org/gitlab/-/merge_requests/25274) or is caused by a flaky failure, open new merge requests directly against the active stable branches and ping the current release manager in the merge requests to avoid delays in releases / security releases. See How to fix a broken stable branch guide for more details.
`#master-broken` when the fix was merged
`#master-broken` channel and select `Broadcast Master Fixed`, then click `Continue the broadcast`.
`master` build was failing and the underlying problem was quarantined / reverted / temporary workaround created but the root cause still needs to be discovered, the investigation should continue directly in the incident.
`master` incident could have been prevented in the Merge Request pipeline.
Once the resolution DRI announces that `master` is fixed:
`master` has been fixed since we use merged results pipelines.
Merge requests cannot be merged to `master` until the incident status is changed to `Resolved`.
This is because we need to try hard to avoid introducing new failures, since it's easy to lose confidence if `master` stays red for a long time.
In the rare case where a merge request is urgent and must be merged immediately, team members can follow the process below to have a merge request merged during a broken `master`.
Merging while `master` is broken can only be done for:
`master` issues (we can have multiple broken `master` issues ongoing).
`master`
First, ensure the latest pipeline has completed less than 2 hours ago (although it is likely to have failed due to `gitlab-org/gitlab` using merged results pipelines).
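The 2-hour freshness condition can be expressed as a simple time comparison. The sketch below is illustrative only: `pipeline_is_recent` is a hypothetical helper, and how the pipeline completion time is obtained is left out.

```python
from datetime import datetime, timedelta, timezone

# The pipeline must have completed less than 2 hours ago.
MAX_PIPELINE_AGE = timedelta(hours=2)


def pipeline_is_recent(finished_at: datetime, now: datetime) -> bool:
    """True when the pipeline finished strictly less than 2 hours before `now`."""
    age = now - finished_at
    return timedelta(0) <= age < MAX_PIPELINE_AGE


# Hypothetical timestamps, for illustration only.
now = datetime(2024, 1, 15, 12, 0, tzinfo=timezone.utc)
print(pipeline_is_recent(datetime(2024, 1, 15, 10, 30, tzinfo=timezone.utc), now))
print(pipeline_is_recent(datetime(2024, 1, 15, 9, 59, tzinfo=timezone.utc), now))
```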
Next, make a request on Slack:
`#frontend_maintainers` or `#backend_maintainers` Slack channels (whichever one is more relevant).
`master`, optionally add a link to this page in your request.
A maintainer who sees a request to merge during a broken `master` must follow this process.
Note, if any part of the process below disqualifies a merge request from being merged during a broken `master` then the maintainer must inform the requestor as to why in the merge request (and optionally in the Slack thread of the request).
First, assess the request:
`:eyes:` emoji to the Slack post so other maintainers know it is being assessed. We do not want multiple maintainers to work on fulfilling the request.
Next, ensure that all the following conditions are met:
`gitlab-org/gitlab` using merged results pipelines).
`master`.
`master` incidents. See the "Triage DRI Responsibilities" steps above for more details.
Next, add a comment to the merge request mentioning that the merge request will be merged during a broken `master`, and link to the broken `master` incident. For example:
Merge request will be merged while `master` is broken.
Failure in <JOB_URL> happens in `master` and is being worked on in <INCIDENT_URL>.
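If the comment is generated by a script, the template above could be filled in as follows. This is a sketch only: the helper name is hypothetical and the URLs are placeholders, not real jobs or incidents.

```python
def broken_master_comment(job_url: str, incident_url: str) -> str:
    """Fill in the example comment with the failing job and incident links."""
    return (
        "Merge request will be merged while `master` is broken.\n"
        f"Failure in {job_url} happens in `master` "
        f"and is being worked on in {incident_url}."
    )


# Placeholder URLs, for illustration only.
print(broken_master_comment("https://example.com/job/1", "https://example.com/incident/2"))
```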
Next, merge the merge request:
`gitlab-org/gitlab` project.
`master` mirrors
`#master-broken-mirrors` was created to remove duplicative notifications from the `#master-broken` channel which provides a space for Release Managers and the Engineering Productivity team to monitor failures for the following projects:
The `#master-broken-mirrors` channel is to be used to identify unique failures for those projects and flaky failures are not expected to be retried/reacted to in the same way as `#master-broken`.
We run JiHu validation pipelines in some of the merge requests, and they can be broken at times. When this happens, check What to do when the validation pipeline failed for more details.
Security issues are managed and prioritized by the security team. If you are assigned to work on a security issue in a milestone, you need to follow the Security Release process.
If you find a security issue in GitLab, create a confidential issue mentioning the relevant security and engineering managers, and post about it in `#security`.
If you accidentally push security commits to `gitlab-org/gitlab`, we recommend that you:
`#releases`. It may be possible to execute a garbage collection (via the Housekeeping task in the repository settings) to remove the commits.
For more information on how the entire process works for security releases, see the documentation on security releases.
`workflow::in dev` label to the issue.
`workflow::in review`. If multiple people are working on the issue or multiple workflow labels might apply, consider breaking the issue up. Otherwise, default to the workflow label farthest away from completion.
`workflow::verification`, to indicate all the development work for the issue has been done and it is waiting to be deployed and verified. We will use this label in cases where the work was requested to be verified by product OR we determined we need to perform this verification in production.
Be sure to read general guidelines about issues and merge requests.
For larger issues or issues that contain many different moving parts, you'll likely be working in a team. This team will typically consist of a backend engineer, a frontend engineer, a Product Designer and a product manager.
In the spirit of collaboration and efficiency, members of teams should feel free to discuss issues directly with one another while being respectful of others' time.
Avoid adding configuration values in the application settings or in `gitlab.yml`. Only add configuration if it is absolutely necessary. If you find yourself adding parameters to tune specific features, stop and consider how this can be avoided. Are the values really necessary? Could constants be used that work across the board? Could values be determined automatically? See Convention over Configuration for more discussion.
Start working on things with the highest priority in the current milestone. The priority of items is defined under labels in the repository, but you are able to sort by priority.
After sorting by priority, choose something that you’re able to tackle and falls under your responsibility. That means that if you’re a frontend developer, you work on something with the label `frontend`.
To filter very precisely, you could filter all issues for:
`CI/CD`, `Discussion`, `Quality`, `frontend`, or `Platform`
Use this link to quickly set the above parameters. You'll still need to filter by the label for your own team.
If you’re in doubt about what to work on, ask your lead. They will be able to tell you.
It's every developer's responsibility to triage and review code contributed by the rest of the community, and work with them to get it ready for production.
Merge requests from the rest of the community should be labeled with the `Community contribution` label.
When evaluating a merge request from the community, please ensure that a relevant PM is aware of the pending MR by mentioning them.
This should be part of your daily routine. For instance, every morning you could triage new merge requests from the rest of the community that are not yet labeled `Community contribution` and either review them or ask a relevant person to review them.
Make sure to follow our Code Review Guidelines.
Labels are described in our Contribution guide and Product Development Flow.
GitLab.com is a very large instance of GitLab Enterprise Edition. It runs release candidates for new releases, and sees a lot of issues because of the amount of traffic it gets. There are several internal tools available for developers at GitLab to get data about what's happening in the production system:
There is extensive monitoring publicly available for GitLab.com. For more on this and related tools, see the monitoring handbook.
GitLab Inc has to be selective in working on particular issues. We have a limited capacity to work on new things. Therefore, we have to schedule issues carefully.
Product Managers are responsible for scheduling all issues in their respective product areas, including features, bugs, and tech debt. Product managers alone determine the prioritization, but others are encouraged to influence the PMs' decisions. The UX Lead and Engineering Leads are responsible for allocating people and making sure things are done on time. Product Managers are not responsible for these activities; they are not project managers.
Direction issues are the big, prioritized new features for each release. They are limited to a small number per release so that we have plenty of capacity to work on other important issues, bug fixes, etc.
If you want to schedule an issue with the `Seeking community contributions` label, please remove the label first.
Any scheduled issue should have a team label assigned, and at least one type label.
To request scheduling an issue, ask the responsible product manager.
We have many more requests for great features than we have capacity to work on.
There is a good chance we’ll not be able to work on something.
Make sure the appropriate labels (such as `customer`) are applied so every issue is given the priority it deserves.
Teams (Product, UX, Development, Quality) continually work on issues according to their respective workflows.
There is no specified process whereby a particular person should be working on a set of issues in a given time period.
However, there are specific deadlines that should inform team workflows and prioritization.
Suppose we are talking about milestone `m` that will be shipped in month `M` (on the 22nd).
We have the following deadlines:
`M-1, 4th` (at least 14 days before milestone `m` begins):
`type::maintenance` issues per cross-functional prioritization
`type::bug` issues per cross-functional prioritization
`M-1, 10th`
product manager, taking into consideration prioritization input from development EM, Quality, and UX to create a plan of issues for the upcoming milestone
`m`; label `deliverable` applied.
`M-1, 13th` (at least 5 days before milestone `m` begins):
`m`; label `deliverable` applied.
`M-1, 16th` (at least 1 day before milestone `m` begins):
`M-1, 18th` (or next business day, milestone `m` begins): Kick off! 📣
`m` begins
`M-1, 24th`
The development lead for each stage/section coordinates a stage/section level review with the quad cross-functional dashboard review process. After the stages/section level reviews are complete, the VP of Development coordinates a summary review with the CTO, VP of Product, VP of UX, and VP of Quality.
`M-1 26th`: GitLab Bot opens Group Retrospective issue for the current milestone.
`M, 17th`:
`m` issues with docs have been merged into master.
`m` release. See feature flags.
`m` release. See release timelines.
`m` is expired.
`M, 21st`:
`M, 19th`, or `M, 20th`, or `M, 21st`:
`M, 22nd`: Release Day 🚀
`M, 23rd` (the day after the release):
`m` starts. This includes regular and security patch releases.
`m` issues and merge requests are automatically moved to milestone `m+1`, with the exception of `~security` issues.
`M, 24th`: Moderator opens the Retrospective planning and execution issue.
`M, 24th` to `M+1, 3rd`: Assignees of Group Retrospective issues summarize the discussion, ensure corrective actions are taken and a DRI is assigned to each. Actions related to participation in section-based Retrospective Summaries are taken.
`M, 26th`:
`M, 28th`:
Refer to release post content reviews for additional deadlines.
Note that deployments to GitLab.com are more frequent than monthly major/minor releases on the 22nd. See auto deploy transition guidance for details.
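To make the `M-1` / `M` / `M+1` notation concrete, the following sketch resolves a few of the deadlines listed above into calendar dates for a given release month. The helper names are hypothetical, only four of the listed dates are included, and the "or next business day" adjustment for the kickoff is not modeled.

```python
from datetime import date

# (month offset relative to release month M, day of month), taken from the timeline above.
KEY_DATES = {
    "kickoff": (-1, 18),
    "group retrospective issue opened": (-1, 26),
    "release day": (0, 22),
    "retrospective planning issue opened": (0, 24),
}


def shift_month(year: int, month: int, offset: int) -> tuple[int, int]:
    """Move (year, month) by `offset` months, rolling over year boundaries."""
    index = year * 12 + (month - 1) + offset
    return index // 12, index % 12 + 1


def milestone_date(release_year: int, release_month: int, month_offset: int, day: int) -> date:
    """Resolve a deadline such as 'M-1, 18th' for the given release month."""
    year, month = shift_month(release_year, release_month, month_offset)
    return date(year, month, day)


def key_dates(release_year: int, release_month: int) -> dict[str, date]:
    return {
        name: milestone_date(release_year, release_month, offset, day)
        for name, (offset, day) in KEY_DATES.items()
    }


# Example: a milestone shipped in January 2022 kicks off in December 2021.
for name, when in key_dates(2022, 1).items():
    print(f"{when.isoformat()}  {name}")
```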
Team members use labels to track issues throughout development. This gives visibility to other developers, product managers, and designers, so that they can adjust their plans during a monthly iteration. An issue should follow these stages:
`workflow::in dev`: A developer indicates they are developing an issue by applying the `in dev` label.
`workflow::in review`: A developer indicates the issue is in code review and UX review by removing the `in dev` label, and applying the `in review` label.
`workflow::verification`: A developer indicates that all the development work for the issue has been done and is waiting to be deployed and verified.
`workflow::complete`: A developer indicates the issue has been verified and everything is working by adding the `workflow::complete` label and closing the issue.
At the beginning of each release, we have a kickoff meeting, publicly livestreamed to YouTube. In the call, the Product Development team (PMs, Product Designers, and Engineers) communicate with the rest of the organization which issues are in scope for the upcoming release. The call is structured by product area with each PM leading their part of the call.
The notes are available in a publicly-accessible Google doc. Refer to the doc for details on viewing the livestream.
The purpose of our retrospective is to help each Product Group, and the entire R&D cost center at GitLab learn and improve as much as possible from every monthly release.
Each retrospective consists of three parts:
Timeline
`M-1 26th`: GitLab Bot opens Group Retrospective issue for the current milestone.
`M, 21st`: Group Retrospectives should be held.
`M, 24th`: Moderator opens the Retrospective planning and execution issue and communicates a reminder in R&D quad Slack channels.
`M, 24th` to `M+1, 3rd`: Participants complete the Retrospective planning and execution issue, add their notes to the retro doc, and suggest and vote on discussion topics.
`M+1, 4th`: Moderator records the Retrospective Summary video and announces the video and discussion topics.
`M+1, 6th`: Retrospective Discussion is held.
`M+1, 6th` falls in a weekend.
`M+1, 6th` if `M+1, 6th` is a holiday.
Moderator
The moderator of each retrospective is responsible for:
The job of a moderator is to remain objective and to focus on guiding conversations forward. The moderator for each retrospective is assigned by the VP Development in each milestone.
Retrospective planning and execution issue
For each monthly release, a Retrospective planning and execution issue (example) is opened by the moderator to help us coordinate this work.
Create the Retrospective planning and execution issue by selecting the 'product-development-retro' issue template in the 'www-gitlab-com' project.
Title the issue <MILESTONE VERSION #> Team Retrospectives.
Set the due date to 2 days before the Retrospective Discussion to encourage team members to contribute prior to recording of the Retrospective Summary video.
Retro doc
The retro doc is a Google Doc we use to collaborate on our Retrospective Summary and Retrospective Discussion.
At the end of every release, each team should host their own retrospective. For details on how this is done, see Group Retrospectives.
Note - we are currently conducting an experiment with section-based Retrospective Summaries. During such time we will not be conducting an R&D-wide Retrospective Summary.
The Retrospective Summary is a short pre-recorded video which summarizes the learnings across all Group Retrospectives (example video, example presentation).
Once all Group Retrospectives are completed, each team inputs their learnings into a single publicly-accessible retro doc. The moderator then pre-records a video of the highlights. This video is then announced in the Retrospective planning and execution issue along with the #whats-happening-at-gitlab slack channel. In line with our value of transparency, we also post this video to our public GitLab Unfiltered channel.
Steps for participants
Steps for the moderator
The Retrospective Discussion is a 25 minute live discussion among participants where we deep dive into discussion topics from our Group Retrospectives (example). In line with our value of transparency, we livestream this meeting to YouTube and monitor chat for questions from viewers. Please check the retro doc for details on joining the livestream.
Discussion Topics
For each retrospective discussion, we aim to host an interactive discussion covering two discussion topics. We limit this to two topics due to the length of the meeting.
The discussion topics stem from our Group Retrospective learnings and should be applicable to the majority of participants.
Discussion topics are suggested by participants by commenting on the Retrospective planning and execution issue. Participants can vote on these topics by adding a :thumbsup: reaction. The two topics with the most :thumbsup: votes will be used as the discussion topics. If there are not enough votes or if the discussion topics are not relevant to the majority of participants, the moderator can choose other discussion topics.
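The voting rule described above (the two topics with the most :thumbsup: reactions win) amounts to a sort by vote count. The sketch below is illustrative: the function name and the topics are hypothetical, and the moderator's discretion to override the result is not modeled.

```python
def pick_discussion_topics(votes: dict[str, int], limit: int = 2) -> list[str]:
    """Return the `limit` topics with the most votes; ties keep their original order."""
    ranked = sorted(votes.items(), key=lambda item: item[1], reverse=True)
    return [topic for topic, _ in ranked[:limit]]


# Hypothetical topics and vote counts, for illustration only.
votes = {
    "Reducing review turnaround": 3,
    "Flaky test ownership": 5,
    "Planning accuracy": 1,
}
print(pick_discussion_topics(votes))
```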
Meeting Agenda
Steps for participants
`M+1, 3rd`.
`M+1, 4th`, begin adding your comments to the retro doc.
Steps for the moderator
`M+1, 3rd`. Take note of which discussion topics have the most votes at this time. If there are not enough votes or if you deem the discussion topics as not relevant to the majority of participants, please choose other discussion topics.
Monthly retrospectives are usually performed in a confidential issue made public upon close. The content of these issues, once public, aligns with the GitLab SAFE Framework.
Where unSAFE information must be discussed in a retrospective, Internal Notes should be utilized in order to adhere to SAFE Guidelines. Internal notes remain confidential to participants of the retrospective even after the issue is made public, including Guest users of the parent group.
Examples of information that should remain Confidential per SAFE guidelines are any company confidential information that is not public, or any data that reveals information not generally known or not available externally which may be considered sensitive information. Specific examples of information that should be carefully submitted to the retrospective include impact on revenue, information relating to the security of the platform, or specific customer data.
At the end of each retrospective the Engineering Productivity team is responsible for triaging improvement items identified from the retrospective. This is needed for a single owner to be aware of the bigger picture technical debt and backstage work. The actual work can be assigned out to other teams or engineers to execute.
The Moderator for the Retrospective Summary is chosen on a quarterly basis. For FY22 we have selected 4 moderators from across Engineering and Product. The moderators are:
During FY22 Q4 (the 14.4, 14.5, 14.6 Retrospectives) we will conduct an experiment where we perform retrospective summaries at the Section level instead of an R&D-wide retrospective summary. Section level leaders in Product and Development are the DRIs for retrofitting the current retrospective summary process for their section and documenting their process for doing so.
As GitLab has grown, there are now too many layers between a group retrospective and the company-wide retrospective. Performing retrospective summaries at the Section level will increase our rate of learning and encourage broader collaboration between stable counterparts across the R&D organization.
We'll consider this experiment a success if:
While leaders are available in the categories page (and subject to change) - we explicitly call out the DRIs for each section in this experiment.
Discretion is provided to Section leaders on how to conduct a section retrospective discussion. A good starting point would be to follow the current handbook and issue template recommendations for our R&D wide retrospective. Consider creating section versions of the issue template and discussion doc.
Engineering Managers are responsible for capacity planning and scheduling for their respective teams with guidance from their counterpart Product Managers.
To ensure hygiene across Engineering, we run scheduled pipelines to move unfinished work (open issues and merge requests) with the expired milestone to the next milestone, and apply the label `~"missed:x.y"` for the expired milestone. Additionally, the label `~"missed-deliverable"` is applied whenever `~"Deliverable"` is present.
This is currently implemented as part of our automated triage operations. Additionally, issues with the `~Deliverable` label which have a milestone beyond current +1 will have the `~Deliverable` label removed.
We keep the milestone open for 3 months after it's expired, based on the release and maintenance policy.
The milestone cleanup is currently applied to the following groups and projects:
Milestone closure is in the remit of the Delivery team. At any point in time a release might need to be created for an active milestone, and once that is no longer the case, the Delivery team closes the milestone.
The milestone cleanup will happen one weekday before the 22nd (release day).
The following is observed to account for the weekends:
These actions will be applied to open issues:
`~"missed:x.y"`.
`~"missed-deliverable"` will also be added whenever `~"Deliverable"` is present.
Milestones are closed when the Delivery team no longer needs to create a backport release for a specific milestone.
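The cleanup rules in this section can be sketched in a few lines. This is not the actual triage automation: the function names are hypothetical, and "one weekday before the 22nd" is interpreted here as the last Monday-to-Friday day strictly before the 22nd, which is an assumption about how weekends are handled.

```python
from datetime import date, timedelta


def cleanup_day(year: int, month: int) -> date:
    """Last weekday strictly before the 22nd (assumed weekend handling)."""
    day = date(year, month, 22) - timedelta(days=1)
    while day.weekday() >= 5:  # 5 = Saturday, 6 = Sunday
        day -= timedelta(days=1)
    return day


def labels_after_cleanup(labels: set[str], expired_milestone: str) -> set[str]:
    """Labels on an open issue after it is moved out of the expired milestone."""
    updated = set(labels)
    updated.add(f"missed:{expired_milestone}")
    if "Deliverable" in labels:
        # Deliverables that slipped get an additional marker label.
        updated.add("missed-deliverable")
    return updated


print(cleanup_day(2024, 1))
print(sorted(labels_after_cleanup({"Deliverable", "type::bug"}, "14.4")))
```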
Both the monthly kickoff and retrospective meetings are publicly streamed to the GitLab Unfiltered YouTube Channel. The EBA for Engineering is the moderator and responsible for initiating the Public Stream or designating another moderator if EBA is unable to attend.
When working in GitLab (and in particular, the GitLab.org group), use group labels and group milestones as much as you can. It is easier to plan issues and merge requests at the group level, and it exposes ideas across projects more naturally. If you have a project label, you can promote it to a group label. This will merge all project labels with the same name into the one group label. The same is true for promoting project milestones to group milestones.
We definitely don't want our technical debt to grow faster than our code base. To prevent this from happening we should consider not only the impact of the technical debt but also its contagion. How big will this problem become over time, and how fast? Is it likely a bad piece of code will be copy-pasted for a future feature? In the end, the amount of resources available is always less than the amount of technical debt to address.
To help with the prioritization and decision-making process here, we recommend thinking about contagion as an interest rate of the technical debt. There is a great comment from the internet about it:
You wouldn't pay off your $50k student loan before first paying off your $5k credit card and it's because of the high interest rate. The best debt to pay off first is one that has the highest loan payment to recurring payment reduction ratio, i.e. the one that reduces your overall debt payments the most, and that is usually the loan with the highest interest rate.
Technical debt is prioritized like other technical decisions in product groups by product management.
Technical debt which might span, or fall in gaps between, groups should be brought up for globally optimized prioritization in retrospectives or directly with the appropriate member of the Product Leadership team. Additional avenues for addressing technical debt outside of product groups are Rapid Action issues and working groups.
Sometimes there is an intentional decision to deviate from the agreed-upon MVC, which sacrifices the user experience. When this occurs, the Product Designer creates a follow-up issue and labels it `UX debt` to address the UX gap in subsequent releases.
For the same reasons as technical debt, we don't want UX debt to grow faster than our code base.
These issues are prioritized like other technical decisions in product groups by product management. You can see the number of UX debt issues on the UX Debt dashboard.
As with technical debt, UX debt should be brought up for globally optimized prioritization in retrospectives or directly with the appropriate member of the Product Leadership team.
UI polish issues are visual improvements to the existing user interface, touching mainly aesthetic aspects of the UI that are guided by Pajamas foundations. UI polish issues generally capture improvements related to color, typography, iconography, and spacing. We apply the `UI polish` label to these issues. UI polish issues don't introduce functionality or behavior changes to a feature.
Open merge requests sometimes become idle (not updated by a human in more than a month). Once a month, engineering managers will receive a `Merge requests requiring attention triage issue` that includes all (non-WIP/Draft) MRs for their group and use it to determine if any action should be taken (such as nudging the author/reviewer/maintainer). This assists in getting merge requests merged in a reasonable amount of time which we track with the Open MR Review Time (OMRT) and Open MR Age (OMA) performance indicators.
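The selection rule for that triage issue (idle for more than a month, drafts excluded) can be sketched as a filter. Everything here is illustrative: the `MergeRequest` type and its fields are hypothetical, and "more than a month" is approximated as 30 days.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone

# "More than a month" is approximated as 30 days (assumption).
IDLE_THRESHOLD = timedelta(days=30)


@dataclass
class MergeRequest:
    """Hypothetical, minimal view of a merge request."""
    title: str
    draft: bool
    last_human_update: datetime


def needs_attention(mrs: list[MergeRequest], now: datetime) -> list[MergeRequest]:
    """Non-draft merge requests that no human has updated for over a month."""
    return [
        mr for mr in mrs
        if not mr.draft and now - mr.last_human_update > IDLE_THRESHOLD
    ]


# Hypothetical data, for illustration only.
now = datetime(2024, 3, 1, tzinfo=timezone.utc)
mrs = [
    MergeRequest("Stale fix", False, datetime(2024, 1, 1, tzinfo=timezone.utc)),
    MergeRequest("Stale draft", True, datetime(2024, 1, 1, tzinfo=timezone.utc)),
    MergeRequest("Fresh feature", False, datetime(2024, 2, 20, tzinfo=timezone.utc)),
]
print([mr.title for mr in needs_attention(mrs, now)])
```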
Open merge requests may also have other properties that indicate that the engineering manager should research them and potentially take action to improve efficiency. One key property is the number of threads, which, when high, may indicate a need to update the plan for the MR or that a synchronous discussion should be considered. Another property is the number of pipelines, which, when high, may indicate a need to revisit the plan for the MR. These metrics are not yet included in an automatically created triage issue. However, they are available in a Sisense dashboard. Engineering managers are encouraged to check this dashboard for their group periodically (once or twice a month) in the interim.
Security is our top priority. Our Security Team is raising the bar on security every day to protect users' data and make GitLab a safe place for everyone to contribute. There are many lines of code, and Security Teams need to scale. That means shifting security left in the Software Development LifeCycle (SDLC). Each team has an Application Security Stable Counterpart who can help you, and you can find more secure development help in the #sec-appsec Slack channel.
Being able to start the security review process earlier in the software development lifecycle means we will catch vulnerabilities earlier, and mitigate identified vulnerabilities before the code is merged. You should know when and how to proactively seek an Application Security Review. You should also be familiar with our Secure Coding Guidelines.
We fix obvious security issues before every merge, which lets the security review process scale. Our workflow includes a check and validation by the reviewers of every merge request, enabling developers to act on identified vulnerabilities before merging. As part of that process, developers are also encouraged to reach out to the Security Team to discuss the issue at that stage, rather than later on, when mitigating vulnerabilities becomes more expensive. After all, security is everyone's job. See also our Security Paradigm.
From time to time, the engineering team must act quickly in response to urgent issues. This section describes how the engineering team handles certain kinds of such issues.
Not everything is urgent. See below for a non-exclusive list of things that are in-scope and not in-scope. As always, use your experience and judgment, and communicate with others.
A bi-weekly performance refinement session is held jointly by the Development and QE teams to raise awareness of, and foster wider collaboration on, high-impact performance issues. A high-impact issue has a direct, measurable impact on GitLab.com service levels or error budgets.
The Performance Refinement issue board is reviewed in this refinement exercise.
bug::performance.
Milestone or the label workflow::ready for development is missing.
Milestone and the label workflow::ready for development.
The infradev process is established to identify issues requiring priority attention in support of SaaS availability and reliability. These escalations are intended to be primarily asynchronous, as timely triage and attention are required. In addition to primary management through the issues, any gaps, concerns, or critical triage are handled in the SaaS Availability weekly standup.
The infradev issue board is the primary focus of this process.
Infradev label.
Priority and apply the corresponding label as appropriate.
Infradev label to the new issues.
Severity and Priority labels to the new issues. The labels should correspond to the importance of the follow-on work.
(To be completed primarily by Development Engineering Management)
Issues are nominated to the board through the inclusion of the label infradev and will appear on the infradev board.
Milestone or the label workflow::ready for development is missing.
Milestone and the label workflow::ready for development.
Issues with ~infradev ~severity::1 ~priority::1 ~production request labels applied require immediate resolution.
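The label conditions above can be expressed as two small predicates. This is a sketch under stated assumptions: the issue record shape (`milestone`, `labels`) is hypothetical, and the second check only detects the "Milestone or workflow::ready for development is missing" condition without modeling what happens next.

```python
# Label set named in the text as requiring immediate resolution.
IMMEDIATE_LABELS = {"infradev", "severity::1", "priority::1", "production request"}
READY_LABEL = "workflow::ready for development"


def requires_immediate_resolution(labels: set[str]) -> bool:
    """True when all four labels are applied (extra labels are allowed)."""
    return IMMEDIATE_LABELS <= labels


def has_planning_gap(issue: dict) -> bool:
    """True when the milestone or the ready-for-development label is missing.

    `issue` is a hypothetical record with keys `milestone` (str or None)
    and `labels` (set of str).
    """
    return issue.get("milestone") is None or READY_LABEL not in issue["labels"]


urgent = requires_immediate_resolution(
    {"infradev", "severity::1", "priority::1", "production request", "backend"}
)
unplanned = has_planning_gap({"milestone": None, "labels": {READY_LABEL}})
print(urgent, unplanned)
```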
~infradev issues requiring a ~"breaking change" should not exist. If a current ~infradev issue requires a breaking change, then it should be split into two issues. The first issue should be the immediate ~infradev work that can be done under current SLOs. The second issue should be the ~"breaking change" work that needs to be completed at the next major release in accordance with handbook guidance. Agreement from the development DRI as well as the infrastructure DRI should be documented on the issue.
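The split described above could look roughly like this. The `Issue` class, the title suffixes, and the choice to keep the remaining labels only on the immediate issue are assumptions for illustration; the text specifies only that the immediate work stays ~infradev and the second issue carries ~"breaking change".

```python
from dataclasses import dataclass, field


@dataclass
class Issue:
    """Hypothetical, minimal issue record used only for this sketch."""
    title: str
    labels: set[str] = field(default_factory=set)


def split_breaking_infradev(issue: Issue) -> list[Issue]:
    """Split an infradev issue that needs a breaking change into two issues."""
    if not {"infradev", "breaking change"} <= issue.labels:
        return [issue]  # nothing to split
    remaining = issue.labels - {"breaking change"}
    # Immediate work that can be done under current SLOs keeps ~infradev.
    immediate = Issue(f"{issue.title} (immediate work)", remaining)
    # Breaking work is deferred to the next major release.
    follow_up = Issue(f"{issue.title} (breaking change)", {"breaking change"})
    return [immediate, follow_up]


parts = split_breaking_infradev(
    Issue("Reduce query load", {"infradev", "breaking change", "severity::2"})
)
print([(p.title, sorted(p.labels)) for p in parts])
```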
Infradev issues are also shown in the monthly Error Budget Report.
Triage of infradev issues should ideally occur asynchronously. The points below will ensure that your infradev issues gain maximum traction.
infradev label to architectural problems, vague solutions, or requests to investigate an unknown root-cause.