We provide confidence in software by increasing visibility into the impact of code changes on software in GitLab.
The Verify:Pipeline Insights Group provides visibility into the results of code changes on software applications. We aim to make it easy for teams to integrate layers of testing and verification into their GitLab CI workflow, including:
See all current and planned category maturities on the Maturity page.
We want software teams to feel confident that the changes they introduce into their code are safe and conformant.
We measure the value we contribute by using a Product Performance Indicator. Our current PI for the Pipeline Insights group is the GMAU (internal handbook). This is a rolling count of unique users who have triggered a pipeline that uploads a test or coverage report. This is not currently instrumented and we are tracking progress of instrumentation in gitlab&4528.
This funnel represents the customer journey and the various means a product manager may apply a Performance Indicator metric to drive a desired behavior in the funnel. This framework can be applied to any of the categories being worked on by Verify:Pipeline Insights. The current priority is to increase Activation within the Code Testing and Coverage Category.
As part of the FY22 SaaS First investment theme and our commitment to building a reliable service, we use error budgets to track availability and help guide how much investment we make, milestone to milestone, in reliability. More information is shared on the Ops Section Performance Indicators (internal handbook).
This section will list the top three most recent, exciting accomplishments from the team.
The following people are permanent members of the Verify:Pipeline Insights group:
To find our stable counterparts look at the Pipeline Insights product category listing.
You can view and contribute to our current list of JTBD and job statements here.
Like most GitLab backend teams we spend a lot of time working in Rails on the main GitLab app. Familiarity with Docker and Kubernetes is also useful on our team.
Our active feature flags can be found on the feature flag dashboard. Make sure to apply the `team_group: testing` filter.
(Sisense↗) We also track our backlog of issues, including past due security and infradev issues, and total open SUS-impacting issues and bugs.
(Sisense↗) MR Type labels help us report what we're working on to industry analysts in a way that's consistent across the engineering department. The dashboard below shows the trend of MR Types over time and a list of merged MRs.
Also known as the Three Amigos process, Quad-planning is the development of a shared understanding about the scope of work in an issue and what it means for the work represented in an issue to be considered done. When an SET counterpart is assigned, we will perform Quad-planning earlier in the planning process than described in the product workflow, beginning when the PM develops initial acceptance criteria. Once this occurs, the PM will apply the `quad-planning:ready` label. At this point, asynchronous collaboration begins between product, engineering, UX, and quality. Supplying these distinct viewpoints at this early stage of planning is valuable to our team. This can take place more than one milestone into the future. Once the acceptance criteria are agreed upon, the `quad-planning:complete` label is applied.
We use a release planning issue to plan our release-level priorities over each milestone. This issue is used to highlight deliverables, capacity, team member holidays, and more. This allows team members and managers to see at a high-level what we are planning on accomplishing for each release, and serves as a central location for collecting information.
Before issues can be moved from the `workflow::planning breakdown` status into the `workflow::ready for development` status, they must have a weight greater than 0 applied to the issue.
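As a sketch of this gate, the check can be expressed as a small predicate over an issue's attributes. The helper below is purely illustrative (it is not part of GitLab's codebase), and the issue payloads are hypothetical examples shaped roughly like the GitLab REST API's issue objects:

```python
def ready_for_development(issue: dict) -> bool:
    """Return True if the issue meets the weighting gate:
    a weight greater than 0 must be applied before the issue
    can move to the ready-for-development status."""
    weight = issue.get("weight")
    return weight is not None and weight > 0


# Hypothetical example payloads:
weighted = {"iid": 101, "labels": ["workflow::planning breakdown"], "weight": 3}
unweighted = {"iid": 102, "labels": ["workflow::planning breakdown"], "weight": None}
```

In practice this decision is made by humans on the planning board; the predicate only restates the rule.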
The Product Manager (PM) and Engineering Manager (EM) will use milestone planning issues to identify issues that need to be weighted in advance of the target milestone.
We would like to give engineers one full milestone to refine and weight the issues.
The EM assigns issues to engineers to add a weight. An issue weight is determined based on the complexity of the following criteria:
Based on these criteria, an issue can have one of the following weights:
| Weight | Description |
|---|---|
| 1: Trivial | Issues that are very well understood. The exact solution is already known and straightforward. The scope of change is very isolated. Examples are documentation updates, simple regressions, and bugs or technical debt that can be fixed with a few lines of code. |
| 2: Small | Issues that are well understood and have an outlined solution. No surprises are expected. No coordination with other teams or people is required. Examples are simple features, like a new API endpoint to expose existing data or functionality, or regular bugs or performance issues. |
| 3: Medium | Issues that are well understood and have an outlined solution. These issues require external team involvement or coordination to be released, such as a feature flag. Examples are regular features, potentially with a backend and frontend component, or most bugs or performance issues. |
| 5: Large | Issues that are known to be complex. A solution has been outlined. There are many major edge cases that need to be catered for. Surprises are expected. Extensive coordination with other teams is required. A careful release process needs to be considered. These issues may have potential for adverse performance impact or catastrophic failures. They may also involve more than one component (backend, frontend, Gitaly, Workhorse, runner, etc.) or change the interaction between components. Examples are issues that change API contracts requiring backward compatibility, or that require multiple feature flags to be safely released. They could also be issues where the team does not have any existing expertise or knowledge, or that require changes in components that the team does not usually work on. |
| 8: Unknown | An issue with a weight of 8 will not be scheduled; instead, it should be investigated further in order to be broken down into smaller issues. Examples are bugs that are not well understood or easily replicated, or bugs and features that do not have a suggested solution. |
If the weight of an issue cannot be determined within an hour, create a separate investigation issue for an in-depth investigation.
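The weighting scheme above can be summarized in code: only the weights 1, 2, 3, 5, and 8 are used, and a weight of 8 means "investigate and break down" rather than "schedule". This function is an illustrative sketch, not part of any GitLab tooling:

```python
# Weights used by the team's estimation scale (see the table above).
VALID_WEIGHTS = {1, 2, 3, 5, 8}


def schedulable(weight: int) -> bool:
    """Return True if an issue with this weight can be scheduled.

    Weight-8 issues are not scheduled; they need further investigation
    so they can be broken down into smaller issues first.
    """
    if weight not in VALID_WEIGHTS:
        raise ValueError(f"unsupported weight: {weight}")
    return weight != 8
```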
There may be times in which an issue requires a new design proposal. When those issues are also labeled as `~direction` items, we create a separate design issue for the Product Designer to work through. The reason for this is to be as accurate as possible in displaying on the Upcoming Releases page when the feature will be implemented.
When a separate design issue is created, we label it [~UX](https://gitlab.com/gitlab-org/gitlab/-/issues?label_name[]=UX) and [~workflow::design](https://gitlab.com/gitlab-org/gitlab/-/issues?label_name[]=workflow%3A%3Adesign). The issue is titled `Design: [feature description]`. See example issue. Engineers, the Product Manager, and the Product Designer will refine design issues by providing guidance and facilitating design discussions before issues are ready for development.
Once the design issue is ready for development, the Product Manager and Product Designer will close the design issue and cross-link it from the implementation issue (and vice versa).
In the process of refinement we may discover that a new feature requires a blueprint, or the team may feel that input from maintainers would help scope down the problem, ensure the feature is performant, and/or reduce future technical debt. When this happens the team will create a Technical Investigation issue. This issue will be assigned to one team member, who should spend the minimum amount of time needed to create documentation, a proof of concept, or some other artifact that clarifies the approach to the problem, ideally in less than 5 working days. This helps us gather information, validate the solution with others, and propose a plan to execute. They will answer specific questions outlined in the Technical Investigation issue before work on the feature is started. This process is analogous to the concept of a Spike.
When possible, the assigned team member is encouraged to schedule synchronous time with another developer to pair on the investigation and publishing of the results (Example Technical Investigation issue gitlab#336617). By default Technical Investigation issues are weighted at a 2 and we timebox them to 3 business days from start to presentation of data. Team members may change this weight and/or time frame at their discretion.
We limit the number of technical investigation issues assigned in each milestone to two so that we can maintain overall velocity and MR Rate.
Error Budgets for stage groups have been established in order to help groups identify and prioritize issues that are impacting customers and infrastructure performance.
The Pipeline Insights group Error Budget dashboard is used to identify issues that are contributing to the Pipeline Insights group's error budget spend.
The engineering manager will review the error budget dashboard weekly to determine whether we're exceeding our budget, determine what (if anything) is contributing to our error budget spend, and create issues addressing root cause for product manager prioritization. Issues created to address error budget spend should be created using appropriate labels and prioritized according to our technical debt process. Other issues may be created as a result of the Ops Section SaaS Reviews and also prioritized using the technical debt process.
We follow the company guidance in how we prioritize technical debt and UX debt. In order to manage this effectively we have decided to keep track of technical and UX debt and feature maintenance issues that are ready to be worked on in the "~workflow::scheduling" column, prioritized by Product, but informed by the impact these issues will have on our future velocity. We will try to dedicate a certain percentage (~30%) of our capacity by weight in each milestone to paying down technical and UX debt. We do not consider bugs to be debt, and they are prioritized as part of the remaining capacity by weight.
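As a back-of-the-envelope illustration of the ~30% allocation described above, milestone capacity (expressed in issue weight) could be split like this. The helper and its default ratio are hypothetical, not an official tool:

```python
def split_capacity(total_weight: int, debt_ratio: float = 0.30) -> dict:
    """Split a milestone's capacity (by weight) between debt paydown
    and the remaining feature/bug work, using the ~30% guideline."""
    debt = round(total_weight * debt_ratio)
    return {"debt": debt, "features_and_bugs": total_weight - debt}


# Example: a milestone with 20 points of capacity.
allocation = split_capacity(20)
```

Bugs are not counted as debt, so in this sketch they fall into the `features_and_bugs` share.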
Engineering managers apply the `Deliverable` label on issues that meet the following criteria:
We will use the `Stretch` label if we feel the issue meets most of the criteria but either contains some known unknowns, or it's uncertain whether we will have the capacity to complete the work in the current milestone. If an issue with a `Stretch` label is carried over into the next milestone, its label will change to `Deliverable`. The position in the issue board is what confers priority, rather than the presence of a `Deliverable` label. However, most high priority issues are `Deliverable`.
There may be some issues in a milestone with neither the `Deliverable` nor `Stretch` label, but we will strive to have the majority of issues labeled with one of these.
Before the team will accept an issue into a milestone for work it must meet these criteria:
Issues that depend on another issue to be completed before they can be validated on Canary are considered blocked, and should have the `~workflow::blocked` label applied. These issues should also be marked as blocked within the related issues section of the issue.
We utilize a process wherein issues that are follow-ups to other issues, and are being worked on in the same milestone they are created, will have a `~follow-up` label added to them. Examples of items the team may create and label as `~follow-up` include, but are not limited to: feature scope creep, non-blocking requests from code review, additional UI polish, non-blocking refactoring, and Low Priority (P2 or lower) bug fixes.
During each milestone, we create a Release Post Checklist issue that is used by team members to help track the progress of the team release posts. We use the checklist issue to associate release post merge requests with links to the implementation issue, links to updated documentation, and list which engineers are working on each issue. The checklist issue provides a single place to see all this information.
Unless specifically mentioned below, the Verify:Pipeline Insights group follows the standard engineering, product, and UX workflows.
Verify:Pipeline Insights team members are encouraged to start looking for work from right to left on the milestone board. This is also known as "pulling from the right". If there is an issue on the board that a team member can help move along, they should do so instead of starting new work. This includes conducting code review on issues that the team member is not assigned to, if they feel they can add value and help move the issue along the board.
Specifically this means, in order:

1. The `workflow::in review` column
2. The `workflow::in development` column
3. The `workflow::ready for development` column, OR an item the team member investigated, to apply the estimated weight if unfamiliar with the top item

The goal with this process is to reduce WIP. Reducing WIP forces us to "Start less, finish more", and it also reduces cycle time. Engineers should keep in mind that the DRI for a merge request is the author(s); just because we are putting emphasis on the importance of teamwork does not mean we should dilute the fact that having a DRI is encouraged by our values.
We use a series of labels to indicate the highest priority issues in the milestone.
- The highest priority issues in the current milestone are labeled `VerifyP1` and `group::pipeline insights`.
- Once `VerifyP1` issues have been picked up and are in `workflow::in dev` or beyond, we have `VerifyP2` and `VerifyP3` to signal issues that will become `VerifyP1` issues in the following milestones.

Any future product priorities (`VerifyP2` or `VerifyP3` labeled issues) will typically be in `workflow::ready for development` and have designs ready, and ideally already be weighted with proposals for implementation. Beyond the product `VerifyPX` priorities, the `ready for development` column will be stack ranked (or at least reviewed) daily by the Product Manager, so that each team member can pull from the top of the column expecting that it is already ordered in priority.
Issues in "Planning Breakdown" and "Ready for Development" are in top-to-bottom priority order on the planning board. Issues further to the right on the issue board are not in vertical priority order. Rather, the further to the right an issue is on the board, the higher the priority, which follows our "Pull from the right" philosophy of working.
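The `VerifyP1`/`VerifyP2`/`VerifyP3` labels above imply an ordering that can be sketched as a sort key; the helper below is illustrative only (unlabeled issues sort last):

```python
def priority_rank(labels):
    """Map an issue's labels to a numeric rank: VerifyP1 < VerifyP2 < VerifyP3,
    with unlabeled issues ranked last."""
    for rank, label in enumerate(("VerifyP1", "VerifyP2", "VerifyP3"), start=1):
        if label in labels:
            return rank
    return 4


# Hypothetical issues, sorted into VerifyPX priority order:
issues = [
    {"iid": 1, "labels": ["VerifyP3"]},
    {"iid": 2, "labels": ["VerifyP1"]},
    {"iid": 3, "labels": []},
]
ordered = sorted(issues, key=lambda i: priority_rank(i["labels"]))
```

Python's `sorted` is stable, so issues with the same rank keep their board order, matching the stack-ranked `ready for development` column.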
Code reviews follow the standard process of using the reviewer roulette to choose a reviewer and a maintainer. The roulette is optional, so if a merge request contains changes that someone outside our group may not fully understand in depth, it is encouraged that a member of the Verify:Pipeline Insights team be chosen for the preliminary review to focus on correctly solving the problem. The intent is to leave this choice to the discretion of the engineer but raise the idea that fellow Verify:Pipeline Insights team members will sometimes be best able to understand the implications of the features we are implementing. The maintainer review will then be more focused on quality and code standards.
We also recommend that team members take some time to review each other's merge requests even if they are not assigned to do so, as described in the GitLab code review process. It is not necessary to assign anyone except the initial domain reviewer to your Merge Request. This process augmentation is intended to encourage team members to review Merge Requests that they are not assigned to. As a new team, reviewing each other's merge requests allows us to build familiarity with our product area, helps reduce the amount of investigation that needs to be done when implementing features and fixes, and increases our lottery factor. The more review we can do ourselves, the less work the maintainer will have to do to get the merge request into good shape.
This tactic also creates an environment to ask for early review on a WIP merge request where the solution might be better refined through collaboration and also allows us to share knowledge across the team.
As part of the code review process, when an MR makes a user-facing change (no matter how small), it should be reviewed by a Product Designer. When in doubt, if the MR is related to an issue with the ~UX label, involve a Product Designer. Include screenshots or screen recordings of the changes in the MR description whenever possible to make the review process easier for everyone.
For iterations on features behind feature flags, even when the changes won't be user-facing right away, involve a Product Designer. If they feel that it would be more efficient to do so, the Product Designer may choose to defer their review of the feature-flagged feature until it is closer to being complete. This option gives them the flexibility to prioritize their workload as they see fit and avoid some of the noise generated by the MR review process.
- … `On Track` within the week.
- … `On Track` status.
- … `On Track` today.
- … `On Track` status in your message.

When an engineer is actively working on an issue (workflow of `~workflow::"In dev"` or further right on the current milestone), they will periodically leave status updates as top-level comments in the issue. The status comment should include the updated health status, any blockers, notes on what was done, whether review has started, and anything else the engineer feels is beneficial. If there are multiple people working on the issue, also include whether this is a front end or back end update. An update for each MR associated with the issue should be included in the update comment. Engineers should also update the health status of the issue at this time.
This update need not adhere to a particular format. Some ideas for formats:
A minimal format:

    Health status: (On track|Needs attention|At risk)
    Notes: (Share what needs to be shared, especially when the issue needs attention or is at risk)

A format that calls out remaining work and blockers:

    Health status: (On track|Needs attention|At risk)
    What's left to be done:
    What's blocking: (probably empty when on track)

A fuller format that includes the associated MRs:

    ## Update <date>
    Health status: (On track|Needs attention|At risk)
    What's left to be done:
    #### MRs
    1. !MyMR1
    1. !MyMR2
    1. !MyMR3
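The fuller update format could also be generated programmatically. The helper below is purely illustrative (it is not a team-mandated tool) and assumes only the fields shown in the template above:

```python
def render_update(date, health, remaining, mrs):
    """Render a status-update comment in the fuller format:
    a dated heading, health status, remaining work, and the MR list."""
    lines = [
        f"## Update {date}",
        f"Health status: {health}",
        "What's left to be done:",
    ]
    lines += [f"- {item}" for item in remaining]
    lines.append("#### MRs")
    lines += [f"1. {mr}" for mr in mrs]
    return "\n".join(lines)


comment = render_update(
    "<date>", "On track", ["Address review feedback"], ["!MyMR1", "!MyMR2"]
)
```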
There are several benefits to this approach:
Some notes/suggestions:
In addition to the steps documented for developing with feature flags at GitLab, Verify:Pipeline Insights engineers monitor their changes' impact on infrastructure using dashboards and logs where possible. Because feature flags allow engineers to have complete control over their code in production, they also enable engineers to take ownership of monitoring the impact their changes have on production infrastructure. In order to monitor our changes we use this helpful selection of dashboards, and specifically the Rails controller dashboard (Internal Only), for monitoring our changes in production. Metrics we evaluate include latency, throughput, CPU usage, memory usage, and database calls, depending on what our change's expected impact will be and any considerations called out in the issue.
The goal of this process is to reduce the time that a change could potentially have an impact on production infrastructure to the smallest possible window. A side benefit of this process is to increase engineer familiarity with our monitoring tools and develop more experience with predicting the outcomes of changes as they relate to infrastructure metrics.
After an engineer has ensured that the Definition of Done is met for an issue, they are the ones responsible for closing it. The engineer responsible for verifying an issue is done is the engineer who is the DRI for that issue, or the DRI for the final merge request that completes the work on an issue.
We want to create a welcoming environment for everyone who is interested in contributing to GitLab in any way, be it code, documentation, UX, or others.
If an issue is being resolved in a merge request that is made by a community contributor, the engineering manager will assign the associated issue to themselves to keep better track of it. The engineering manager will help to coach the community contributor through the merge request process, or be responsible for delegating that coaching to a team member.
When an engineer discovers a potential technical concern (e.g. performance, scalability, usability, etc.), we encourage the engineer to surface it as soon as possible by creating an issue and labeling it with `~performance`, `~usability`, etc., so we can avoid or mitigate future issues.
As a 3 month trial (ending in December 2021), we are going to be using an epic to collect each issue created as a technical concern.
When creating an issue for this epic, the engineer details the problem as well as the potential severity of the issue so that we can properly prioritize the issue into future milestones.
Once we schedule the issue, we remove the issue from the technical concerns epic so that it can be placed in a more feature specific epic.
After the 3 month trial is over, we will have a group retrospective to determine whether we should use this process permanently or not.
We meet on a weekly cadence for one synchronous meeting.
This synchronous meeting is to discuss anything that is blocking, or notable from the past week. This meeting acts as a team touchpoint.
We use geekbot integrated with Slack for our daily async standup. The purpose of the daily standup meeting is to keep the team informed about what everyone is working on, and to surface blockers so we can eliminate them. The standup bot runs at 10am in each team member's local time and asks 2 questions:
We use a GitLab issue in this project for our monthly retrospective. The issue is created automatically towards the end of the current milestone. The purpose of the monthly retrospective issue is to reflect on the milestone and talk about what went well, what didn't go so well, and what we can do better. Instead of waiting until the end of the milestone to add items to the retrospective issue, we encourage team members to add comments throughout the month. We have a Slack reminder in our #g_pipeline-insights channel to remind us to add items to the issue each Friday.
We have a monthly synchronous 30-minute think big meeting, followed the next week by a monthly 30-minute think small meeting on the same topic as the previous think big meeting. This pair of meetings is modeled after the GitLab Product Manager deep dive interview. The purpose of these meetings is to discuss the vision, product roadmap, user research, design, and delivery around the Pipeline Insights features. The goal is to align the team on our medium to long-term goals and ensure that our short-term goals are leading us in that direction. These meetings are useful for aligning the team with its stable counterparts and ensuring that engineers understand the big picture, so they know how their work fits into the long-term goals of the team.
Issues worked on by the Pipeline Insights group have a group label of ~"group::pipeline insights". Issues that contribute to the verify stage of the devops toolchain have the ~"devops::verify" label.