The Static Analysis group at GitLab is charged with developing the following solutions for customer software repositories:
The Static Analysis group is largely aligned with GitLab's Product Development Flow; however, there are some notable differences in how we seek to deliver software. The engineering team predominantly concerns itself with the delivery of software, which is the portion of the workflow where we deviate the most. What follows is how we manage the handoff from product management to engineering to deliver software.
Issues worked by this team can span analyzers, vendored templates, and GitLab's Rails monolith.
GitLab has a labeling convention for issues and Merge Requests. We follow this convention, though there are specific labels required to route artifacts to us. We use these labels to filter issues meant for us on our issue boards. They are also used for metrics and KPI reporting.
Label | Meaning |
---|---|
~section::sec | Identifies the issue or MR as belonging to the Sec Section's roadmap. |
~devops::secure | Identifies the issue or MR as belonging to the Secure Stage's roadmap. |
~group::static analysis | Identifies the Static Analysis group as the collection of individuals who will work on the issue or MR. |
~Category:SAST | Identifies the issue or MR as being part of the SAST feature category. |
~Category:Secret Detection | Identifies the issue or MR as being part of the Secret Detection feature category. |
~Category:Code Quality | Identifies the issue or MR as being part of the Code Quality feature category. |
~backend | Identifies the issue or MR as being part of GitLab's backend. |
~frontend | Identifies the issue or MR as being part of GitLab's frontend. |
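As a rough illustration of how these labels can drive board and report filters, the sketch below builds a query string for GitLab's issues API, which accepts a comma-separated `labels` parameter and a `state` parameter. The grouping of the three routing labels into `ROUTING_LABELS` and the function name are assumptions made for this example, not part of our tooling.

```python
from urllib.parse import urlencode

# Labels taken from the table above. Treating these three as the "routing"
# set is an assumption made for this example.
ROUTING_LABELS = ["section::sec", "devops::secure", "group::static analysis"]


def issue_filter_query(extra_labels=()):
    """Return a query string selecting open issues that carry all given labels."""
    labels = list(ROUTING_LABELS) + list(extra_labels)
    # The issues API expects label names joined by commas in one parameter.
    return urlencode({"labels": ",".join(labels), "state": "opened"})


# Example: narrow the filter to the SAST feature category.
print(issue_filter_query(["Category:SAST"]))
```

The resulting string can be appended to an issues API endpoint or adapted for a saved board filter.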
As is the case throughout GitLab, the Static Analysis group works on a monthly planning cadence. We are product-driven and work in response to the priorities identified by Product Management.
However, GitLab milestones start in the second half of each month. This makes a planning cadence organized around weeks in a milestone difficult to follow, because many edge cases are at odds with the Gregorian calendar. Rather than trying to work out week numbers in a milestone, we describe our planning cadence based upon weeks in a month.
Work in a calendar month is mixed between the Current milestone (which will be released on the 22nd of the month) and the Next milestone (which will be released on the 22nd of the following month).
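The relationship between a calendar month and its two milestones can be sketched as a small date calculation. This is a minimal illustration that assumes every release lands on the 22nd, as described above; the function name is hypothetical.

```python
from datetime import date


def milestone_release_dates(year, month):
    """Return the (Current, Next) milestone release dates for a calendar month.

    Assumes releases always land on the 22nd, as described in the text.
    """
    current = date(year, month, 22)
    # Roll over to January of the following year when the month is December.
    if month == 12:
        next_release = date(year + 1, 1, 22)
    else:
        next_release = date(year, month + 1, 22)
    return current, next_release


current, upcoming = milestone_release_dates(2023, 12)
print(current.isoformat(), upcoming.isoformat())
```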
We use planning issues to articulate the themes which should be our top priorities in each release. Themes may include epics or issues.
Product development is a team effort and everyone can contribute. We interpret prioritized themes as what we're being asked to deliver; we use the entire group's strengths to break down and refine those themes into implementable solutions.
The planning issue serves multiple purposes:
The planning issue includes:
Anyone can update the issue to add links, context, or information like DRI assignments, but the DRI for a section should be consulted if a meaningful change is to be made. For instance, the Product Manager should be part of any decision to reorder priorities, and the Technical Writer should be part of any decision to add technical writing scope.
~workflow::planning breakdown
state.
The Frontend Planning meeting is a crucial planning session that takes place during the last quarter of a milestone. It has a heavy focus on Frontend-related issues because they often have many dependencies from other trades.
The purpose of this meeting is to lay out expectations and goals for the milestone that comes after the upcoming milestone, as well as to identify any potential blockers that may arise. By doing so, the team can proactively address any workflow dependencies and stay on top of them.
The goals and blockers for the milestone after the upcoming one are then documented in the planning issue of the upcoming milestone.
By holding regular Frontend Planning meetings, the team can ensure that all Frontend-related issues are identified and addressed proactively, which can help to prevent delays by making sure things are ready to be picked up as planned.
The team aims for a regular cadence of backlog refinement with minimal overhead. One of the approaches we use to eliminate stale issues is our asynchronous MoSCoW prioritization process.
The goal is to determine what should be closed out of the backlog as "won't do." This is not an attempt to weight issues, which comes later, after we have determined whether the goals of the issue are worth pursuing.
The asynchronous MoSCoW process will be conducted over a one-week period during each milestone. It is suggested to limit total item counts to around 12-15.
See %15.3 issue as an example. This issue format can be cloned and applied as-needed.
The Static Analysis Shared Calendar is used to make PTO events visible to everyone on the team.
Below are the steps to add the calendar to Time Off by Deel:
c_fb285ec72974733f23fd84f70397732e68f7db9abe706c5613f199b6202e379a@group.calendar.google.com
For GitLab.com, we monitor performance of our code within the Rails application, metrics around our CI build performance, and traffic to our container registries. These dashboards are accessible on the Monitoring page.
Observability is a critical component of any high-availability system. Each team member is encouraged to review each dashboard and become familiar with its usage and trends.
We also utilize Sisense for long-term trend forecasting. While this is not a recommended observability tool, it can be helpful to recognize trends over time as they surface.
While we follow GitLab's product development flow, our processes as an engineering team most closely resemble kanban. Engineers are empowered to choose issues from the Delivery Board in their assigned epic swimlane and pull them through the identified states. In addition to the workflow states identified by the company, we are experimenting with the ~workflow::refinement state. Engineers are expected to use their best judgment as to how issues flow through the board, but the following outcomes are expected at each state.
An issue landing on the delivery board is the means by which work is released to the engineering team for Delivery. This event is the beginning of the process by which the engineers will scrutinize an issue's readiness, estimate its size, and implement the changes necessary to achieve the desired outcomes.
State | Expected Outcomes |
---|---|
~workflow::planning breakdown | - Issues deemed complete and understood. - Issue split into smallest testable units of value. - We try to split issues vertically rather than horizontally. Splitting vertically means the whole system will do something noticeably different; splitting horizontally results in trying to realize the fullest possible change in an individual component. - If the issue can, and should, be split into separate issues, engineers are empowered to create the new issues, attach them to the epic they are working, and collaborate with product management on whether they are included in current scope. |
~workflow::refinement | - Implementation plan. - Relative size applied as weight. |
~workflow::ready for development | Buffer queue: issue deemed to be ~Deliverable, ~Stretch, or possibly punted to a future iteration. |
~workflow::in dev | Last MR is up and out of Draft or WIP status. |
~workflow::in review | Last MR is merged and changes are available in a production environment. |
~workflow::verification | Changes functionally tested in a production environment. |
~workflow::complete | Code is verified, the work is complete, and the issue is closed. |
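The order in which an issue is pulled through the board can be pictured as a simple ordered list. The sketch below assumes a strictly linear flow, which is a simplification: engineers use their judgment, and issues may skip or revisit states in practice. The names `WORKFLOW_STATES` and `next_state` are hypothetical.

```python
# Workflow labels in the order they appear on the delivery board.
WORKFLOW_STATES = [
    "workflow::planning breakdown",
    "workflow::refinement",
    "workflow::ready for development",
    "workflow::in dev",
    "workflow::in review",
    "workflow::verification",
    "workflow::complete",
]


def next_state(current):
    """Return the state that follows `current`, or None at the end of the flow."""
    index = WORKFLOW_STATES.index(current)  # raises ValueError for unknown labels
    if index == len(WORKFLOW_STATES) - 1:
        return None
    return WORKFLOW_STATES[index + 1]


print(next_state("workflow::refinement"))
```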
We assign issue weights according to the Secure stage issue weight definitions.
In GitLab, the ~Deliverable label is referred to as a release scoping label. Applying this label represents a commitment from the engineering team to realize the work required in the issue within the milestone to which the issue is assigned. This means we decide whether we can commit to delivering work once an issue is in the ~workflow::ready for development state.
The decision on when to use the ~Deliverable label is made by answering the following questions.
The ~Deliverable label is applied if the answers to the above questions are all yes. The use of this label impacts the group's Say/Do ratio, making the Engineering Manager the directly responsible individual for this label. However, engineers in Static Analysis are empowered to use their judgment about applying this label and proceeding if they believe the work is achievable. Please have a conversation with the Engineering Manager if uncertain about how to proceed.
The process for reviewing and maintaining code is documented within our Static Analysis Group Code Review page.
The collection of issues which make up an epic represents a sizable amount of work, which we typically seek to limit to approximately 1.5 milestones in total duration. The size and scope of this work can result in previously unseen scope or have unexpected consequences. As a result, we will not kick off work on another epic immediately after completing one. We will allow one week of time for tech debt cleanup, feature stabilization, and engineer slack time to explore topics they encountered which are of interest to them.
We are responsible to ensure that what we deliver is secure. This means that we dogfood GitLab's Security features.
When creating an issue for a vulnerability, please make sure to follow the Engineering Security instructions.
When triaging Unknown vulnerabilities, assign them a proper severity as a means to decide the priority they should receive to be resolved. The corresponding priority is taken from issue triage.
Target | Unknown | Critical | High | Medium | Low |
---|---|---|---|---|---|
Dismiss/Confirm Vuln | 72h | 72h | 72h | 1mo | 1mo |
Confirmed Vuln is Resolved | N/A | ~priority::1 | ~priority::2 | ~priority::3 | ~priority::4 |
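The table above can be read as a lookup from severity to a triage window and a priority label. The following is a minimal sketch of that lookup; the dictionary and function names are hypothetical, and the values are copied from the table as written.

```python
# severity -> (time allowed to dismiss or confirm, priority label once confirmed)
# Values mirror the table above; None stands for "N/A".
TRIAGE_TARGETS = {
    "Unknown": ("72h", None),
    "Critical": ("72h", "priority::1"),
    "High": ("72h", "priority::2"),
    "Medium": ("1mo", "priority::3"),
    "Low": ("1mo", "priority::4"),
}


def triage_target(severity):
    """Return the (triage window, priority label) pair for a severity."""
    return TRIAGE_TARGETS[severity]


window, priority = triage_target("High")
print(window, priority)
```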
The following is a description of the type of work and which workstream it flows through.
Work | Responsible Workstream |
---|---|
Triage of new vulns | This should be done as a part of the MR review that introduces the vulns. |
Triage of existing vulns | This is done by the main maintainer of each of our analyzers as defined in our Release project's issue template. |
Resolution of Critical / High Vulns | These should be a Product-driven priority. |
Resolution of Medium / Low Vulns | This is done by the main maintainer of each of our analyzers as defined in our Release project's issue template. |
As always, contributions are welcome from our community or the current MR coach in rotation.
The process for dismissing a vulnerability as a false positive is as follows:
When creating issues for a vulnerability, consider adding the following labels besides our normal labels:
When there is doubt about the severity or priority while creating the issue, and severity/priority labels are therefore not added, the Appsec Escalation Engine can be leveraged to initiate a discussion with the Appsec team. This bot monitors issues that are labeled ~security and not ~test or ~"type::feature". If severity/priority labels are not present, the labels security-sp-label-missing and security-triage-appsec will be added and the issue will be mentioned in the #sec-appsec Slack channel. The Appsec stable counterpart for the group, or the Appsec team member on triage, will then pick up the issue and assign a severity as part of the Appsec triage rotation.
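The labeling rule described above can be sketched as a small function. This is an illustration only and not the Appsec Escalation Engine's actual code. It assumes that severity and priority labels use the `severity::` and `priority::` prefixes and that an issue counts as unlabeled when either one is absent; both points are assumptions made for the example.

```python
def labels_to_add(issue_labels):
    """Return the labels the escalation rule would add to an issue.

    Illustrative only: assumes `severity::` / `priority::` prefixes and
    treats an issue as unlabeled when either kind of label is missing.
    """
    labels = set(issue_labels)
    # Only issues labeled security, and not test or type::feature, are monitored.
    monitored = "security" in labels and not labels & {"test", "type::feature"}
    has_severity = any(label.startswith("severity::") for label in labels)
    has_priority = any(label.startswith("priority::") for label in labels)
    if monitored and not (has_severity and has_priority):
        return ["security-sp-label-missing", "security-triage-appsec"]
    return []


print(labels_to_add(["security", "group::static analysis"]))
```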
We are responsible for delivering GitLab's SAST and Secret Detection features, and the analyzers we develop rely heavily upon open source software. This means we can be dramatically affected by changes in those software packages. We will check for updates to these packages once per GitLab release. New versions will be scrutinized for the following aspects:
An issue will be created and prioritized if a breaking change is discovered. Otherwise, dependency updates will be detailed in the relevant analyzer's changelog and a new version will be released utilizing the change. This is a lot of work, most likely requiring several hours of focused study to understand what is happening in the new version. As a result, dependency updates will be divided evenly and assigned to Senior and Intermediate Backend Engineers, with the remainder going to the group's Staff Backend Engineer. Assignments will be managed through our Release project's issue template.
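One reading of "divided evenly, with the remainder going to the Staff Backend Engineer" is sketched below. The function name, engineer names, and dependency names are hypothetical, and actual assignments are managed through the Release project's issue template rather than by code.

```python
def assign_dependencies(dependencies, engineers, staff_engineer):
    """Split dependencies evenly among engineers; leftovers go to staff.

    `engineers` stands for the Senior and Intermediate Backend Engineers.
    """
    if not engineers:
        raise ValueError("at least one engineer is required")
    share = len(dependencies) // len(engineers)
    assignments = {}
    for position, engineer in enumerate(engineers):
        start = position * share
        assignments[engineer] = dependencies[start:start + share]
    # Whatever does not divide evenly is assigned to the staff engineer.
    assignments[staff_engineer] = dependencies[share * len(engineers):]
    return assignments


example = assign_dependencies(
    ["dep-a", "dep-b", "dep-c", "dep-d", "dep-e"],  # hypothetical dependencies
    ["engineer-1", "engineer-2"],                   # hypothetical engineers
    "staff-engineer",
)
print(example)
```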
The assigned backend engineer is the group's primary liaison with the dependency's open source community. Engineers are expected to contribute back to those projects, especially if critical or high security findings are confirmed.
We have a dependencies group which contains mirrored copies of the OSS projects upon which we most rely. Prior to submitting an MR updating an analyzer to a new version of these projects, engineers are expected to do the following:
We do not want to ship updated dependencies which have Critical and High severity vulnerabilities in them. If we find ourselves in this situation, we will withhold updates to the dependency until the problems have been patched.
At times we will need to update our analyzers because of security updates to golang itself. In this situation, we follow the established release process.
Our users expect us to provide them with a quality experience, no matter which open source or proprietary components we include in our analyzers. They also expect our documentation to clearly outline the configurations we support so that they can make informed decisions about whether to adopt our tools for their needs.
Before we document that we support a configuration, we do validate that it works. For example, before we list a certain type of file or build configuration as a supported feature, we must have checked it at least once, however minimally.
However, we do not independently reproduce all end-to-end tests for components we rely on. Maintaining these tests independently would require unnecessary effort and would duplicate work that would be better contributed upstream if it's lacking. Instead, we aim to build tests that cover basic configurations for smoke-testing and demonstration purposes.
We may choose to document supported configurations once they're validated, even if the test coverage is not yet complete.
In general, the Static Analysis group has two sources of unplanned work: community contributions and ~severity::1 bugs. We will reserve capacity each release so we can respond quickly and efficiently. In both scenarios, we will route the work to the engineer who "owns" the analyzer.
We do, however, own and contribute to projects beyond the analyzers shipped as part of GitLab's product. Where possible, unplanned work requiring the attention of an engineer in Static Analysis will be routed according to that project's CODEOWNERS file. Otherwise, unplanned work will be considered and handled on a case-by-case basis.
While we plan our work on a monthly basis, customers and customer-facing team members may need support on an unplanned basis. We aim to support these requests quickly because they affect the success of our customers and our business.
Generally, we aim to provide an initial response and triage the question/report as quickly as is reasonable. "Reasonable" means, for example, that team members are answering during their normal working hours and are continuing their normal work activities. Whoever is available and can contribute to a solution is encouraged to make first contact with the questioner and ask any clarifying questions—remember, you can always tag in another group member later if you're unable to resolve the question.
The aim of the triage is to support other team members in moving forward; if development work is required to address the problem, it is not automatically a top priority for the group and should not automatically displace existing planned work. If there is any question of whether a bug fix or improvement should be taken up immediately, the Engineering Manager and Product Manager should be alerted to facilitate a decision.
When a Customer Success Escalation is declared, the Engineering Manager and Product Manager should both be alerted, and an appropriate team member should be designated to deprioritize existing work and respond to the escalation as soon as possible.
We also track our backlog of issues, including past due security and infradev issues, and total open System Usability Scale (SUS) impacting issues and bugs.
MR Type labels help us report what we're working on to industry analysts in a way that's consistent across the engineering department. The Sisense dashboard shows the trend of MR Types over time and a list of merged MRs.
Flaky tests are problematic for many reasons.