Secure, Composition Analysis

The Composition Analysis group at GitLab is charged with developing solutions which perform Container and Dependency Scanning and License Compliance.

Composition Analysis

The Composition Analysis group at GitLab is charged with developing solutions which perform Container Scanning and Software Composition Analysis. See the exhaustive list of projects the group maintains.

Common Links

Slack channel: #g_secure-composition-analysis
Slack alias: @secure_composition_analysis_dev
Google groups: composition-analysis-dev@gitlab.com

How we work

Workflow

The Composition Analysis group largely follows GitLab’s Engineering Workflow and Product Development Flow.

This includes:

Indicating Status and Raising Risk

We leverage the issue’s health status feature to communicate the progress of the issue.

All issues should be marked On Track at the beginning of a milestone. This is done by the Epic DRI, or the Engineering Manager for unassigned, standalone issues.

Raising risk early is important. The more time we have, the more options we have. As such, the team reviews issues every week and discusses items that Need Attention or are At Risk to possibly course correct and re-assign resources based on the team’s priorities.

Follow these steps when raising or downgrading risk:

Update the Health Status in the issue:
1. On Track - the work will be completed within the planned milestone.
2. Needs Attention - the issue is blocked or has other factors that need to be discussed.
3. At Risk - the issue is in jeopardy of missing the cutoff to ship within the planned milestone.
Add a comment about why the risk has increased or decreased.
Copy the Engineering Manager and Product Manager in a comment.

Time-off Calendar

We use a shared calendar to see when team members are off work: c_629844cc273be17e067767febe12547bc40e129f26f0f17339030bff708cd0d5@group.calendar.google.com.

Access to this calendar is granted via this google group: sec-secure-composition-analysis@gitlab.com.

To share your time-off:

In Slack, find the Time Off by Deel application under the Apps menu.
Under Home, click on Your Events to show a dropdown.
Click on ‘Calendar Sync’ under the Settings break.
Click Add calendar under Additional calendars to include?. Use the calendar ID above.

To visualize the calendar:

In Google Calendar, find the Other calendars section in the left sidebar.
Click on the ➕ icon and then select Subscribe to calendar. Use the calendar ID above.

Reaction rotation

On top of our development roadmap, engineering teams need to perform tasks related to security vulnerabilities, support, maintenance, and community contributions.

To avoid excessive context-switching, and better distribute the workload, our team reserves capacity for these tasks as part of milestone planning:

Primary engineer. Fully allocated to the tasks below. They must prioritize these tasks above all other work, in the following order: Security, Support, Maintenance.
Secondary engineer. Acts as a backup in case the primary engineer has an unplanned absence or exceeds their capacity. They must prioritize requests from the primary engineer, but otherwise focus on type::bug, then type::maintenance issues.

Neither engineer should be allocated to work on Features or critical deliverables. In the context of Cross-functional milestone planning, their allocation counts towards the bugs and maintenance ratio.

The rotation schedule follows the development cycle, which means using the start/end dates from the GitLab product milestones. When creating the schedule, the Engineering Manager should aim to minimize the number of back-to-back rotations that engineers do.

Please keep track of the actions you take during your rotation and add notes in the corresponding issue (e.g. copying tool commands executed locally, sharing relevant changes to projects and processes, etc.).

Responsibilities - Security

Triage vulnerabilities reported on the projects we maintain and help resolve them depending on their priority. (See Security vulnerabilities triaging process)
Check for security automation failures
Check for new security releases of our dependencies and ensure we use them:
1. Upstream scanners (see Updating an upstream scanner)
2. Container base images
3. Application dependencies
4. Programming language
Refine scheduled security issues.
Consider creating or updating any automation or tooling (related to security, maintainership or support!)

Responsibilities - Support

Monitor Slack channels for questions, support requests, and alerts. While other team members may respond to these requests, the engineer assigned to the reaction rotation is expected to handle them primarily. If a support engineer requests assistance via Slack and it requires investigation or debugging, they should be directed to raise an issue in a dedicated project.
Monitor Section Sec Request For Help project for support requests.
Refine scheduled bugs and maintenance issues.

These items must be triaged continuously throughout the milestone, which means they must be checked multiple times a week.

Responsibilities - Maintainership

Work with community contributors to help drive their merge requests to completion (more information on community contributions triaging process).
Check for new versions of languages or package managers that we support, or deprecation / removal of support for the same, and notify the Engineering Manager and Product Manager via an issue.
Check for new versions of our dependencies (not related to security):
1. Upstream scanners (see Updating an upstream scanner).
2. Container base images.
3. Application dependencies.
4. Programming language.
Check in on test failures. Check relevant Slack channels (#g_secure-composition-analysis-alerts, #s_secure-alerts).
Check latest pipelines for any release failures. If any issue is preventing the automated release process from running, begin the release failure escalation process.
Consider creating or updating any automation or tooling (related to security, maintainership or support!).
Monitor failures and errors on the license-db project. Use the #f_licese_database Slack channel for communication about these items, so that other team members can provide support.
1. Check latest scheduled pipelines of license-db for any failures. Ensure that pipelines pass or create an issue to fix the failure.
2. Monitor the Slack channel #g_secure-composition-analysis-alerts for any incidents on the license-db infrastructure.
  - In case of an incident, react with 👁️ to indicate that you are looking into it.
  - If the incident isn’t resolved within 30 minutes, investigate it.
  - Write down in the incident Slack thread all the steps that were taken to resolve it.

Security vulnerabilities triaging process

We are responsible for triaging vulnerabilities reported on 2 sets of projects: the projects maintained by GitLab and the upstream scanner software we might depend on. Different processes apply depending on the situation.

See the Secure sub-department vulnerability management process.

View manual process fallback that is specific to Composition Analysis group

Please keep track of the commands that were executed and add them to a private note in the reaction rotation issue.

Manually reviewing and resolving vulnerabilities

On a weekly basis: review the vulnerability report to resolve no longer detected ones and close related issues. Note: It is not necessary to investigate vulnerabilities that are no longer detected.

Visit Vulnerability Report Dashboards to check whether there are vulnerabilities that can be resolved.
- Analyzer vulnerabilities that are no longer detected.
  - If you want to configure the report manually, select all shared, container scanning, and dependency scanning projects, then apply the No longer detected activity filter and the Confirmed and Needs Triage statuses.
- License-db vulnerabilities that are no longer detected.
  - If you want to configure the report manually, select all license-db projects, then apply the No longer detected activity filter and the Confirmed and Needs Triage statuses.
Execute the security-triage-automation tool to resolve vulnerabilities and close their issues. This tool must be executed separately for each of the projects in the following categories (if there are vulnerabilities to resolve):
Verify in Vulnerability Report Dashboards that vulnerabilities have been resolved.

Manually creating security issues for FedRAMP vulnerabilities

Follow the Secure sub-department process on manually creating security issues for FedRAMP vulnerabilities for each of these projects:

container scanning
dependency scanning

Manually creating deviation requests for FedRAMP vulnerabilities

Follow the Secure sub-department process on manually creating deviation requests for FedRAMP vulnerabilities for each of the vulnerabilities near SLA breach.

Security Policy

We prioritize findings by their CVSS severities and SLAs, and currently focus on security findings with these severity levels:

Critical
High

An exception is made for Container scanning findings - we focus only on findings with Critical severity.

Please utilize all the time you have set aside. If you complete all the ones at Critical and High, please continue to triage - we want to address all findings but we are working in a risk based order.
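The focus rules above can be sketched as a small filter. This is only an illustration under stated assumptions: the `Finding` class, its field names, and the `report_type` values are hypothetical and do not reflect any GitLab API; only the severity levels and the Container Scanning exception come from this section.

```python
from dataclasses import dataclass

# Severity levels currently in focus, as listed in this section.
FOCUS_SEVERITIES = {"critical", "high"}
# Exception from this section: Container Scanning focuses on Critical only.
CONTAINER_SCANNING_FOCUS = {"critical"}


@dataclass(frozen=True)
class Finding:
    """Hypothetical, minimal representation of a security finding."""
    name: str
    severity: str      # e.g. "critical", "high", "medium", "low"
    report_type: str   # assumed values: "container_scanning", "dependency_scanning"


def in_current_focus(finding: Finding) -> bool:
    """Return True if the finding matches the current triage focus."""
    severity = finding.severity.lower()
    if finding.report_type == "container_scanning":
        return severity in CONTAINER_SCANNING_FOCUS
    return severity in FOCUS_SEVERITIES


def triage_order(findings: list[Finding]) -> list[Finding]:
    """Put in-focus findings first, then order by descending severity."""
    rank = {"critical": 0, "high": 1, "medium": 2, "low": 3}
    return sorted(
        findings,
        key=lambda f: (not in_current_focus(f), rank.get(f.severity.lower(), len(rank))),
    )
```

Findings outside the focus are kept at the end of the list rather than dropped, which mirrors the guidance to continue triaging in a risk-based order once the Critical and High items are done.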

Triaging vulnerabilities

We use the Vulnerability Report with filters to focus on items matching our policy and reported on the relevant projects.

Analyzers Vulnerability Report
- To configure the report manually, select all shared, container scanning, and dependency scanning projects and apply the Still detected activity filter and apply the Needs Triage status.
License-db Vulnerability Report
- To configure the report manually, select all license-db projects and apply the Still detected activity filter and apply the Needs Triage status.

For each item, investigate and either dismiss or confirm it. If it’s not clear whether there’s indeed a threat, escalate to our Application Security team.

Refer to Vulnerability status definitions in case you are unsure of what each of them means.

Upstream scanners vulnerabilities

This only applies to projects NOT maintained by GitLab.

We review vulnerabilities detected on upstream scanners when upgrading to a newer version. See the Security checks when updating an upstream scanner section.

We currently have a limited capacity to triage vulnerabilities reported on our upstream scanners. Continuously triaging vulnerabilities reported for these projects is done on a best effort basis.

Triaging vulnerabilities

We use the Vulnerability Report with filters to focus on items matching our policy and reported on the relevant projects.

Upstream Scanners Vulnerability Report
- To configure the report manually, select all upstream scanner projects.

For vulnerabilities discovered in upstream scanners, an issue must be created in GitLab’s issue tracker, and we should work with the relevant Open Source community to help provide a resolution. As a last resort, we can patch locally or fork the upstream project temporarily to fix the vulnerability sooner.

Dismissing a vulnerability

When there is no doubt a vulnerability is a false positive, it can be “Dismissed” unless it is related to a FedRAMP image (fips). Select the “Dismiss” option from the vulnerability status options. Finally, make sure to comment on the vulnerability status change notification to explain why.

Low risk findings that can be dismissed

Because severity is set generically in CVSS and automated scanners do not have full context for an application, many findings which may be high risk in other environments or scenarios are low risk for our users. The containers ingest code from a user project and that user has developer access, and the containers are ephemeral and related to a specific pipeline.

In some other cases, a finding is related to an upstream dependency or Operating System and there is no fix available and no fix planned. Please be sure to mark these issues using the labels blocked or blocked upstream.

When an issue is both blocked for a few releases and low risk, you may dismiss the finding with a note as to the reasoning. If there is an open issue, notify the Application Security team with your specific reasoning and close the issue (if applicable). In the future we will specifically want to tag everything related to these findings as won’t fix or blocked when they are being closed; for now that is only available on issues and not findings.

The following class of container scan vulnerabilities can be considered low risk:

Many kernel-related findings carry reduced risk, and hence reduced severity, because our process works with temporary containers with limited inputs which are developer-controlled.
Issues related to a software stack that will not apply to the analyzer, e.g. GUI related issues, issues in Bluetooth drivers, browser-related issues which require a browser running in non-headless mode, etc.
S3 or S4 findings with a complex exploit method or limited risk which have no fix available, or where the upstream has stated there are no plans to release a patch.
Denial of Service (of the container/analyzer), as these containers run in ephemeral pipelines, are automatically stopped once a timeout is reached, and accept code from users who already have developer access. As a result, this is not an expansion of the risk profile.
Random number generator issues (where the numbers are not random), as we don’t use random numbers for security purposes from the containers. (These statements were true at the time this was last updated; please use your knowledge of our analyzers or ask if unsure.)

To add items to the list above discuss repeatable finding patterns with Application Security, get approval from a leader in the security section, and add to this list.

Confirming a vulnerability

If the vulnerability impacts a dependency:

Evaluate if the dependency (software library, system library, base image, etc.) can be upgraded or removed.
Set the vulnerability status to “Confirmed”.
Release a new version of the analyzer with the dependency upgrade/removal and follow the process on resolving a vulnerability.

For all other confirmed vulnerabilities, create a security issue to discuss and track the remediation.

Resolving a vulnerability

When a vulnerability has been remediated, it can be “Resolved”. When doing so, comment how it was remediated, then select the “Resolve” option from the vulnerability status options, and close the related vulnerability issue.

Creating security issues

Unfortunately, creating a security issue can’t be done yet via the “create issue” button from the vulnerability page or security dashboard as this only works when creating an issue in the same project where the error was reported and we’ve disabled the embedded issue tracker in our projects.

Instead, in our workflow we open all our issues in the main GitLab project.

As a workaround, you can copy and paste the content of the vulnerability page (this keeps markdown formatting!). Please also follow our Security guidelines about creating new security issues.

You can leverage quick actions to add the necessary labels.

/confidential

/label ~security ~"type::bug" ~"bug::vulnerability"
/label ~"section::sec" ~"devops::secure" ~"group::composition analysis"

<!-- depending on the affected project: -->
/label ~"Category:Software Composition Analysis"
/label ~"Category:Container Scanning"

It’s important to add the ~security and ~"bug::vulnerability" labels as described above, because the AppSec Escalation Engine will automatically pick up any issues with these labels and add additional labels ~security-sp-label-missing and ~security-triage-appsec as well as mention the issue in the #sec-appsec Slack channel. At this point, the Stable Counterpart or Application Security team triage person will pick up the issue and assign a severity as part of the appsec triage rotation.

Once the issue is created, please add it to the vulnerability’s linked items for ease of tracking.

Developers reporting the security issue should help the Application Security team assess the impact of the vulnerability, and update the issue description with an Impact section.

If immediate feedback is required, then add a comment to the vulnerability issue with an @-mention directed at one of the Security Engineers listed in the Stable Counterpart section, or ping them on Slack.

Release failure process

If the image release process is failing, an incident should be created to track how it was detected, escalated, and resolved. Documenting our incidents makes it possible to search for previous incidents by keyword, labels, and other issue filters. We open all of our incidents in the main GitLab project.

Open a new incident and add a description of the problem along with any reproduction steps. Add the following labels so that we can track the incidents that have impacted composition analysis in the future.

<!--
Select one of the following severities
Ref: https://about.gitlab.com/handbook/engineering/infrastructure/engineering-productivity/issue-triage/#severity
-->
/label ~"severity::1"
/severity S1

/label ~"severity::2"
/severity S2

/label ~"severity::3"
/severity S3

/label ~"severity::4"
/severity S4

<!--
Select one of the following priorities
Ref: https://about.gitlab.com/handbook/engineering/infrastructure/engineering-productivity/issue-triage/#priority
-->
/label ~"priority::1"
/label ~"priority::2"
/label ~"priority::3"
/label ~"priority::4"

/label ~"section::sec" ~"devops::secure" ~"group::composition analysis" ~"type::bug" ~"bug::availability"

<!--
Select one of the following categories
-->
/label ~"Category:Dependency Scanning"
/label ~"Category:Container Scanning"
/label ~"Category:License Compliance"

Assign the incident to the engineer currently on the maintainership reaction rotation.
Link any related issues or Zoom meetings with the quick actions to record incident timeline events. Ensure that an event exists for the incident start, detection, resolution, and any other events that you feel are worth highlighting as part of the incident response.
Upon fixing the issue, include a detailed summary of the resolution and any initial follow-up actions that should be completed. Lastly, an entry for the incident should be added to the weekly composition analysis group meeting so that it may be reviewed with the entire group.
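The quick-action template for a release-failure incident asks for exactly one severity, one priority, and one category. A minimal sketch of that selection is shown here; the helper name `incident_quick_actions` and its parameters are hypothetical, while the label strings are copied from the template above.

```python
# Categories offered by the incident template above.
CATEGORIES = ("Dependency Scanning", "Container Scanning", "License Compliance")

# Fixed labels applied to every Composition Analysis release-failure incident.
GROUP_LABELS = (
    '/label ~"section::sec" ~"devops::secure" '
    '~"group::composition analysis" ~"type::bug" ~"bug::availability"'
)


def incident_quick_actions(severity: int, priority: int, category: str) -> str:
    """Build the quick-action block with one severity, priority and category."""
    if severity not in (1, 2, 3, 4):
        raise ValueError("severity must be between 1 and 4")
    if priority not in (1, 2, 3, 4):
        raise ValueError("priority must be between 1 and 4")
    if category not in CATEGORIES:
        raise ValueError(f"category must be one of {CATEGORIES}")
    lines = [
        f'/label ~"severity::{severity}"',
        f"/severity S{severity}",
        f'/label ~"priority::{priority}"',
        GROUP_LABELS,
        f'/label ~"Category:{category}"',
    ]
    return "\n".join(lines)
```

For example, `print(incident_quick_actions(2, 1, "Container Scanning"))` produces a block that can be pasted into the incident description or a comment.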

Example Incident(s)

PHP Composer segfaults in gemnasium analyzer

Maintenance triaging process

To help our Product Manager prioritize maintenance issues, the engineering team assigns them a priority label.

Leverage the Maintenance issues board.
For each open issue that has no Priority label (“Open” column), briefly investigate the issue (< 1h) and comment with your findings. Make sure the correct sub-category label is applied per our Work type classification (e.g. ~maintenance::refactor).

Code review

Upon joining the Composition Analysis group, team members are supposed to become either reviewers or maintainers for all projects maintained by the group. The process for becoming a maintainer is described in the general Code review guidelines.

Projects

The Composition Analysis group maintains several projects to provide our scanning capabilities.

Shared

common library

Dependency Scanning

Additional notes:

gemnasium-db is maintained by the Vulnerability Research group.

Container Scanning

container-scanning analyzer
Cluster Image Scanning related code, needed for Operational Container Scanning feature.

License-db

Operational Container Scanning

The OCS module is part of the gitlab-agent project which is maintained by the Environments group. The Composition Analysis group is responsible for maintaining only the OCS module.

Semver dialects gem

semver_dialects

Upstream scanner mirrors

As some of our analyzers rely on open source software, we include them in our security testing to increase coverage and reduce risk.

To do so, we mirror their repository and execute our security scans on them (when relevant):

The vulnerabilities reported on the currently used version of the scanner are automatically reported in the group level Vulnerability Report and triaged as part of our security vulnerabilities triaging process.

Setting up a mirror

create a new project in https://gitlab.com/gitlab-org/security-products/dependencies (blank project).
set up the project repository as a pull mirror of the upstream repository.
find the git tag that matches the version currently used by our analyzer (usually represented by the SCANNER_VERSION variable in the analyzer’s Dockerfile). Use exact commit if there is no git tag for the corresponding release we use.
create a branch from that ref following naming convention VERSION-security-checks where VERSION is the version of the upstream scanner we currently use (e.g. v6.12.0).
add a .gitlab-ci.yml configuration file to configure all compatible security scans.
make VERSION-security-checks the default branch, so that reported vulnerabilities are showing up on the dashboards and vulnerability reports.
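The branch-related steps above can be sketched as follows. This is an illustration only: the helper names are hypothetical, the remote name `origin` is an assumption, and the commands are returned as argument lists rather than executed. Creating the project, configuring the pull mirror, adding `.gitlab-ci.yml`, and changing the default branch are not covered.

```python
from typing import List, Optional


def security_checks_branch(version: str) -> str:
    """Apply the VERSION-security-checks naming convention."""
    return f"{version}-security-checks"


def branch_commands(
    version: str, ref: Optional[str] = None, remote: str = "origin"
) -> List[List[str]]:
    """Return git commands that create and publish the security-checks branch.

    `ref` is the git tag or exact commit matching the scanner version in use;
    it defaults to `version` on the assumption that a tag with that name exists.
    """
    start_point = ref or version
    branch = security_checks_branch(version)
    return [
        ["git", "fetch", "--tags", remote],                 # make upstream tags available locally
        ["git", "checkout", "-b", branch, start_point],     # branch from the tag or commit
        ["git", "push", "--set-upstream", remote, branch],  # publish the new branch
    ]
```

Each returned list could be passed to `subprocess.run(cmd, check=True)` from inside a clone of the mirror project. When the release has no tag, pass the exact commit SHA as `ref`.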

Updating an upstream scanner

We check for new releases of the upstream scanners on a monthly basis, as part of our release issue. When an update is available, a new issue is created using the update scanner issue template and added to the next milestone.

Every analyzer relying on an upstream scanner has a “How to update the upstream Scanner” section in their readme detailing the process. This includes a verification for possible new security vulnerabilities and a license check which are detailed below.

Security checks when updating an upstream scanner

Before releasing an analyzer with a newer version of its upstream scanner, we must ensure it is free of security vulnerabilities matching our current policy.

checkout the new tag (or commit) and create a new branch from it following naming convention NEW_VERSION-security-checks.
copy/paste the existing .gitlab-ci.yml configuration file from the current VERSION-security-checks branch.
if there are new findings matching our policy, address them according to our triage process.
only when the above-mentioned findings are fixed, update the default branch to be NEW_VERSION-security-checks and proceed with the update of the analyzer to use this newer version.
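The gating rule in the last two steps can be sketched as a simple check. The function name, its inputs, and the default policy set (Critical and High, taken from the Security Policy section) are assumptions made for illustration; in practice the findings come from the Vulnerability Report of the mirror project.

```python
from typing import FrozenSet, Iterable

# Assumed policy severities, taken from the Security Policy section.
DEFAULT_POLICY: FrozenSet[str] = frozenset({"critical", "high"})


def next_step(
    open_finding_severities: Iterable[str],
    new_version: str,
    policy: FrozenSet[str] = DEFAULT_POLICY,
) -> str:
    """Decide whether the default branch may be switched to the new version."""
    blocking = [s for s in open_finding_severities if s.lower() in policy]
    if blocking:
        # Findings matching the policy must go through triage first.
        return f"triage {len(blocking)} finding(s) before updating"
    return f"set default branch to {new_version}-security-checks"
```

For instance, `next_step(["High", "low"], "v6.13.0")` reports one blocking finding, while `next_step(["low"], "v6.13.0")` allows the default branch to move to `v6.13.0-security-checks`.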

License check when updating an upstream scanner

Before releasing an analyzer with a newer version of its upstream scanner, we must ensure its license has not changed or is still compatible with our policy.

Dashboards

Monitoring

Stage Group dashboard on Grafana

Last modified April 9, 2024: Add security automation failure documentation (7ce1402d)
