This living guide explains the why, when, and how of security incident response at GitLab.
If an urgent security incident has been identified or you suspect an incident may have occurred, the Security Engineer On Call can be paged by:
/security <issue description> in GitLab Slack
Paging security creates a new issue to track the incident being reported. Please provide as much detail as possible in this issue to aid the Security Engineer On Call in their investigation. The Security Engineer On Call will respond to the page within 15 minutes and may have questions that require synchronous communication with the incident reporter. When paging security, please be prepared to be available for this synchronous communication during the initial stage of the incident response.
For lower severity requests or general Q&A, GitLab Security is available in the #Security channel in GitLab Slack, and the Security Operations team can be alerted by mentioning @sec-ops-team.
If you suspect you've received a phishing email and have not engaged with the sender, please see: What to do if you suspect an email is a phishing attack. If you have engaged a phisher by replying to an email, clicking on a link, sending or receiving text messages, or purchasing goods requested by the phisher, page security as described above.
The Security Operations team is on-call 24/7/365 to assist with any security incidents. Information about SecOps on-call responsibilities and incident ownership is available in the On-Call Guide.
Security incident investigations are initiated when a security event has been detected on GitLab.com or within the GitLab company. These investigations are handled with the same level of urgency and priority regardless of whether they affect a single user or multiple projects.
Indicators can be reported to Security Operations either internally, by a GitLab team member, or externally. It is the Security team's responsibility to determine when to investigate, dependent on the identification and verification of a security incident.
The GitLab Security team identifies security incidents as any violation, or threat of violation, of GitLab security, acceptable use or other relevant policies.
Security incidents may (and usually do) involve sensitive information related to GitLab, GitLab's customers or employees, or users who (in one way or another) have engaged with GitLab. GitLab, while codifying the Transparency value, also strongly believes in and strives to maintain the privacy and confidentiality of the data its employees, customers, and users have entrusted to it.
A confidential issue means any data within the issue and any discussions about the issue or investigation are to be kept to GitLab employees only unless permission is explicitly granted by GitLab Legal, GitLab Security Director, or the GitLab Executive Team.
Security incident investigations must begin by opening a tracking issue in the Security Operations project and using the Incident Response template. This tracking issue will be the primary location where all work and resulting data collection will reside throughout the investigation.
All artifacts from an investigation must be handled per the Artifact Handling and Sharing internal only runbook.
NOTE: The tracking issue, any collected data, and all other engagements involved in a Security Incident must be kept strictly confidential.
Assigning severity to an incident isn't an exact science; it takes rational concepts mixed with past experience and gut feeling to decide how bad a situation may be. When considering severity, look at:
After taking these types of questions into consideration, review the Overall Impact to help place a severity rating on the incident.
Coordinate with internal teams and prepare for the incident investigation:
#secops_#### where #### is the GitLab issue number in the Security Operations project.
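As an illustration of the naming convention above, the sketch below builds the name from an issue number. The function name `incident_channel_name` is hypothetical, and the sketch assumes the issue number is used as-is, without zero padding, which this guide does not specify.

```python
def incident_channel_name(issue_number: int) -> str:
    """Build a name of the form #secops_<issue number>.

    Assumption: the issue number is inserted as-is (no zero padding).
    """
    if issue_number <= 0:
        raise ValueError("issue_number must be a positive integer")
    return f"#secops_{issue_number}"


if __name__ == "__main__":
    # Example with a made-up issue number.
    print(incident_channel_name(1234))
```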
In the event that an incident needs to be escalated within GitLab, the Security Engineer On Call will page the Security Incident Manager On Call (SIMOC). It is the responsibility of the SIMOC to direct response activities, gather technical resources from required teams, coordinate communication efforts with the Communications Manager On Call, and further escalate the incident as necessary.
Characteristics of an incident requiring escalation include but are not limited to the following:
Once an incident has been identified and its severity set, the incident responder must attempt to limit the damage that has already occurred and prevent any further damage from occurring.
The first step in this process is to identify impacted resources and determine a course of action to contain the incident while potentially also preserving evidence. Containment strategies will vary based on the type of incident but can be as simple as marking an issue confidential to prevent information disclosure or blocking access to a network segment.
It's important to remember that the containment phase is typically a stop-gap measure to limit damage, not a long-term fix for the underlying problem. Additionally, the impact of the mitigation on the service must be weighed against the severity of the incident.
During the remediation and recovery phase the incident responder will work to ensure impacted resources are secured and prepared to return the service to the production environment. This process may involve removing malicious or illicit content, updating access controls, deploying patches and hardening systems, redeploying systems completely, or a variety of other tasks depending on the type of incident.
A Root Cause Analysis will be completed to guide the remediation and recovery process. Careful planning is required to ensure successful recovery and prevention of repeat incidents. The incident responder coordinates with impacted teams to test and validate all remediations prior to deployment.
This phase should prioritize short-term changes that improve the overall security of impacted systems; the full recovery process may take several months as longer-term improvements are developed.
Upon completing the containment, remediation, communication and verification of impacted services, the incident will be considered resolved and the incident issues may be closed.
The incident response process then moves on to a post-mortem and lessons learned phase, in which process improvements are identified and the overall security of the organization is analyzed and strengthened.
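The phases described in this guide (identification, severity assignment, containment, remediation and recovery, resolution, and post-mortem) can be pictured as an ordered sequence. The following is a minimal illustrative sketch in Python, not GitLab tooling; the `Phase` enum and the `next_phase` helper are hypothetical names introduced here, and real incidents may revisit earlier phases.

```python
from enum import Enum
from typing import Optional


class Phase(Enum):
    """Incident phases in the order this guide describes them."""
    IDENTIFICATION = "identification"
    SEVERITY_ASSIGNMENT = "severity assignment"
    CONTAINMENT = "containment"
    REMEDIATION_AND_RECOVERY = "remediation and recovery"
    RESOLVED = "resolved"
    POST_MORTEM = "post-mortem and lessons learned"


# Enum members iterate in definition order, which gives the phase sequence.
_ORDER = list(Phase)


def next_phase(current: Phase) -> Optional[Phase]:
    """Return the phase that follows `current`, or None after the last one."""
    idx = _ORDER.index(current)
    return _ORDER[idx + 1] if idx + 1 < len(_ORDER) else None


if __name__ == "__main__":
    # Walk the full sequence from the first phase to the last.
    phase: Optional[Phase] = Phase.IDENTIFICATION
    while phase is not None:
        print(phase.value)
        phase = next_phase(phase)
```

Modeling the phases as an ordered enum keeps the sequence in one place; a team that needs to move backwards (for example, from remediation back to containment) would need an explicit transition table instead.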