GitLab recognizes that the Handbook is a critical part of empowering team members to do their jobs effectively. As such, we have implemented a basic on-call process (refer to First-response Service Level Objective below) to ensure that someone is available to assist team members if something is broken in the handbook or if they are having trouble making updates to it.
Any issues should be reported in the #handbook-escalation channel in Slack.
Issues should only be escalated to the Handbook On-Call team if they relate to:
All team members from the Static Site Editor team are part of the on-call process and are members of the #handbook-escalation channel. Additionally, any GitLab team member can volunteer to join the #handbook-escalation channel and help out.
We are not implementing a schedule roster for being on-call initially, but we will consider doing so if the need arises.
Check #is-this-known to see if it's a known issue with infrastructure or other problems.
The Handbook On-Call deals specifically with matters relating to the www-gitlab-com repo source code and configuration.
If a reported issue relates to the GitLab product or the infrastructure running the https://about.gitlab.com website, it should be escalated to the Reliability Engineering team.
To report an incident follow the instructions on the Incident Management page: /handbook/engineering/infrastructure/incident-management/#reporting-an-incident
All incidents reported in the #handbook-escalation channel during weekdays (Mon - Fri, 08:00 UTC+0 - 18:00 UTC-7) should receive an initial acknowledgement within 1 hour of being reported.
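The coverage window mixes two UTC offsets, so it can help to normalize it to UTC. The sketch below is illustrative only and rests on an assumed reading of the window: each weekday's coverage opens at 08:00 UTC and closes at 18:00 UTC-7, which is 01:00 UTC on the following day. The function names are hypothetical and not part of any GitLab tooling.

```python
from datetime import datetime, time, timedelta, timezone
from typing import Optional

# Assumption: each weekday window opens at 08:00 UTC and closes at
# 18:00 UTC-7, i.e. 01:00 UTC on the following calendar day (17 hours later).
WINDOW_OPEN_UTC = time(8, 0)
WINDOW_LENGTH = timedelta(hours=17)
RESPONSE_TARGET = timedelta(hours=1)


def in_coverage_window(reported_at: datetime) -> bool:
    """Return True if a timezone-aware timestamp falls inside a weekday window."""
    moment = reported_at.astimezone(timezone.utc)
    # A window that opened yesterday can still be running after midnight UTC,
    # so check the window opened today and the one opened yesterday.
    for days_back in (0, 1):
        day = (moment - timedelta(days=days_back)).date()
        if day.weekday() >= 5:  # no window opens on Saturday or Sunday
            continue
        opens = datetime.combine(day, WINDOW_OPEN_UTC, tzinfo=timezone.utc)
        if opens <= moment < opens + WINDOW_LENGTH:
            return True
    return False


def response_due(reported_at: datetime) -> Optional[datetime]:
    """Return when the first acknowledgement is due, or None outside the window."""
    if not in_coverage_window(reported_at):
        return None
    return reported_at + RESPONSE_TARGET
```

Under this reading, a report at 00:30 UTC on a Saturday still falls inside Friday's window, while a report at 00:30 UTC on a Monday does not, because no window opens on Sunday.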
There is also a runbook for about.gitlab.com incident handling.
All broken CI pipelines for the master branch of the www-gitlab-com repo are automatically posted in the Slack channel.
These reports should be investigated and addressed where needed.
Once a report has been looked at, please leave a comment stating the nature of the problem, action taken and add a ✅ reaction to the message to show that it has been handled.
If for some reason a large number of failures is spamming the channel, the error reporting can be turned off in the repo settings: https://gitlab.com/gitlab-com/www-gitlab-com/-/services/slack/edit
To see the status of the merge train (useful when team members report that their MRs seem 'stuck' on the train), open the following link: https://gitlab.com/api/v4/projects/7764/merge_trains?scope=active&per_page=100. You can search for and count the occurrences of iid to get an idea of how many MRs are currently in the train (up to the maximum allowed by the per_page parameter).
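Counting the iid occurrences by hand can be replaced with a small script. The following is a minimal sketch, not an official tool. It assumes the endpoint returns a JSON array in which each entry carries a `merge_request` object with an `iid` field, and that the URL is readable without authentication. If it is not, a token would need to be supplied. The function names are hypothetical.

```python
import json
from urllib.request import urlopen

# URL taken from the handbook text; 7764 is the project ID used there.
MERGE_TRAIN_URL = (
    "https://gitlab.com/api/v4/projects/7764/merge_trains"
    "?scope=active&per_page=100"
)


def merge_request_iids(entries):
    """Collect the merge request iid from each merge train entry.

    Assumes each entry looks like {"merge_request": {"iid": ...}, ...};
    entries without that shape are skipped.
    """
    iids = []
    for entry in entries:
        merge_request = entry.get("merge_request") or {}
        if "iid" in merge_request:
            iids.append(merge_request["iid"])
    return iids


def fetch_active_train(url=MERGE_TRAIN_URL):
    """Download and parse one page of active merge train entries."""
    with urlopen(url, timeout=30) as response:
        return json.load(response)
```

Calling `merge_request_iids(fetch_active_train())` and taking the length of the result gives the number of MRs on the first page of the active train. A train longer than one page would be undercounted.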
The problem of MRs getting stuck on the train is being addressed in this issue, and a workaround is documented at https://gitlab.com/gitlab-org/gitlab/-/issues/217908#when-the-merge-train-in-the-www-gitlab-com-project-might-be-stuck
See also the issue for looking into the backend Gitaly performance problems related to this, specifically links to APDEX charts.