
Monitor Stage

Groups

The groups within this stage are:

Vision

Using GitLab, you automatically get broad and deep insight into the health of your deployment.

Mission

We provide a robust monitoring solution to give GitLab users insight into the performance and availability of their deployments and alert them to problems as soon as they arise. We provide data that is easy to digest and to relate to other features in GitLab. With every piece of the DevOps lifecycle integrated into GitLab, we have a unique opportunity to closely tie our monitoring features to all of the other pieces of the DevOps flow.

We work collaboratively and transparently and we will contribute as much of our work as possible back to the open source community.

Responsibilities

The monitoring team is responsible for:

This team maps to Monitor Stage.

How to work with Monitor

Surfacing blockers

To surface blockers, mention your Engineering Manager in the relevant issues, and then contact them via Slack and/or in your 1:1s. Also make sure to raise any blockers in your daily async standup using Geekbot.

The Engineering Managers want to make unblocking their teams their highest priority. Please don't hesitate to raise blockers.

Scheduling issues in milestones

The Product Manager is responsible for scheduling issues in a given milestone. During the backlog grooming portion of our weekly meeting, all parties will make sure that issues are scoped and well defined enough to implement, and will determine whether they need UX involvement and/or technical investigation.

As we approach the start of the milestone, Engineering Managers are responsible for adding the ~deliverable label to communicate which issues we are committing to finish in the given milestone. Generally, the Engineering Manager will use the prioritized order of issues in the milestone to determine which issues to label as ~deliverable. The Product Manager will have follow-up conversations with the Engineering Managers if the deliverables do not meet their expectations or if there are other tradeoffs we should make.

Scheduling bugs

When new bugs are reported, the engineering managers ensure that they have proper Priority and Severity labels. Bugs are discussed during our backlog grooming session and are scheduled according to severity, priority, and the capacity of the teams. Ideally, we should work on a few bugs each release regardless of priority or severity.

Weekly async issue updates

Every Friday, each engineer is expected to provide a quick async issue update by commenting on their assigned issues using the following template:

<!---
Please be sure to update the workflow labels of your issue to one of the following (whichever best describes the status):
- ~"workflow::In dev"
- ~"workflow::In review"
- ~"workflow::verification"
- ~"workflow::blocked"
-->
### Async issue update
1. Please provide a quick summary of the current status (one sentence).
1. When do you predict this feature to be ready for maintainer review?
1. Are there any opportunities to further break the issue or merge request into smaller pieces (if applicable)?

We do this to encourage our team to be more async in collaboration and to allow the community and other team members to know the progress of issues that we are actively working on.
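As an illustration only, the template above could be filled in programmatically. The sketch below is a minimal example, not existing team tooling: the function name, its parameters, and the answer wording are hypothetical. It assumes the optional workflow label is applied with GitLab's `/label` quick action on its own line of the comment.

```python
# Workflow labels listed in the template's comment block.
WORKFLOW_LABELS = (
    "workflow::In dev",
    "workflow::In review",
    "workflow::verification",
    "workflow::blocked",
)


def build_async_update(summary, review_eta, breakdown="", workflow_label=None):
    """Return the Markdown body of a weekly async issue update comment.

    Hypothetical helper for illustration; not part of any GitLab tooling.
    """
    lines = [
        "### Async issue update",
        f"1. Status: {summary}",
        f"1. Expected ready for maintainer review: {review_eta}",
        f"1. Opportunities to break this down further: {breakdown or 'None identified'}",
    ]
    if workflow_label is not None:
        # Reject labels that are not in the template's list.
        if workflow_label not in WORKFLOW_LABELS:
            raise ValueError(f"unknown workflow label: {workflow_label}")
        # Assumption: a quick action on its own line sets the label on posting.
        lines.append(f'/label ~"{workflow_label}"')
    return "\n".join(lines)


if __name__ == "__main__":
    print(build_async_update(
        "Backend is merged; frontend is in progress.",
        "Next Wednesday",
        workflow_label="workflow::In dev",
    ))
```

The returned text can be pasted into the issue comment box as is.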

Interacting with community contributors

Community contributions are encouraged and prioritized at GitLab. Please check out the Contribute page on our website for guidelines on contributing to GitLab overall.

Within the Monitor stage, Product Management will assist a community member with questions regarding priority and scope. If a community member has technical questions on implementation, Engineering Managers will connect them with engineers within the team to collaborate with.

Using spikes to inform design decisions

Engineers use spikes to conduct research, prototyping, and investigation to gain knowledge necessary to reduce the risk of a technical approach, better understand a requirement, or increase the reliability of a story estimate (paraphrased from this overview). When we identify the need for a spike for a given issue, we will create a new issue, conduct the spike, and document the findings in the spike issue. We then link to the spike and summarize the key decisions in the original issue.

Preparing UX designs for engineering

Product designers generally try to work one milestone ahead of the engineers, to ensure scope is defined and agreed upon before engineering starts work. So, for example, if engineering is planning on getting started on an issue in 12.2, designers will assign themselves the appropriate issues during 12.1, making sure everything is ready to go before 12.2 starts.

To make sure this happens, early planning is necessary. In the example above, we'd need to know by the end of 12.0 what will be needed for 12.2 so that we can work on it during 12.1. This takes a lot of coordination between UX and the PMs. We can (and often do) pick up smaller things as they come up and when priorities change. Generally, though, we have a set of assigned tasks in place by the time each milestone starts, so anything else we take on is in addition to those existing tasks and depends on additional capacity.

The current workflow:

Repos we own or use

Service accounts we own or use

Zoom sandbox account

To develop and test Zoom features for the integration with GitLab, we have our own Zoom sandbox account.

Requesting access

To request access to this Zoom sandbox account, please open an issue providing your non-GitLab email address (which can already be associated with an existing non-GitLab Zoom account).

The following people are owners of this account and can grant access to other GitLabbers:

Granting access

  1. Log in to Zoom with your non-GitLab email
  2. Go to User Management > Users
  3. Click on Add User
  4. Specify email addresses
  5. Choose User Type - most likely Pro
  6. Click Add - the users receive invitations via email
  7. Add the linked name to the list in "Requesting access"

Documentation

For more information on how to use Zoom, see their guides and API reference.

Async Daily Standups

We use the Geekbot Slack plugin to automate our async standup, following the guidelines outlined in the Geekbot commands guide. Answers are concise and focused on top-priority items. All question prompts are optional and only answered when the information should be surfaced to the team:

Recurring Meetings

Every other week, we have a Monitor Stage Demo Hour for engineering and design demos by members of the Monitor Stage group. Demos are voluntary and on a sign-up basis.

There is also an optional Monitor Social Hour meeting every week. This call has no agenda and alternates times every other week to be more inclusive of team members in different time zones.

The Health and APM groups have their own regular meetings as well.

Retrospective

We follow the same retrospective process as the rest of the engineering department, which can be found here.

To encourage a more iterative retrospective process, we create a new retrospective issue at the beginning of each milestone, using the Monitor retrospective template. We leave this issue open for the duration of the milestone so any team member can add feedback as it happens instead of waiting until the end of the milestone.

Monitor Stage PTO

Just like the rest of the company, we use PTO Ninja to track when team members are traveling, attending conferences, and taking time off. The easiest way to see who has upcoming PTO is to run the /ninja whosout command in the #g_monitor_standup Slack channel. This will show you the upcoming PTO for everyone in that channel.

Useful Resources