Monitor Stage

Groups

The groups within this stage are:

Vision

Using GitLab, you automatically get broad and deep insight into the health of your deployment.

Mission

We provide a robust monitoring solution that gives GitLab users insight into the performance and availability of their deployments and alerts them to problems as soon as they arise. We provide data that is easy to digest and to relate to other features in GitLab. With every piece of the DevOps lifecycle integrated into GitLab, we have a unique opportunity to tie our monitoring features closely to all of the other pieces of the DevOps flow.

We work collaboratively and transparently, and we will contribute as much of our work as possible back to the open source community.

Responsibilities

The monitoring team is responsible for:

This team maps to Monitor Stage.

How to work with Monitor

Adding new metrics to GitLab

The Monitor Stage is responsible for providing the underlying libraries and tools to enable GitLab team-members to instrument their code. When adding new metrics, we need to consider a few facets: the impact on GitLab.com, customer deployments, and whether any default alerting rules should be provided.

Recommended process for adding new metrics:

  1. Open an issue in the desired project outlining the new metrics desired
  2. Label with the ~Monitoring label, and ping @gl-monitoring for initial review
  3. During implementation, consider:
     - The Prometheus naming and instrumentation guidelines
     - Impact on cardinality and performance of Prometheus
     - Whether any alerts should be created
  4. Assign to an available Monitor Stage reviewer
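
The naming and cardinality considerations in the process above can be illustrated with a small pre-review check. This is a minimal sketch, not a Monitor Stage tool: the function `check_metric`, its parameters, and the `max_series` budget of 1000 are hypothetical and chosen only for illustration. The name patterns follow the Prometheus data model, and the `_total` suffix for counters follows the Prometheus naming guidelines.

```python
import re
from math import prod

# Metric and label name patterns from the Prometheus data model.
METRIC_NAME_RE = re.compile(r"^[a-zA-Z_:][a-zA-Z0-9_:]*$")
LABEL_NAME_RE = re.compile(r"^[a-zA-Z_][a-zA-Z0-9_]*$")


def check_metric(name, kind, label_values, max_series=1000):
    """Return a list of review warnings for a proposed metric.

    label_values maps each label name to the number of distinct values
    it is expected to take. max_series is a hypothetical review budget,
    not a limit defined by Prometheus or by GitLab.
    """
    warnings = []
    if not METRIC_NAME_RE.match(name):
        warnings.append(f"invalid metric name: {name}")
    if kind == "counter" and not name.endswith("_total"):
        warnings.append("counter names should end in _total")
    for label in label_values:
        # Label names starting with "__" are reserved for internal use.
        if not LABEL_NAME_RE.match(label) or label.startswith("__"):
            warnings.append(f"invalid or reserved label name: {label}")
    # Worst-case series count is the product of per-label value counts.
    series = prod(label_values.values())
    if series > max_series:
        warnings.append(
            f"estimated {series} series exceeds budget of {max_series}"
        )
    return warnings
```

For example, `check_metric("http_requests", "counter", {"user_id": 100000})` flags both the missing `_total` suffix and the unbounded `user_id` label, which is the kind of cardinality problem worth raising in the issue before implementation starts.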

Repos we own or use

Async Daily Standups

The purpose of our async standups is to give every team member insight into what everyone else is doing and whether anyone is blocked and could use help. This should not be an exhaustive list of all of your tasks for the day, but rather a summary of the major deliverable you are hoping to achieve. All question prompts are optional. We use the Geekbot Slack plugin to automate our async standup in the #g_monitor channel.

Recurring Meetings

While we try to keep our process pretty light on meetings, we do have a few recurring meetings to keep in sync and to keep our backlog in good shape. We hold the Monitor Stage Weekly Meeting to discuss agenda items that have been added over the course of the week and to walk through our current issue board together. We also hold a Monitor Backlog Grooming meeting weekly to triage and prioritize new issues, discuss our upcoming issues, and uncover any unknowns. Both meetings are held on Thursday.

There is also an optional Monitor Social Hour meeting every week. This call has no agenda and alternates times every other week to be more inclusive of team members in different time zones.

Monitor Stage PTO

Just like the rest of the company, we use PTO Ninja to track when team members are traveling, attending conferences, and taking time off. The easiest way to see who has upcoming PTO is to run the /ninja whosout command in the #g_monitor_standup Slack channel. This will show you the upcoming PTO for everyone in that channel.