Product Direction - Monitor

This is the product direction for Monitor. If you'd like to discuss this direction directly with the product managers for Monitor, feel free to reach out to Sarah Waldner (Senior PM of the Monitor Group) or Kevin Chu (Group PM of Monitor).

Mission

The mission of the GitLab Monitor stage is to provide feedback on the health and performance of your system so that you can decrease the frequency and severity of incidents.

Landscape

The Monitor stage directly competes in several markets defined within our Ops Section, including Application Performance Monitoring (APM), Log Management, Infrastructure Monitoring, IT Service Management (ITSM), Digital Experience Management (DEM) and Product Analytics. The total addressable market for the Monitor stage is projected to be $2.7 billion by 2024.

All of these markets are well-established and crowded, with winning companies achieving spectacular growth as businesses continue to shift online.

Successful vendors, such as market leader Datadog, are leveraging a platform strategy to expand their markets (see Datadog's acquisition of Undefined Labs to expand beyond production applications and provide code insights during development, or its expansion into incident management in 2020). Competition among market leaders today is also geared toward making the whole stack observable for enterprises. New Relic's updated business model reflects the need for vendors to capture an increasing footprint (and spend) of enterprises while enabling future growth by making a significant part of their business free.

Vision

The vision of the Monitor stage is to enable DevOps teams to operate their applications by providing monitoring, observability, incident response, and feedback all within a single application, GitLab.

Strategy

To achieve our vision, our strategy is to:

  1. Focus first on user adoption and dogfooding of Incident Management
  2. Strengthen bi-directional product tie-in to other GitLab stage capabilities
  3. Build a boring Monitor/Observability solution that enables customers to start using GitLab and move away from expensive Monitoring vendors

Opportunities

  1. Instrumentation is commoditized. GitLab will not need to invest in agents since OpenTelemetry and most vendor agents are open source.
  2. With development shifting to cloud-native and the massive community-driven investment in tools and patterns, the opportunity to build boring solutions on top of cloud-native solutions plays right to GitLab's strength.
  3. Out-of-the-box monitoring capabilities save time and money and lower the bar on the expertise required for enterprises and start-ups. The ease with which most users can start monitoring their service using established vendors, such as Datadog or New Relic, and newer competitors like Honeycomb, is something we should strive to emulate.
  4. Monitoring is traditionally for production, but there are opportunities to shift monitoring tools and techniques left.
  5. Monitoring vendors are considered, in general, to be expensive. We can compete on cost when monitoring is part of the GitLab single application.

Challenges

  1. Monitoring vendors offer generous free tiers for smaller companies and complete solutions for enterprises.
  2. Market leaders are making huge investments and expanding the scope of their solutions. This makes them stickier with their customers.

What's next

The Monitor surface area is large. Rather than continuing to bring multiple products within the Monitor purview to market concurrently, GitLab has consolidated the majority of its current focus on Incident Management. This allows us to complete the smart feedback loop within a single DevOps platform as a first priority. As GitLab Incident Management develops, our users will benefit from collaborating on incident response within the same tool as their source code management, CI/CD, plan, and release workflows. This most effectively positions GitLab to gain market traction and user adoption.

The Monitor stage's goals from 2021-04 through 2021-07 are the following:

  1. Mature the Incident Management category so that the GitLab SRE team can dogfood it
  2. Continue to grow usage of the Incident Management category
    • Grow estimated SMAU for Monitor to 15,000 users

You can see our entire public backlog for Monitor at this link; filtering by labels or milestones will allow you to explore.

Changing direction on Monitoring and Observability

GitLab users can currently monitor their services and applications by leveraging GitLab to install Prometheus to a GitLab-managed cluster. Similarly, users can also install the ELK stack to do log aggregation and management. The advantage of using GitLab with these popular tools is that users can collaborate on monitoring in the same application they use for building and deploying their services and applications.

Since then, we have learned the following, which makes this particular strategy challenging:

  1. Not working by default - GitLab has to first manage the cluster, then get users to install additional applications and set up Prometheus exporters, before they can see a chart. Compare this to a vendor whose agent auto-instruments an application: the high bar is a barrier to adoption.
  2. Mostly self-service - Users are responsible for managing, scaling, and operating their Prometheus fleet and ELK stack in order to use GitLab metrics and logging. Not having to manage and pay for the associated infrastructure and people is among the main reasons organizations outsource these tasks to vendors. When an organization chooses to manage monitoring on its own, many are perfectly happy just using the open source tools directly, without GitLab as an additional layer that does not currently provide additional value.
  3. Wrong market - We targeted SMBs to use our tools as a cheaper solution relative to other vendors. The problem is that the total cost of ownership was not necessarily lower. Furthermore, since GitLab's solution was based on having GitLab manage the customer's Kubernetes cluster, and there wasn't necessarily a large overlap between SMB customers and those that used Kubernetes, our solution was constrained to a smaller target audience.

We are intentionally shifting our strategy to account for what we learned:

  1. Start with a SaaS-only solution.
  2. Leverage OpenTelemetry and potentially other open-source agents for auto-instrumentation. Potentially leverage the GitLab Kubernetes Agent to set up exporters.
  3. Lean into having an integrated monitoring/observability tool all within GitLab.

We will have limited bandwidth to make progress in this space within our development groups. In FY22, we are creating a Single-Engineer-Group to help us kickstart this effort.

Pricing

Monitor is a critical component for all software development and operations. The Monitor stage's tier strategy will be broken down by workflow as described below.

Free

To execute our land-and-expand strategy and to receive as much feedback as possible from our potential user base, Free contains the vast majority of the Monitor features, including metrics, logs, incident management, traces, and error management.

Limits:

Premium

Upcoming Premium Monitor functionality includes:

Ultimate

Upcoming Ultimate Monitor functionality includes:

Performance Indicators (PIs)

Our Key Performance Indicator for the Monitor stage is the Monitor SMAU (stage monthly active users).

As of December 2020, the Monitor SMAU is based on the Incident Management category. It is the count of unique users that interact with alerts and incidents. This PI will inform us if we are on the right path to provide meaningful incident response tools.
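As a rough illustration of the definition above, the count of unique users can be sketched as a set of distinct user IDs over filtered events. The event names, the tuple layout, and the use of a calendar month as the window are all assumptions made for this example; they do not describe GitLab's actual usage data or measurement window.

```python
from datetime import date

# Hypothetical event records: (user_id, event_type, day).
# The event names are illustrative assumptions, not GitLab's real schema.
EVENTS = [
    ("alice", "alert_acknowledged", date(2020, 12, 1)),
    ("bob", "incident_created", date(2020, 12, 3)),
    ("alice", "incident_commented", date(2020, 12, 9)),
    ("carol", "merge_request_opened", date(2020, 12, 10)),  # not an alert/incident event
    ("dave", "alert_resolved", date(2020, 11, 30)),  # outside December
]

# Assumed naming convention for events that count toward the metric.
MONITOR_PREFIXES = ("alert_", "incident_")


def monthly_active_users(events, year, month):
    """Count distinct users with at least one alert or incident event in the month."""
    users = {
        user
        for user, event_type, day in events
        if event_type.startswith(MONITOR_PREFIXES)
        and day.year == year
        and day.month == month
    }
    return len(users)
```

Because the result is the size of a set, a user who interacts with many alerts and incidents in the same month is still counted once.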

Workflows

There are a few workflows that are critical to our users in this stage.

Each of these workflows has a designated level of maturity; you can read more about our category maturity model to help you decide which categories you want to start using and when.

Monitoring - Instrument

This workflow is planned, but not yet available.
Direction

Monitoring - Triage

Start with the highest-level alert, use preconfigured dashboards to review relevant metrics, and enable ad-hoc visualization and immediate drill-down from time-sliced metrics into logs and traces on the same screen. This workflow is planned, but not yet available.

Direction

Monitoring - Resolve

This workflow is planned, but not yet available.
Documentation • Direction

Monitoring - Improve

This workflow is planned, but not yet available.
Direction

Categories

There are a few product categories that are critical for success here; each one is intended to represent what you might find as an entire product out in the market. We want our single application to solve the important problems solved by other tools in this space - if you see an opportunity where we can deliver a specific solution that would be enough for you to switch over to GitLab, please reach out to the PM for this stage and let us know.

Each of these categories has a designated level of maturity; you can read more about our category maturity model to help you decide which categories you want to start using and when.

Runbooks

Runbooks are a collection of documented procedures that explain how to carry out a particular process, be it starting, stopping, debugging, or troubleshooting a particular system. Executable runbooks allow operators to execute pre-written code blocks or database queries against a given environment. This category is at the "minimal" level of maturity.

Priority: low • Documentation • Direction

Metrics

GitLab collects and displays performance metrics for deployed apps, leveraging Prometheus. Developers can determine the impact of a merge and keep an eye on their production systems, without leaving GitLab. This category is at the "viable" level of maturity.

Priority: high • Documentation • Direction

Incident Management

Track incidents within GitLab, providing a consolidated location to understand the who, what, when, and where of the incident. Define service level objectives and error budgets, to achieve the desired balance of velocity and stability. This category is at the "viable" level of maturity.

Priority: high • Documentation • Direction
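To make the relationship between a service level objective and its error budget concrete, here is a minimal sketch for a request-based SLO. The function name, its parameters, and the request-counting model are assumptions chosen for illustration; they are not part of GitLab's Incident Management feature.

```python
def error_budget(slo_target, total_requests, failed_requests):
    """Return (allowed_failures, remaining_fraction) for a request-based SLO.

    slo_target is the fraction of requests that must succeed, e.g. 0.999.
    remaining_fraction is 1.0 when no budget is spent and negative when the
    budget has been exceeded.
    """
    if not 0 < slo_target < 1:
        raise ValueError("slo_target must be strictly between 0 and 1")
    if total_requests <= 0:
        raise ValueError("total_requests must be positive")
    # The budget is the share of requests that is allowed to fail.
    allowed_failures = (1 - slo_target) * total_requests
    remaining_fraction = 1 - failed_requests / allowed_failures
    return allowed_failures, remaining_fraction


# Example: a 99.9% target over one million requests allows roughly 1,000
# failures; 250 failures leaves about three quarters of the budget.
allowed, remaining = error_budget(0.999, 1_000_000, 250)
```

A team balancing velocity and stability might slow down risky releases as the remaining fraction approaches zero and resume once the budget recovers.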

Logging

GitLab makes it easy to view the logs distributed across multiple pods and services using log aggregation with Elastic Stack. Once Elastic Stack is enabled, you can view your aggregated Kubernetes logs across multiple services and infrastructure, go back in time, conduct infinite scroll, and search through your application logs from within the GitLab UI itself. This category is at the "viable" level of maturity.

Priority: medium • Documentation • Direction

Tracing

Tracing provides insight into the performance and health of a deployed application, tracking each function or microservice which handles a given request. This makes it easy to understand the end-to-end flow of a request, regardless of whether you are using a monolithic or distributed system. This category is at the "minimal" level of maturity.

Priority: medium • Documentation • Direction

GitLab Self-Monitoring

Self-managed GitLab instances come out of the box with great observability tools, reducing the time and effort required to maintain a GitLab instance.

Priority: low • Documentation • Direction

Error Tracking

Error tracking allows developers to easily discover and view the errors that their application may be generating. By surfacing error information where the code is being developed, efficiency and awareness can be increased. This category is at the "viable" level of maturity.

Priority: low • Documentation • Direction

Synthetic Monitoring

Proactively simulate, monitor, and report on success rates and executions for user actions and behavior pathways. This category is planned, but not yet available.

Priority: high • Direction
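Since this category is not yet available, the following is only a generic sketch of the idea: run scripted checks that stand in for user actions and report a success rate per check. The function name and the representation of a check as a zero-argument callable are assumptions for this example.

```python
from collections import Counter


def run_synthetic_checks(checks, runs=5):
    """Run each named check `runs` times and return its success rate.

    `checks` maps a name to a zero-argument callable that returns a truthy
    value on success. A raised exception counts as a failed execution.
    """
    successes = Counter()
    for _ in range(runs):
        for name, check in checks.items():
            try:
                if check():
                    successes[name] += 1
            except Exception:
                pass  # treat any error as a failed run of this check
    return {name: successes[name] / runs for name in checks}
```

In practice each callable would simulate a user pathway, such as loading a page or completing a sign-in, and the resulting rates could feed alerting thresholds.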

Product Analytics

This category is at the "minimal" level of maturity.
Priority: medium • Documentation

Upcoming Releases

13.10 (2021-03-22)

13.11 (2021-04-22)

14.0 (2021-05-22)

Other Interesting Items

There are a number of other issues that we find interesting and are considering, but that do not yet have a milestone set for delivery. Some are good ideas we want to do, but don't yet know when; some we may never get around to; some may be replaced by another idea; and some are just waiting for that right spark of inspiration to turn them into something special.

Remember that at GitLab, everyone can contribute! This is one of our fundamental values and something we truly believe in, so if you have feedback on any of these items you're more than welcome to jump into the discussion. Our vision and product are truly something we build together!
