Thanks for visiting this category strategy page on Metrics in GitLab. This category belongs to and is maintained by the APM group of the Monitor stage.
Please share feedback directly via email, Twitter, or on a video call. If you're a GitLab user and have direct knowledge of your Metrics usage, we'd especially love to hear your use case(s).
Metrics help users understand the health and status of their services and are essential for ensuring the reliability and stability of those services. Metrics are normally raw measurements of resource usage over time (e.g. measuring memory usage every 10 seconds). Some metrics represent the status of an operating system (CPU, memory usage). Others are tied to the specific functionality of a component (requests per second, latency, or error rates). The most straightforward metrics to begin with are those already exposed by your operating system, which makes them easier to collect (e.g. Kubernetes metrics). For other components, especially your applications, you may have to add code or interfaces to expose the metrics you care about. Exposing metrics is sometimes known as instrumentation; collecting metrics from an endpoint is called scraping.
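To make instrumentation and scraping concrete, here is a minimal sketch that uses only the Python standard library. It exposes one counter in the Prometheus text exposition format and then scrapes it once. The metric name `app_requests_total`, the `REQUESTS_TOTAL` dictionary, and the helper names are hypothetical and chosen for illustration; a real service would more likely rely on a Prometheus client library than hand-render the format.

```python
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.request import urlopen

# Hypothetical in-process counter, keyed by HTTP method (illustrative only).
REQUESTS_TOTAL = {"GET": 0, "POST": 0}


def render_metrics(counts):
    """Render the counters in the Prometheus text exposition format."""
    lines = [
        "# HELP app_requests_total Total requests handled, by HTTP method.",
        "# TYPE app_requests_total counter",
    ]
    for method, value in sorted(counts.items()):
        lines.append(f'app_requests_total{{method="{method}"}} {value}')
    return "\n".join(lines) + "\n"


class MetricsHandler(BaseHTTPRequestHandler):
    """Instrumentation side: serve the current metric values on /metrics."""

    def do_GET(self):
        if self.path != "/metrics":
            self.send_error(404)
            return
        body = render_metrics(REQUESTS_TOTAL).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "text/plain; version=0.0.4")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the demo quiet


def scrape(url):
    """Scraping side: fetch the exposed metrics from an endpoint."""
    with urlopen(url) as response:
        return response.read().decode("utf-8")


def demo():
    """Start the endpoint on a free local port, scrape it once, then stop."""
    server = HTTPServer(("127.0.0.1", 0), MetricsHandler)  # port 0 = any free port
    thread = threading.Thread(target=server.serve_forever, daemon=True)
    thread.start()
    try:
        REQUESTS_TOTAL["GET"] += 1  # simulate handling one request
        port = server.server_address[1]
        return scrape(f"http://127.0.0.1:{port}/metrics")
    finally:
        server.shutdown()
        server.server_close()
```

Calling `print(demo())` shows the text a scraper such as Prometheus would receive from the endpoint. In a real deployment the scraper polls the endpoint on an interval and stores each sample as a time series.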
Provide users with information about the health and performance of their infrastructure, applications, and systems for insights into reliability, stability, and performance.
Metrics are essential for all users across the DevOps spectrum, from developers who should understand the performance impact of the changes they are making to operators responsible for keeping production services online.
Our vision is that in 1-2 years, GitLab metrics is the main day-to-day tool for monitoring cloud-native applications for SMBs.
Since the team's current focus is dogfooding metrics, our immediate target audience is the GitLab Infrastructure Team. We plan to build the minimal work needed for them to start dogfooding our metrics dashboard before shifting focus to consider the overall needs of the target audience mentioned above.
Today, you can deploy a Prometheus instance into a project cluster at the push of a button; the Prometheus instance runs as a GitLab managed application. Once deployed, it automatically collects key metrics from the running application, which are displayed on an out-of-the-box dashboard. Our dashboards give you the flexibility to display any metric you want: you can set up alerts, configure variables on a generic dashboard, drill into the relevant logs to troubleshoot your service, and more. If you already have Prometheus running in your cluster, simply connect it to GitLab and start using the GitLab metrics dashboard.
The target workflow, listed below, is our high-level roadmap. It is based on competitive analysis, user research, and customer interviews. The details of each workflow are listed in the epics and issues.
The first step in application performance management is collecting the proper measurements or telemetry data. Instrumenting critical areas and reporting metrics of your system are prerequisites to understanding the health and performance of your services and application. Our metric solution is powered by Prometheus, targeting users of Kubernetes. We need to make sure our users can successfully
Once you've collected a set of metrics, the next step is to see those metrics in a dashboard.
Users need to be alerted on any threshold violation, ideally before their end-users
Our vision is that in 1-2 years, GitLab metrics will be the primary day-to-day tool for monitoring cloud-native applications. To achieve that we would need to support:
Because cloud-native applications are distributed, it is crucial to collect logs across multiple services and infrastructure and present them in an aggregated view, so users can quickly search through logs that originate from multiple pods and containers. Metrics and logs are related, and we intend to make the correlation for users so that they can get to the answers they are looking for more quickly. You can review our logging direction page for more information.
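The idea of an aggregated, searchable log view can be sketched in a few lines of Python. This is an illustration only, not how GitLab implements log aggregation: the `LogLine` type and the `aggregate` and `search` helpers are hypothetical, and the sketch assumes each pod's stream is already sorted by timestamp.

```python
import heapq
from datetime import datetime
from typing import Iterable, Iterator, List, NamedTuple


class LogLine(NamedTuple):
    timestamp: datetime
    pod: str
    message: str


def aggregate(streams: Iterable[Iterable[LogLine]]) -> Iterator[LogLine]:
    """Merge per-pod streams (each already time-sorted) into one timeline."""
    return heapq.merge(*streams, key=lambda line: line.timestamp)


def search(lines: Iterable[LogLine], term: str) -> List[LogLine]:
    """Case-insensitive substring search over the aggregated view."""
    needle = term.lower()
    return [line for line in lines if needle in line.message.lower()]
```

For example, `search(aggregate([web_logs, db_logs]), "timeout")` would return matching lines from both pods in chronological order, where `web_logs` and `db_logs` are placeholder per-pod lists of `LogLine` values.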
We are actively dogfooding GitLab Metrics with the Infrastructure team to migrate the dashboards used for monitoring GitLab.com from Grafana to GitLab Metrics. This will help us receive rapid feedback as we mature metrics. In terms of overall roles and responsibilities:
We will iterate based on the following process:
| General SLAs | https://gitlab.com/gitlab-org/monitor/sandbox/test-metrics-dashboard/-/environments/2115574/metrics?dashboard=.gitlab%252Fdashboards%252Fgeneral-slas.yml | Tools for Engineers SLA Dashboard link | Blocked by #219726 |
| Public Dashboard Landing Page | https://gitlab.com/gitlab-org/monitor/sandbox/test-metrics-dashboard/-/environments/2115574/metrics?dashboard=.gitlab%252Fdashboards%252Fpublic-dashboard-splash-screen.yml | Infrastructure KPI 6 image link | Awaiting Feedback |
| Cloudflare Traffic Overview | https://gitlab.com/gitlab-org/monitor/sandbox/test-metrics-dashboard/-/environments/2115574/metrics?dashboard=.gitlab%252Fdashboards%252Fcloudflare-traffic-overview.yml | N/A | Awaiting Feedback |
As mentioned above, adding a metric to a dashboard and adding new dashboards are basic functions a monitoring solution should have. Today, these can be quite confusing for a first-time GitLab Metrics user. We are actively working on improving these workflows to provide a better onboarding experience for our users. Detailed information is available in the following: