Analytics Instrumentation Guide

Analytics Instrumentation Overview

At GitLab, we collect product usage data to help us build a better product. Data helps GitLab understand which parts of the product need improvement and which features we should build next. Product usage data also helps our team better understand why people use GitLab. With this knowledge, we can make better product decisions.

Several stages and teams are involved in going from collecting data to making it useful for our internal teams and customers.

| Stage | Description | DRI | Support Teams |
| --- | --- | --- | --- |
| Privacy Settings | The implementation of our Privacy Policy including data classification, data access, and user settings to control what data is shared with GitLab. | Analytics Instrumentation | Legal, Data |
| Collection | The data collection tools used across all GitLab applications including GitLab SaaS, GitLab self-managed, CustomerDot, VersionDot, and about.gitlab.com. Our current tooling includes Snowplow, Service Ping, and Google Analytics. | Analytics Instrumentation | Infrastructure |
| Extraction | The data extraction tools used to extract data from Product, Infrastructure, and Enterprise Apps data sources. Our current tooling includes Stitch, Fivetran, and custom tooling. | Data | |
| Loading | The data loading tools used to extract data from Product, Infrastructure, and Enterprise Apps data sources and load it into our data warehouse. Our current tooling includes Stitch, Fivetran, and custom tooling. | Data | |
| Orchestration | The orchestration of extraction and loading tooling to move data from sources into the Enterprise Data Warehouse. Our current tooling includes Airflow. | Data | |
| Storage | The Enterprise Data Warehouse (EDW), which is the single source of truth for GitLab’s corporate data, performance analytics, and enterprise-wide data such as Key Performance Indicators. Our current EDW is built on Snowflake. | Data | |
| Transformation | The transformation and modelling of data in the Enterprise Data Warehouse in preparation for data analysis. Our current tooling is dbt and Python scripts. | Data | Analytics Instrumentation |
| Analysis | The analysis of data in the Enterprise Data Warehouse using a querying and visualization tool. Our current tooling is Tableau. | Data, Product Data Analysis | Analytics Instrumentation |

| Resource | Description |
| --- | --- |
| Getting started with Analytics Instrumentation | The guide covering implementation and usage of Analytics Instrumentation tools |
| Metrics Dictionary | A single source of truth (SSoT) for all collected metrics and events |
| Privacy Policy | Our privacy policy outlining what data we collect and how we handle it |
| Product Usage Data Privacy Policy | Our privacy policy outlining the product usage data we collect and how we handle it |
| Analytics Instrumentation Direction | The roadmap for Analytics Instrumentation at GitLab |
| Analytics Instrumentation Development Process | The development process for the Analytics Instrumentation groups |

2024-05-16: last page update


Our Commitment to Individual User Privacy in relation to Service Usage Data
While there are examples of data collection used with malicious intent, data collection and analysis have also allowed companies to improve their products and services, benefiting their end users and consumers. It is in this vein that GitLab collects usage data about its products. We collect individual usage data in a pseudonymized manner at the namespace level and then use this information to power our product decisions and improve GitLab for you.
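
To illustrate what pseudonymizing data at the namespace level can look like in general, here is a minimal Python sketch. It is not GitLab's actual implementation. The keyed-hash approach (HMAC-SHA256), the `SECRET_KEY` value, and the `pseudonymize` and `build_event` helpers are all assumptions made for this example. The idea is that a usage event carries a stable stand-in for the namespace rather than the raw identifier, so events from the same namespace can still be grouped for analysis.

```python
import hashlib
import hmac

# Hypothetical secret key for illustration only; a real system would
# load its key from secure storage rather than hard-coding it.
SECRET_KEY = b"example-secret-not-for-production"


def pseudonymize(namespace_id: str, key: bytes = SECRET_KEY) -> str:
    """Return a stable keyed hash that stands in for the namespace identifier."""
    return hmac.new(key, namespace_id.encode("utf-8"), hashlib.sha256).hexdigest()


def build_event(namespace_id: str, action: str) -> dict:
    """Build a usage event that carries only the pseudonymized namespace."""
    return {"namespace": pseudonymize(namespace_id), "action": action}


if __name__ == "__main__":
    # Two events from the same (hypothetical) namespace share one pseudonym.
    first = build_event("42", "create_issue")
    second = build_event("42", "push_commit")
    print(first["namespace"] == second["namespace"])
```

Because the same input and key always produce the same hash, analysis can count activity per namespace without the events exposing the original identifier.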