xMAU is a single term that covers the various levels at which we capture Monthly Active Usage (MAU). xMAU encompasses Action (AMAU), Group (GMAU), Stage (SMAU), Combined (CMAU, duplicated user count across all Stages in a Section or across all Stages in the product), and Total (UMAU, unique user count). To give product groups a useful single metric that maps well to product-wide Key Performance Indicators, each xMAU metric cascades upwards in the order noted above.
xMAU metrics are derived from Service Ping (installation-level granularity) and the GitLab.com Postgres replica (event-level granularity). This workflow enables the analysis of each level of xMAU metric across various segments of customers and sets the foundation for reporting on Recorded, Estimated, and Predicted metrics.
Goal of this page
Explanations for the metrics below can be found on the Product Team Performance Indicator page:
Each metric has three different versions (Recorded, Estimated, Predicted), explained on the
Self-Managed and total SaaS xMAU are calculated using Service Ping, and paid SaaS xMAU is calculated using the gitlab.com db replica in Snowflake. Product Managers choose one specific Service Ping metric that they consider to be representative of using the given stage or group, and that metric is used to produce xMAU charts.
The current SSOT for the metric-to-xMAU mapping is the performance_indicator_type field of
the Service Ping metric .yml files, which are linked in the Service Ping Metrics Dictionary.
Updates to performance_indicator_type for a specific metric will propagate downstream to the
xMAU charts in Sisense and the internal handbook PI pages.
There should be a 1:1 mapping of Service Ping metrics to xMAU. We cannot dedupe users across distinct metrics, so mapping multiple metrics to a single group's GMAU, stage's SMAU, etc. will lead to double-counting.
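The double-counting risk can be illustrated with a minimal sketch. The metric names and user IDs below are hypothetical toy data, not real Service Ping metrics; the point is only that per-metric counts cannot be deduplicated once the user IDs are gone.

```python
# Hypothetical toy data: the users who triggered each of two metrics that
# were (incorrectly) both mapped to the same group's GMAU.
metric_users = {
    "metric_a": {"u1", "u2", "u3"},
    "metric_b": {"u2", "u3", "u4"},
}

# Service Ping reports a count per metric, not the underlying user IDs.
reported_counts = {name: len(users) for name, users in metric_users.items()}

# Downstream, the only option is to add the reported counts together.
summed_gmau = sum(reported_counts.values())

# The true unique count needs the user IDs, which are not in the payload.
true_gmau = len(set().union(*metric_users.values()))

print(summed_gmau, true_gmau)  # 6 4
```

In this toy case, users `u2` and `u3` are counted twice, so the summed figure overstates usage by two.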
The Product Intelligence group maintains the Service Ping Metric Dictionary, in addition to the following related documentation:
For every GitLab installation (self-managed and SaaS/GitLab.com), we use the last ping generated during the reporting period (i.e., calendar month) to calculate xMAU. Installations are randomly assigned a day of week to generate service pings, but that assignment is persistent over time. For example, if an installation is assigned Tuesdays to generate pings, it will always generate pings on Tuesdays. Since the day of week that pings are generated differs across installations, the exact date range captured in a 28-day counter will also differ. The "last ping of the month" methodology was updated in the TD xMAU 2.0 project to use the last ping created in the calendar month.
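As a rough sketch of the "last ping of the month" selection, assuming each ping is represented as a hypothetical `(installation_id, ping_created_at)` pair (these are not the actual model columns):

```python
from datetime import datetime

# Hypothetical pings: inst_a reports on Tuesdays, inst_b on Fridays.
pings = [
    ("inst_a", datetime(2022, 6, 7)),
    ("inst_a", datetime(2022, 6, 28)),
    ("inst_a", datetime(2022, 7, 5)),
    ("inst_b", datetime(2022, 6, 3)),
    ("inst_b", datetime(2022, 6, 24)),
]


def last_ping_per_month(pings):
    """Return {(installation_id, (year, month)): latest ping timestamp}."""
    latest = {}
    for installation_id, created_at in pings:
        key = (installation_id, (created_at.year, created_at.month))
        # Keep only the most recent ping created within the calendar month.
        if key not in latest or created_at > latest[key]:
            latest[key] = created_at
    return latest


monthly = last_ping_per_month(pings)
```

For June 2022, this keeps the June 28 ping for `inst_a` and the June 24 ping for `inst_b`, so the 28-day counters of the two installations cover different date ranges.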
For paid SaaS xMAU, we use the last 28 days of the calendar month. The difference between Service Ping-generated xMAU (Self-Managed and Total SaaS) and paid SaaS xMAU is covered in more detail below.
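One way to compute the "last 28 days of the calendar month" window is sketched below. The helper name is hypothetical, and treating both boundary dates as inclusive is an assumption; the production models may define the boundaries differently.

```python
import calendar
from datetime import date, timedelta


def last_28_days_of_month(year, month):
    """Return (start, end) of the final 28 days of a month, both inclusive."""
    last_day = calendar.monthrange(year, month)[1]  # number of days in month
    end = date(year, month, last_day)
    start = end - timedelta(days=27)  # 28 days, counting both endpoints
    return start, end


july_window = last_28_days_of_month(2022, 7)
february_window = last_28_days_of_month(2022, 2)
```

Under this assumption, the July 2022 window runs from July 4 through July 31, while February 2022 (a 28-day month) is covered in full.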
Paid xMAU is defined as Monthly Active Users on a Self-Managed installation or gitlab.com namespace with a paid plan type/tier. See Paid Stage Monthly Active Users - Paid SMAU as an example.
In order to determine if a self-managed installation or gitlab.com/SaaS namespace is paid, we use the plan type/tier, not the presence of ARR. Those on a paid plan type (ex: Premium, Ultimate, etc) are considered to be paid. This means that namespaces or installations belonging to an OSS or EDU program, internal project, or other subscription that has a paid plan type but does not contribute ARR are considered to be "paid".
Here are more specifics on this identification:
For Self-Managed installations, we use the edition field in the Service Ping payload, selecting
only service pings with an EEP, EES, or EEU edition. The edition value is derived from the plan
column in the license table in the CustomersDot database at the time the license was generated
(internal link).

To reiterate, we do not exclude EDU/OSS subscriptions from the paid xMAU calculations.
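The edition-based rule can be sketched as a simple filter. The record layout, the `program` field, and the non-paid edition values below are hypothetical; only the EEP, EES, and EEU editions come from the text above.

```python
# Editions treated as paid, as named in the text above.
PAID_EDITIONS = {"EEP", "EES", "EEU"}

# Hypothetical self-managed pings; field names are illustrative only.
pings = [
    {"installation_id": "inst_a", "edition": "EEU", "program": None},
    {"installation_id": "inst_b", "edition": "CE", "program": None},
    {"installation_id": "inst_c", "edition": "EEP", "program": "OSS"},
    {"installation_id": "inst_d", "edition": "EE Free", "program": None},
]


def is_paid(ping):
    # Paid status depends only on the edition, never on ARR or on
    # membership in a program such as OSS or EDU.
    return ping["edition"] in PAID_EDITIONS


paid_ids = [p["installation_id"] for p in pings if is_paid(p)]
```

Here `inst_c` counts as paid even though it belongs to an OSS program, because the rule looks only at the edition.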
We have 2 main data sources for calculating xMAU and paid xMAU: the Versions App (Service Ping) and the GitLab.com Postgres database. The table below summarizes which data source is used for each calculation.
| Delivery | Total xMAU | Paid xMAU |
|---|---|---|
| Self-Managed | Versions App | Versions App |
| SaaS | Versions App | Gitlab.com Postgres Replica |
For total SaaS xMAU, we use the Service Ping payloads generated for the GitLab.com production
installation. These payloads are easily identifiable since they are linked to an instance with
uuid = 'ea8bf810-1d6f-4a6a-b4fd-93e8cbd8b57f' AND host_name = 'gitlab.com'. (Note: uuid is
synonymous with dim_instance_id in our data models). You can also simply filter on
dim_installation_id = '8b52effca410f0a380b0fcffaa1260e7', which is unique to the gitlab.com
production installation. These filters ensure that data from non-production gitlab.com
installations (ex: staging.gitlab.com) is not included in total SaaS xMAU or PIs.
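A minimal sketch of the production filter follows. The `uuid` and `host_name` values for gitlab.com are taken from the text above; the dictionary layout and the non-production rows are hypothetical examples.

```python
GITLAB_COM_UUID = "ea8bf810-1d6f-4a6a-b4fd-93e8cbd8b57f"


def is_gitlab_com_production(ping):
    """True only for pings from the gitlab.com production installation."""
    return ping["uuid"] == GITLAB_COM_UUID and ping["host_name"] == "gitlab.com"


# Hypothetical pings: one production row and two rows that must be excluded.
pings = [
    {"uuid": GITLAB_COM_UUID, "host_name": "gitlab.com"},
    {"uuid": GITLAB_COM_UUID, "host_name": "staging.gitlab.com"},
    {"uuid": "00000000-0000-0000-0000-000000000000", "host_name": "example.com"},
]

production_pings = [p for p in pings if is_gitlab_com_production(p)]
```

Both conditions are checked together, so a ping that matches on only one of the two fields is excluded.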
Since Service Ping is reported at the installation level, we cannot differentiate paid from total usage within the metrics themselves. For self-managed instances, we have the license and plan type, so it is easy to attribute a subset of pings to paid xMAU. However, since gitlab.com is a single installation reporting a single ping, we do not have a way to break down the aggregates by product tier, plan type, or namespace. As a work-around, we replicate Service Ping metrics using the GitLab.com Postgres replica tables. The challenge is that we can replicate only a subset of Service Ping metrics, namely the database metrics; we are not able to replicate Redis counters.
Therefore, only some metrics can be recreated using the Gitlab.com Postgres replica. That means that, for now, we are not able to calculate some of the Paid SaaS xMAU metrics like Monitor SMAU. Product Intelligence is actively working to find a way to replicate Redis counters (Epic here).
As mentioned above, there are 2 main data sources used for xMAU analysis: the Versions App (Service Ping) and the GitLab.com Postgres replica.
One of our goals is to create a single model that easily provides all the data needed for reporting and analysis. As we continue to iterate on our solutions, we know that some information will not always be available in this model. This is where understanding the Entity Relationship Diagram helps: it shows which tables are joined to create the layer you are accessing, which is most useful when you want to dive deeper and gain additional insight.
It can also be helpful to look at the data model lineages in dbt (for example, the lineage for
prep_event).

We have built a suite of data marts that allow users to explore our different product data
sources. "mart" models are a combination of dimensions and facts that are joined together to
enable easy analysis. "rpt" ("report") models are built with specific business logic for a
specific use case. (Ex: rpt_ping_metric_totals_w_estimates_monthly has custom logic to
generate xMAU estimations.) Underneath each mart or reporting model is a clean lineage of
dimensions and facts that can also be used for analysis. This list is limited to the key marts
designed for stakeholders to do everyday analysis and reporting. You can read more about
GitLab's Enterprise Dimensional Model (EDM) here.
| Data Mart/Rpt Name | Grain* | Source |
|---|---|---|
| mart_ping_instance | Service Ping Instance ID | Versions App |
| mart_ping_instance_metric | Service Ping Instance ID, Metrics Path | Versions App |
| mart_ping_instance_metric_monthly | Service Ping Instance ID, Metrics Path (limited to the last ping of the month per installation) | Versions App |
| rpt_ping_metric_first_last_versions | Ping Edition, Metrics Path | Versions App |
| rpt_ping_latest_subscriptions_monthly | Month, Subscription, Installation (if available) | Versions App |
| rpt_ping_metric_totals_w_estimates_monthly | Reporting Month, Metrics Path, Estimation Grain, Ping Edition Product Tier, Service Ping Delivery Type | Versions App |
| mart_event_valid | Event (atomic-level model) | Gitlab.com Postgres Replica |
| mart_event_user_daily | Event Name, Event Date, User ID, Ultimate Parent Namespace ID | Gitlab.com Postgres Replica |
| mart_event_namespace_daily | Event Name, Event Date, Ultimate Parent Namespace ID | Gitlab.com Postgres Replica |
| rpt_event_plan_monthly | Reporting Month, Plan ID at Event Date, Event Name | Gitlab.com Postgres Replica |
| rpt_event_xmau_metric_monthly | Reporting Month, User Group (total, free, paid), Section Name, Stage Name, Group Name | Gitlab.com Postgres Replica |
* Please see the linked dbt docs for information about each specific model, applied business logic, etc.
common_mart.mart_ping_instance_metric
is the most comprehensive of the Service Ping data marts. (Note: unfiltered Service Ping data
sets are available in the common schema). This data model provides ping- and metric-level
data, and joins the Service Ping data with financial and GTM data sources such as subscription,
CRM Account, etc. This model also includes flags related to a metric's time period and whether
it is currently mapped to xMAU.
This mart allows users to retrieve usage data for 7-day, 28-day, and all-time metrics. Read more about metric time frames here.
common_mart_product.rpt_ping_metric_totals_w_estimates_monthly
is a customized model designed for monthly Service Ping-generated xMAU and PI reporting,
including estimated uplift.
End-users can then use very simple queries to produce xMAU and PI visualizations:
```sql
SELECT
  ping_created_at_month,
  ping_delivery_type,
  ping_product_tier,
  SUM(recorded_usage)            AS recorded_usage,
  SUM(estimated_usage)           AS estimated_usage,
  SUM(total_usage_with_estimate) AS total_usage_with_estimate
FROM common_mart_product.rpt_ping_metric_totals_w_estimates_monthly
WHERE is_smau = TRUE
  AND stage_name = 'create'
  AND estimation_grain = 'metric/version check - subscription based estimation'
GROUP BY 1, 2, 3
ORDER BY 1, 2, 3
```
This model also enables easy comparison of one estimation methodology vs another (referred to
as estimation_grain in the model). At the time of rollout in July 2022, 4 different
methodologies will be available in this model, with options to add more in the future. A more
detailed explanation of each estimation methodology is available on this page.
common_mart.mart_event_valid
is an atomic-/event-level model which has been enhanced for ease of analysis. It incorporates basic
business logic* that removes potentially misleading data (ex: events from blocked users) and is
flexible enough to allow the end user to aggregate and dedupe data, as desired.
*Please see dbt docs for full details on business logic
common_mart_product.rpt_event_xmau_metric_monthly
is a customized model designed for monthly paid SaaS xMAU reporting. This model provides user
counts at the xMAU metric-level (which is not necessarily synonymous with the event-level),
limited to the appropriate time frame (last 28 days of the month).
Please check out the Product Manager Toolkit for more information on how to use xMAU-related snippets.
Due to the sensitive nature of metrics like user counts, PI charts are not publicly accessible and must reside in the internal handbook. However, this data is not considered to be SAFE and therefore is visible to all GitLab team members and is available in the general GitLab space in Sisense.
Some data supporting xMAU Analysis is classified as Orange or Yellow. This includes Orange customer metadata from the account, contact data from Salesforce and Zuora, and GitLab's non-public financial information, none of which should be publicly available. Take care when sharing data from this dashboard to ensure that the detail stays within GitLab as an organization and that appropriate approvals are given for any external sharing. In addition, when working with row- or record-level customer metadata, always avoid saving any data on personal devices or laptops. This data should remain in Snowflake and Sisense and should ideally be shared only through those applications unless otherwise approved.
Orange
- dim_billing_account
- dim_crm_account
- mart_ping_instance

System Owners
- TBD
- TBD

Please add feedback to this issue