This page contains forward-looking content and may not accurately reflect current-state or planned feature sets or capabilities.
Public companies need to reliably and predictably share key financial, customer, and growth metrics as well as analyze lead-to-cash and product idea-to-adoption processes to continually improve business performance. These activities are supported by capabilities defined in Level 2 of the Data Capability Model. To provide a realistic example and to serve as a reference for future development, this page presents the Level 2 Data Solution for 'Product Geolocation Analysis'.
Understanding where your product is used around the world is an important step towards developing a more complete understanding of your customers, your product's global reach, and related location-aware insights.
This data solution delivers three Self-Service Data capabilities:
From a Data Platform perspective, the solution delivers:
dim_country table
Finally, this is the long-term automated solution for several ad-hoc issues completed over the past year, including:
The Self-Service Data Certificate program is based on the Learning and Development Certification program. The Self-Service Data program provides individual Certificates for each subject-oriented Dashboard Developer or SQL Developer Knowledge Assessment successfully completed. Links to the Knowledge Assessments are located in the appropriate sections below.
@rparker2
@jeromezng
@rparker2
Dashboard | Purpose |
---|---|
Worldwide Product Growth | Visualize the adoption of GitLab by country, region, and time. |
Data Health Dashboard for Geolocation Data | Data Health of the Geolocation data used to support this solution. |
Data Space | Description |
---|---|
Global | Contains a data model with a 1:1 relationship to the Product Geolocation Analysis model detailed below. |
To receive a Certificate, you will need to earn 100% on the Self-Service Dashboard Developer Knowledge Assessment and upload a screenshot of your new dashboard when prompted. Upon completion of the Knowledge Assessment, you will be emailed your responses and this email will serve as your Certificate.
Diagram/Entity | Grain | Purpose | Keywords |
---|---|---|---|
Product Geolocation Analysis | Activity By Day | Dimensions and Facts that can be used to analyze GitLab usage by country, territory, and time. | dim_date, dim_country, fct_country_activity_by_day |
dim_date | Day | Central dimension for all dates. | |
dim_country | ISO_Country | Central dimension for all countries and territories, sourced from ISO-3166 and GitLab Sales Territories. | |
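The relationship between the three entities in the table above can be illustrated with a small, self-contained sketch. It uses Python's built-in `sqlite3` module purely for illustration. The column sets are minimal assumptions taken from the names used in the SQL examples on this page (`date_key`, `country_key`, `namespace_key`, `num_page_views`, `country_name`, `reporting_region`, `year`), and the sample rows are made up. The production models are built by dbt and are not defined this way.

```python
import sqlite3

# Hypothetical, minimal column sets; the real models carry more attributes.
SCHEMA = """
CREATE TABLE dim_date (date_key TEXT PRIMARY KEY, year INTEGER);
CREATE TABLE dim_country (
    country_key INTEGER PRIMARY KEY,
    country_name TEXT,
    reporting_region TEXT
);
CREATE TABLE fct_country_activity_by_day (
    date_key TEXT REFERENCES dim_date (date_key),
    country_key INTEGER REFERENCES dim_country (country_key),
    namespace_key INTEGER,
    num_page_views INTEGER
);
"""


def build_sample_model() -> sqlite3.Connection:
    """Create the three tables in memory and load a few made-up rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.executemany(
        "INSERT INTO dim_date VALUES (?, ?)",
        [("2020-01-01", 2020), ("2020-01-02", 2020)],
    )
    conn.executemany(
        "INSERT INTO dim_country VALUES (?, ?, ?)",
        [(1, "Canada", "NORAM"), (2, "Japan", "APAC")],
    )
    conn.executemany(
        "INSERT INTO fct_country_activity_by_day VALUES (?, ?, ?, ?)",
        [
            ("2020-01-01", 1, 10, 5),
            ("2020-01-02", 1, 11, 7),
            ("2020-01-01", 2, 12, 3),
        ],
    )
    return conn


def page_views_by_country(conn: sqlite3.Connection) -> list:
    """Roll the daily fact up to one row per country."""
    return conn.execute(
        """
        SELECT dc.country_name, SUM(f.num_page_views) AS page_views
        FROM fct_country_activity_by_day f
        JOIN dim_country dc ON f.country_key = dc.country_key
        GROUP BY dc.country_name
        ORDER BY dc.country_name
        """
    ).fetchall()


conn = build_sample_model()
print(page_views_by_country(conn))
```

The fact table holds only keys and measures, so any country attribute (name, reporting region) is reached through the join to `dim_country`, and any calendar attribute through `dim_date`.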
All production SQL in Sisense or dbt must adhere to our SQL Style Guide for legibility and maintainability.
-- Daily page views by country for the NORAM reporting region
SELECT
  f.date_key AS date_key,
  dc.country_name AS country_name,
  SUM(f.num_page_views) AS number_of_page_views
FROM fct_country_activity_by_day f
JOIN dim_country dc
  ON f.country_key = dc.country_key
WHERE dc.reporting_region = 'NORAM'
GROUP BY f.date_key, dc.country_name;
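-- Top 100 namespaces per country by total page views in 2020 (assumed intent of the original PARTITION BY ... LIMIT 100)
-- Assumes dim_date joins to the fact table on date_key
SELECT country_name, namespace_key, number_of_total_page_views
FROM (
  SELECT
    dc.country_name AS country_name,
    f.namespace_key AS namespace_key,
    SUM(f.num_page_views) AS number_of_total_page_views,
    ROW_NUMBER() OVER (PARTITION BY dc.country_name ORDER BY SUM(f.num_page_views) DESC) AS country_rank
  FROM fct_country_activity_by_day f
  JOIN dim_country dc
    ON f.country_key = dc.country_key
  JOIN dim_date dd
    ON f.date_key = dd.date_key
  WHERE dd.year = 2020
  GROUP BY dc.country_name, f.namespace_key
) ranked
WHERE country_rank <= 100;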
To receive a Certificate, you will need to earn 100% on the Self-Service SQL Developer Knowledge Assessment and upload a screenshot of your new SQL statement when prompted. Upon completion of the Knowledge Assessment, you will be emailed your responses and this email will serve as your Certificate.
The overall solution adheres to our Enterprise Dimensional Model guidelines.
See dbt documentation for a complete lineage graph.
The dbt solution generates a dimensional model from RAW source data.
Validation | Expected Result |
---|---|
1 | Total number of countries mapped does not exceed 300. |
2 | Percentage of traffic from APAC is not greater than AMER. |
3 | More than 40,000 new fct_country_activity_by_day rows are added each day. |
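As a rough illustration of how the three validations above could be automated, the sketch below runs them against an in-memory SQLite database. It is not the production implementation. It assumes that traffic is measured by `num_page_views`, that `dim_country.reporting_region` carries the labels `APAC` and `AMER`, and that "rows added by day" means rows per `date_key`. The function name `run_validations` and the sample rows are hypothetical.

```python
import sqlite3


def run_validations(conn, max_countries=300, min_rows_per_day=40_000):
    """Return a dict mapping each validation number to True (pass) or False (fail)."""
    # Validation 1: number of countries mapped.
    country_count = conn.execute("SELECT COUNT(*) FROM dim_country").fetchone()[0]

    # Validation 2: page views per reporting region. Comparing raw totals is
    # equivalent to comparing percentages of the same overall total.
    views = dict(
        conn.execute(
            """
            SELECT dc.reporting_region, COALESCE(SUM(f.num_page_views), 0)
            FROM fct_country_activity_by_day f
            JOIN dim_country dc ON f.country_key = dc.country_key
            GROUP BY dc.reporting_region
            """
        ).fetchall()
    )

    # Validation 3: smallest number of fact rows present for any single day.
    min_daily_rows = conn.execute(
        """
        SELECT MIN(row_count) FROM (
            SELECT COUNT(*) AS row_count
            FROM fct_country_activity_by_day
            GROUP BY date_key
        ) AS daily
        """
    ).fetchone()[0] or 0

    return {
        1: country_count <= max_countries,
        2: views.get("APAC", 0) <= views.get("AMER", 0),
        3: min_daily_rows > min_rows_per_day,
    }


# Made-up sample data, far smaller than production volumes.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE dim_country (country_key INTEGER, country_name TEXT, reporting_region TEXT);
    CREATE TABLE fct_country_activity_by_day (
        date_key TEXT, country_key INTEGER, namespace_key INTEGER, num_page_views INTEGER
    );
    INSERT INTO dim_country VALUES (1, 'Canada', 'AMER'), (2, 'Japan', 'APAC');
    INSERT INTO fct_country_activity_by_day VALUES
        ('2020-01-01', 1, 10, 5), ('2020-01-01', 2, 12, 3),
        ('2020-01-02', 1, 11, 7), ('2020-01-02', 2, 12, 1);
    """
)

# The row threshold is lowered here only because the sample has two rows per day.
print(run_validations(conn, min_rows_per_day=1))
```

With the default threshold of 40,000 rows, the third check fails on this tiny sample, which is the intended behaviour when a daily load is unexpectedly small.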
Validation | Expected Result |
---|---|
1 | New usage_ping data has been uploaded in last 14 days. |
2 | Total # of accounts represented by usage_ping data >= expected result. |
3 | Total # of accounts represented by snowplow data >= expected result. |
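The source-data checks above reduce to a freshness test and two threshold comparisons. The sketch below shows one possible shape for that logic, under stated assumptions: the helper names `is_fresh` and `meets_expected` are hypothetical, the latest upload timestamp and account counts would come from queries not shown here, and the expected account totals are left as parameters because this page does not state them.

```python
from datetime import datetime, timedelta, timezone


def is_fresh(latest_upload: datetime, now: datetime, max_age_days: int = 14) -> bool:
    """True when the newest upload is no older than max_age_days."""
    return now - latest_upload <= timedelta(days=max_age_days)


def meets_expected(actual_accounts: int, expected_accounts: int) -> bool:
    """True when the observed account count reaches the expected minimum."""
    return actual_accounts >= expected_accounts


# Made-up timestamps for illustration only.
now = datetime(2021, 1, 15, tzinfo=timezone.utc)
recent_upload = datetime(2021, 1, 2, tzinfo=timezone.utc)   # 13 days old
stale_upload = datetime(2020, 12, 31, tzinfo=timezone.utc)  # 15 days old

print(is_fresh(recent_upload, now))
print(is_fresh(stale_upload, now))
```

Passing `now` explicitly rather than reading the clock inside the function keeps the check deterministic and easy to test.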