This is to demo work currently done as part of data team's effort to develop EDW using dimensional modeling.
Dimensional modeling is part of the Business Dimensional Lifecycle methodology developed by Ralph Kimball which includes a set of methods, techniques and concepts for use in data warehouse design.
a logical design technique that seeks to present the data in a standard, intuitive framework that allows for high-performance access
Dimensional Modeling is business process oriented and can be built in 4 steps:
Dimensional modeling always uses the concepts of facts (measures), and dimensions (context). Facts are typically (but not always) numeric values that can be aggregated, and dimensions are groups of hierarchies and descriptors that define the facts.
In the simplest version fact table is a central table and is linked to dimensional tables with foreign keys creating a star schema. Star schema with dimensional tables linking to more dimensional tables are called snowflake schemas, multi fact tables schemas are called galaxies.
OKR: Reporting on both ARR and Customer counts entirely supported in a Kimball Dimensional Warehouse
The graphical schema of dimensional model developed for calculating ARR/ Customer count
More on use and conventions of Dimensional Modeling in Data Team Handbook here
Fct_ and dim_ dbt models are created in
COMMON schema indicating that they are not source related and create basis to build data marts.
In Snowflake they can be found in
The advantage of dimensional schema is ease of querying and building data marts.
Data mart is where the dimensions labels selection and aggregation can happen.
SELECT COALESCE(dim_customers.merged_to_account_id , dim_customers.CRM_ID) AS customer_id, subscription_name_slugify, dateadd('month',-1,dim_dates.first_day_of_month) as mrr_month, mrr FROM ANALYTICS.ANALYTICS_STAGING.dim_dates INNER JOIN ANALYTICS.ANALYTICS_STAGING.fct_charges ON fct_charges.effective_start_date_id <= dim_dates.date_id AND fct_charges.effective_end_date_id > dim_dates.date_id AND dim_dates.day_of_month=1 INNER JOIN ANALYTICS.ANALYTICS_STAGING.fct_invoice_items_agg ON fct_charges.charge_id = fct_invoice_items_agg.charge_id INNER JOIN ANALYTICS.ANALYTICS_STAGING.dim_subscriptions ON dim_subscriptions.SUBSCRIPTION_ID = fct_charges.subscription_id INNER JOIN ANALYTICS.ANALYTICS_STAGING.dim_customers ON dim_customers.crm_id = dim_subscriptions.crm_id;