This handbook page is intended to provide a high-level overview of the most common data models used by the People Analytics team. If you are in need of using People Data for dashboard development and reporting, this page will help you determine the best models to use. If you have any questions please don't hesitate to reach out to someone from the team for help.
DBT Docs - This resource contains comprehensive documentation on all available dbt models. This is a great starting point to understanding our models. For specific People Models, please reference the Commonly Used Data Models section below for a starting point.
Definitive guides to data subject areas managed by the Data team.
Documentation on data pipelines for the technically curious analyst. This page goes into each data source and extraction details.
People Group Tech Stack Guide for overview of all the integrations that go into and out of our HR systems and all the tools we use.
BambooHR - BambooHR was the HRIS used by GitLab until 2022-06-16
Workday - Workday is the HRIS used by GitLab since 2022-06-16
Employee ID - Unique number that has been assigned to each individual team member within the company
Team Member - Full time individuals working for GitLab. Team members can be hired through an entity, PEO or direct contract. Individuals are considered team members in the warehouse when on an indefinite term contract with GitLab. Temporary Service Providers or Contingent Workers (working for GitLab for a limited period of time) are not tracked in Workday, therefore their records don't exist in Snowflake
Organization - An organization refers to a grouping used to organize team members or other organizations
Team ID - The unique organization identifier
Team Superior ID Organization's superior team ID
Department - The second level of the GitLab organization hierarchy
Division - The top level of the GitLab organization hierarchy
Cost Center - The reported top level cost center for the team member
BambooHR was used by GitLab to store all team member information in one central location.
GitLab’s ATS (Applicant Tracking System). All hiring managers and interviewers will use Greenhouse to review resumes, provide feedback, communicate with candidates, and more. Please refer to the hiring handbook for additional information.
A slack application used to request, track, approve and analyze time off.
Workday is GitLab's current central HRIS.
The application we use to conduct surveys within GitLab.
The Data Team is working with the People Analytics team to build data marts that allow users to explore our different people data sources
Underneath each mart or reporting model is a clean lineage of dimensions and facts that can also be used for analysis. This list included all the prep tables, dimensions and facts that have been created for people analysis as well as the ones that we are planning to build in the upcoming quarters based on the Team Member Common ERD
Model Name | Table Type | Grain | Source | Status |
---|---|---|---|---|
prep_team_member | Staging | One row per Team Member ID per event | Workday | Completed |
dim_team | Dimension | One row per Team ID per event | Workday | Completed |
dim_team_member | Dimension | One row per Team Member ID per event | Workday | Completed |
dim_locality | Dimension | Workday | Planned | |
fct_team_member_position | Fact | Workday | Planned | |
fct_team_member_status | Fact | Workday | Planned | |
fct_team_status | Fact | Workday | Planned | |
fct_team_member_locality | Fact | Workday | Planned | |
fct_team_demographic | Fact | Workday | Planned | |
fct_team_member_absence | Fact | Workday | Planned | |
fct_team_absence | Fact | Workday | Planned |
This table contains team members work and personal information. Sensitive columns are masked using dynamic masking and the fields are only visible by team members with the analyst_people role assigned in Snowflake. This table is a hybrid SCD (Type 1 + Type 2).
The table includes information regarding current team members, new hires who have records created in Workday before their start date and team members who were terminated in 2021 onwards. Team members who were terminated before 2021 are not captured in this model at this time. The grain of this table is one row per employee_id per valid_to/valid_from combination.
SELECT
region,
COUNT(DISTINCT employee_id)
FROM
PROD.COMMON.DIM_TEAM_MEMBER
WHERE
is_current = TRUE AND is_current_team_member = TRUE
GROUP BY
region
SELECT
COUNT(DISTINCT employee_id)
FROM
PROD.COMMON.DIM_TEAM_MEMBER
WHERE
is_current = TRUE AND is_current_team_member = TRUE
SELECT
key_talent_status,
COUNT(*) * 100 / SUM(COUNT(*)) OVER() AS key_talent_percentage
FROM
PROD.COMMON.DIM_TEAM_MEMBER
WHERE
is_current = TRUE AND is_current_team_member = TRUE
GROUP BY 1
dim_team contains team (organizations) information. It includes information regarding teams and their hierarchy. It is a Type 2 SCD.
The goal of this table is to determine the team superior organization and the hierarchy of every organization in Workday. Additionally, this table provides team data such as team name, manager, team member count and date of inactivation for each organization.
The grain of this table is one row per Team ID per event.
SELECT
COUNT(*)
FROM
PREP.SENSITIVE.DIM_TEAM
WHERE
is_current = TRUE AND is_team_active = TRUE
SELECT
SUM(team_members_count)
FROM
PREP.SENSITIVE.DIM_TEAM
WHERE
is_current = TRUE AND is_team_active = TRUE
Legacy models are models we will be transitioning from at some point but are still using for reporting.
Database | Schema | Table Name | Data Grain | Description | Notes |
---|---|---|---|---|---|
prod | legacy | employee_directory_analysis | employee_id by date_actual |
Gives the current state of the employees at GitLab at any point of time. This is the model to use for headcount, team size, or any people-related analysis for employees. This has current and past employees, as well as their department, division, and cost center and hire/termination dates. | |
prod | legacy | bamboohr_rpt_headcount_aggregation | department , division , eeoc_value |
This report creates out a headcount report from the bamboohr_headcount_intermediate to be used for Sisense dashboards for each month. | The division reporting is based on current division used. |
prep | sensitive | employee_directory_intermediate | employee_id by date_actual |
INCLUDES SENSITIVE INFORMATION. The master collection of all info about GitLab employees for their latest position. | |
prep | sensitive | bamboohr_employment_status_xf | employee_id by valid_from_date |
This model provides a transaction record of an employee's status changes (i.e. active, leave, termed). It helps identify when an employee is re-hired, and provides termination type | |
prep | sensitive | bamboohr_promotions_xf | employee_id by promotion_date and compensation_sequence |
This model identifies all individuals that were promoted and the compensation change associated to the promotion. The total compensation change is equal to the change in compensation (from bamboohr_compensation model) times the pay frequency and currency conversion at time of promotion + change in OTE(USD) at time of promotion. In the case the team member is hourly, we use the bamboohr_currency_conversion table. | |
prep | sensitive | bamboohr_id_employee_number_mapping | employee_id |
This model is the canonical source mapping bamboo employee numbers with employee IDs. It includes all employees for all time. The model also captures demographic information, and removes test accounts. | |
prep | sensitive | bamboohr_separations | employee_id |
Provides a report of all separated team members. | |
prep | sensitive | workday_terminations | employee_id |
Provides the termination reason, and exit impact to allow the People Analytics team to accurately report on termination data | |
prep | workday | blended_directory_source | employee_id by uploaded_at and source_system |
Daily upload of employee data used for downstream models. | Helpful source for auditing any data issues in Snowflake |
Database | Schema | Table Name | Data Grain | Description | Notes |
---|---|---|---|---|---|
prod | workspace_people | rpt_hires | application_id |
This is a report specifically for talent acquisition and counts accepted offers as hires. | |
prep | greenhouse | greenhouse_application_stages_source | application_id by stage_id and stage_entered_on |
This table is Historical activity of all stages an application can be in Each row represents a stage that an application can be in, and the timestamp that the application entered and exited the stage Things to note: This table contains a row for each stage that an application can be in (taken from the job that the application is on). Thus, there may be rows for stages that an application has yet to reach, or will not reach (if the application was rejected). | |
prep | sensitive | greenhouse_recruiting_xf | application_id |
This is shows all applications submitted and ties in offer data, greenhouse department (the associated division), and source data to be in 1 place. |