Requests to expedite responses, triage issues, or review MRs are rare. Given the Data Team's shared-service model, expediting an item means asking the team to de-prioritize other work. To request an expedited response:
The data team's priorities come from our OKRs. We do our best to service as many of the requests from the organization as possible. You know that work has started on a request when it has been assigned to a milestone. Please communicate in the issue about any pressing priorities or timelines that may affect the data team's prioritization decisions. Please do not DM a member of the data team asking for an update on your request. Please keep the communication in the issue.
The data team, like the rest of GitLab, works hard to document as much as possible. We believe this framework for types of documentation from Divio is quite valuable. For the most part, what's captured in the handbook are tutorials, how-to guides, and explanations, while reference documentation lives within the primary analytics project. We aspire to tag our documentation with the appropriate function and to clearly articulate the assumed audience for each piece of documentation.
As a central shared service with finite time and capacity and with a responsibility to operate and develop the company's central Enterprise Data Warehouse, the Data Team must focus its time and energy on initiatives that will yield the greatest positive impact to the overall global organization towards improving customer results.
The Data Team uses a Value Calculator to quantify the business value of new initiatives (issue, epic, OKR, strategic project) to enable prioritization and ranking of the Data Team development queue. The Value Calculator provides a uniform and transparent mechanism for ranking and enables all work to be evaluated on equal terms. The value calculator approach is similar to the RICE Scoring Model for Product Managers and the Demand Metric Prioritization Model for Marketing.
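The Value Calculator's own formula is not given here, so the sketch below illustrates the RICE Scoring Model it is compared to, not the Data Team's calculator. RICE scores an initiative as reach times impact times confidence, divided by effort. The `Initiative` class and the sample values are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Initiative:
    """Hypothetical container for one candidate piece of work."""
    name: str
    reach: float       # people or events affected per period
    impact: float      # relative impact multiplier
    confidence: float  # 0.0 to 1.0
    effort: float      # person-months, must be positive


def rice_score(item: Initiative) -> float:
    """Standard RICE formula: (reach * impact * confidence) / effort."""
    if item.effort <= 0:
        raise ValueError("effort must be positive")
    return item.reach * item.impact * item.confidence / item.effort


def rank(items: list[Initiative]) -> list[Initiative]:
    """Return initiatives ordered from highest to lowest score."""
    return sorted(items, key=rice_score, reverse=True)


if __name__ == "__main__":
    # Illustrative values only.
    queue = [
        Initiative("dashboard fix", reach=50, impact=1, confidence=1.0, effort=1),
        Initiative("new data source", reach=300, impact=2, confidence=0.5, effort=4),
    ]
    for item in rank(queue):
        print(f"{item.name}: {rice_score(item):.1f}")
```

Because every initiative passes through the same function, the resulting order is uniform and reproducible, which is the property the Value Calculator is described as providing.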
Every day in Data brings a new challenge or opportunity. However, the Data Team strives to spend the majority of its time developing and operating the Enterprise Data Warehouse and related systems, keeping fresh data flowing through the system, regularly expanding the breadth of data available for analysis, and delivering high-impact strategic projects. Our standing priorities are listed below.
|Rank||Priority||Description||Allocation We Aspire To|
|1||Production Operations||Activities required to maintain efficient and reliable data services, including triage, bug fixes, and patching to meet established Service Level Objectives.||10-20%, though will fluctuate as driven by incident frequency and complexity|
|2||Data Team OKRs||The Data Team identifies 3-4 strategic-level OKRs per quarter, primarily focused on core infrastructure and data development that will be beneficial to the entire company, aligned with CEO and Finance OKRs.||60-75%, though this will fluctuate as driven by larger Functional Team OKRs|
|3||Other work||Other work, including Functional Team OKRs, as prioritized and ranked using the Value Calculator.||15-25%|
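As a rough illustration of the aspirational allocations in the table, the sketch below converts each percentage range into a range of points for one milestone. The 40-point capacity and the `ALLOCATION` and `point_ranges` names are hypothetical.

```python
# Aspirational allocation ranges from the table, as (low, high) fractions.
ALLOCATION = {
    "Production Operations": (0.10, 0.20),
    "Data Team OKRs": (0.60, 0.75),
    "Other work": (0.15, 0.25),
}


def point_ranges(capacity_points: int) -> dict[str, tuple[float, float]]:
    """Translate each percentage range into a range of milestone points."""
    return {
        priority: (round(low * capacity_points, 1), round(high * capacity_points, 1))
        for priority, (low, high) in ALLOCATION.items()
    }


if __name__ == "__main__":
    # 40 points is an assumed team capacity, used only for illustration.
    for priority, (low, high) in point_ranges(40).items():
        print(f"{priority}: {low} to {high} points")
```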
We use scoped labels in GitLab to track our issues across these priorities.
In rare situations, established SLOs do not meet turnaround needs, and in these cases the Data Team provides an expedited response capability. The Data Team will provide a date estimate if an expedited request cannot be handled per the expedite response SLO.
The calculator below is based on the following Value Calculator spreadsheet. Please select the values below to define the value of new work.
The Data Team OKRs aspire to align with Business Operations OKRs, Finance Division OKRs, and CEO OKRs, thereby aligning with the OKRs of the Divisions we support. Due to the nature of the technical and data infrastructure work required to develop and operate an Enterprise Data Warehouse, this will not always be the case.
At the beginning of a FQ, the Data Team will outline all actions that are required to succeed with our KRs and in helping other teams measure the success of their KRs. The best way to do that is via a team brain dump session in which everyone lays out all the steps they anticipate for each of the relevant actions. This is a great time for the team to raise any blockers or concerns they foresee. These should be recorded for future reference.
These OKRs drive ~60% of the work that the central data team does in a given quarter. The remaining time is divided between urgent issues that come up and ad hoc/exploratory analyses.
Examples of OKR alignment in action include:
The data team currently works in two-week intervals, called milestones. Milestones start on Tuesdays and end on Mondays. This discourages last-minute merging on Fridays and allows the team to have milestone planning meetings at the top of the milestone.
Milestones may be three weeks long if they cover a major holiday or if the majority of the team is on vacation or at Contribute. As work is assigned to a person and a milestone, it gets a weight assigned to it.
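A minimal sketch of the milestone calendar described above, assuming a milestone runs two weeks (or three when extended) from a Tuesday start through a Monday end. The function name and the sample date are hypothetical.

```python
from datetime import date, timedelta

TUESDAY = 1  # date.weekday(): Monday is 0, Tuesday is 1


def milestone_window(start: date, weeks: int = 2) -> tuple[date, date]:
    """Return (start, end) for a milestone that starts on a Tuesday.

    The end date is the Monday that closes the final week.
    """
    if start.weekday() != TUESDAY:
        raise ValueError("milestones start on Tuesdays")
    if weeks not in (2, 3):
        raise ValueError("milestones are two weeks long, or three when extended")
    end = start + timedelta(days=7 * weeks - 1)
    return start, end


if __name__ == "__main__":
    first_day, last_day = milestone_window(date(2024, 1, 2))
    print(first_day.strftime("%A %Y-%m-%d"), "to", last_day.strftime("%A %Y-%m-%d"))
```

Ending on a Monday means the final working day of a milestone is never a Friday, which matches the stated goal of discouraging last-minute Friday merges.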
Milestone planning should take into consideration:
The milestone planning is owned by the Manager, Data.
The timeline for milestone planning is as follows:
|Day||Current Milestone||Next Milestone|
|0 - 1st Wednesday||Milestone Start|| |
|7 - 2nd Tuesday||Midpoint. Any issues that are at risk of slipping from the milestone must be raised by the assignee.|| |
|10 - 2nd Friday||The last day to submit MRs for review. MRs must include documentation and testing to be ready to merge. No MRs are to be merged on Fridays.||Milestone is roughly final. Milestone Planner verifies issue priority and team capacity for next milestone.|
|13 - 2nd Monday||Last day of Milestone. Ready MRs can be merged.|| |
|14 - 2nd Tuesday||Meeting Day. All unfinished issues either need to be removed from milestones or rolled to the next. Scheduled DE meeting with a tactical discussion of the work to be completed next Milestone. Stakeholders and submitters are updated with what will or won't be added to the next milestone.|| |
The short-term goal of this process is to improve our ability to plan and estimate work through better understanding of our velocity. In order to successfully evaluate how we're performing against the plan, any issues not raised at the T+7 mark should not be moved until the next milestone begins.
During the milestone planning process, we point issues. Then we pull into the milestone the issues expected to be completed in the timeframe. Points are a good measure of consistency, as milestone over milestone should share an average. Then issues are prioritized according to these categories.
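One way to picture this step, as a sketch only: estimate capacity from the average points completed in recent milestones, then pull issues in priority order until that capacity is used. The `Issue` class, the greedy selection, and the sample numbers are assumptions for illustration, not the team's documented procedure.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass(frozen=True)
class Issue:
    """Hypothetical issue record: a lower priority number is more important."""
    title: str
    points: int
    priority: int


def average_velocity(completed_points: list[int]) -> float:
    """Average points completed per milestone over recent history."""
    return mean(completed_points)


def plan_milestone(backlog: list[Issue], capacity: float) -> list[Issue]:
    """Greedily pull issues by priority while they still fit the capacity."""
    planned: list[Issue] = []
    used = 0
    for issue in sorted(backlog, key=lambda i: i.priority):
        if used + issue.points <= capacity:
            planned.append(issue)
            used += issue.points
    return planned


if __name__ == "__main__":
    # Illustrative history and backlog.
    capacity = average_velocity([30, 34, 32])
    backlog = [
        Issue("new data source", points=13, priority=1),
        Issue("pipeline refactor", points=8, priority=2),
        Issue("dashboard rebuild", points=13, priority=3),
        Issue("docs update", points=5, priority=4),
    ]
    for issue in plan_milestone(backlog, capacity):
        print(issue.title, issue.points)
```

If the average drifts noticeably from one milestone to the next, that is a signal that pointing or planning has become inconsistent.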
Issues are not assigned to individual members of the team, except where necessary, until someone is ready to work on them. Work is not assigned and then managed into a milestone. Every person works on the top priority issue for their job type. As that issue is completed, they can pick up the next highest priority issue. People will likely be working on no more than 2 issues at a time.
Data Engineers will work on Infrastructure issues.
There is a demo of what this proposal would look like in a board.
This approach has many benefits, including:
There are three general types of issues:
Not all issues will fall into one of these buckets but 85% should.
Some issues may need a discovery period to understand requirements, gather feedback, or explore the work that needs to be done. Discovery issues are usually 2 points.
Introducing a new data source requires a heavy lift of understanding that new data source, mapping field names to logic, documenting those, and understanding what issues are being delivered. Usually introducing a new data source is coupled with replicating an existing dashboard from the other data source. This helps verify that numbers are accurate and the original data source and the data team's analysis are using the same definitions.
This umbrella term helps capture:
It is the responsibility of the assignee to be clear on the scope of their issue. A well-defined issue has a clearly outlined problem statement. Complex or new issues may also include an outline (not an all-encompassing list) of the steps that need to be taken. If an issue is not well-scoped when it is assigned, it is the responsibility of the assignee to understand how to scope that issue properly and to approach the appropriate team members for guidance early in the milestone.
Incidents are times when a problem is discovered and some near term action is required to fix the issue. When this happens, we make an Incident Issue in the Data Team Project. The process for working through incidents is as follows:
Data Team Incidents can be reviewed on the Incident Overview page within the main project.
|Stage (Label)||Responsible||Description||Completion Criteria|
||Data||New issue, being assessed||Item has enough information to enter problem validation.|
||Data, Business DRI||Clarifying issue scope and proposing solution||Solution defined with sign-off from business owners on a proposed solution that is valuable, usable, viable, and feasible|
||Data||Waiting for scheduling||Item has a numerical milestone label|
||Data||Waiting for development||Data team has started development|
||Data||Solution is actively being developed||Initial engineering work is complete and review process has started|
||Data||Waiting for or in Review||MR(s) are merged. Issues have all conversations wrapped up.|
||Data, Business DRI||Issue needs intervention that assignee can't perform||Work is no longer blocked|
Generally issues should move through this process linearly. Some templated issues will skip from
Issue pointing captures the complexity of an issue, not the time it takes to complete an issue. That is why pointing is independent of who the issue assignee is.
|Null||Meta and Discussions that don't result in an MR|
|0||Should not be used.|
|1||The simplest possible change including documentation changes. We are confident there will be no side effects.|
|2||A simple change (minimal code changes), where we understand all of the requirements.|
|3||A typical change, with understood requirements but some complicating factors|
|5||A more complex change. Requirements are probably understood or there might be dependencies outside the data-team.|
|8||A complex change, that will involve much of the codebase or will require lots of input from others to determine the requirements.|
|13||A significant change that has dependencies and we likely still don't understand all of the requirements. It's unlikely we would commit to this in a milestone, and the preference would be to further clarify requirements and/or break into smaller Issues.|
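The scale above can be encoded as a small check. This is a sketch: the `validate_points` helper and the constant names are hypothetical, with `None` standing in for the Null row and 0 rejected because the table says it should not be used.

```python
from typing import Optional

# Allowed weights from the table; None represents the "Null" row.
ALLOWED_POINTS = (1, 2, 3, 5, 8, 13)
NEEDS_BREAKDOWN = 13  # the table suggests clarifying or splitting these


def validate_points(points: Optional[int]) -> str:
    """Classify a proposed issue weight against the pointing scale."""
    if points is None:
        return "no weight: meta or discussion issue"
    if points not in ALLOWED_POINTS:
        raise ValueError(f"{points} is not on the pointing scale {ALLOWED_POINTS}")
    if points >= NEEDS_BREAKDOWN:
        return "clarify requirements or split into smaller issues"
    return "ok"


if __name__ == "__main__":
    for candidate in (None, 1, 5, 13):
        print(candidate, "->", validate_points(candidate))
```

Note that the check takes only the weight, not the assignee, which mirrors the point that complexity is scored independently of who does the work.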
Think of each of these groups of labels as ways of bucketing the work done. All issues should get the following classes of labels assigned to them:
Optional labels that are useful to communicate state or other priority
State (Red) (Won't Do, Blocked, Needs Consensus, etc.)
Inbound: For issues created by folks who are not on the data team; not for asks created by data team members on behalf of others
Ideally, your workflow should be as follows:
Update the MR with an appropriate template. Our current templates are:
Run any relevant jobs to the work being proposed
Assign the MR to a peer to have it reviewed. If assigning to someone who can merge, either leave a comment asking for a review without merge, or you can simply leave the
cc @user in a comment.
WIP: label, mark the branch for deletion, mark squash commits, and assign to the project's maintainer. Ensure that the attached issue is appropriately labeled and pointed.
The Merge Request Workflow provides clear expectations; however, there is some wiggle room and freedom around certain steps as follows.
We encourage everyone to record videos and post to GitLab Unfiltered. The handbook page on YouTube does an excellent job of telling why we should be doing this. If you're uploading a video for the data team, be sure to do the following extra steps:
data as a video tag