Analysis usually begins with a question. A stakeholder will ask a question of the data team by creating an issue in the Data Team project using the appropriate template. The analyst assigned to the project may schedule a discussion with the stakeholder(s) to further understand the needs of the analysis, though the preference is always for async communication. This meeting allows analysts to understand the overall goals of the analysis, not just the single question being asked, and should be recorded. All findings should be documented in the issue. Analysts looking for a place to start the discussion can begin by asking:
An analyst will then update the issue to reflect their understanding of the project at hand. This may mean turning an existing issue into a meta issue or an epic. Stakeholders are encouraged to engage on the appropriate issues. The issue then becomes the SSOT for the status of the project, indicating the milestone to which it has been assigned and the analyst working on it, among other things. The issue should always contain information on the project's status, including any blockers that can help explain its prioritization. Barring any confidentiality concerns, the issue is also where the final project will be delivered, after peer/technical review. When satisfied, the analyst will close the issue. If the stakeholder would like to request a change after the issue has been closed, they should create a new issue and link to the closed issue.
The Data Team can be found in the #data channel on Slack.
Requests to expedite responses, triage issues, or review MRs are rare. Given the Data Team's shared-service model, expediting an item is asking to de-prioritize other work. To request an expedited response:
The data team's priorities come from our OKRs. We do our best to service as many of the requests from the organization as possible. You know that work has started on a request when it has been assigned to a milestone. Please communicate in the issue about any pressing priorities or timelines that may affect the data team's prioritization decisions. Please do not DM a member of the data team asking for an update on your request. Please keep the communication in the issue.
The data team, like the rest of GitLab, works hard to document as much as possible. We believe this framework for types of documentation from Divio is quite valuable. For the most part, what's captured in the handbook are tutorials, how-to guides, and explanations, while reference documentation lives within the primary analytics project. We have aspirations to tag our documentation with the appropriate function as well as clearly articulate the assumed audiences for each piece of documentation.
As a central shared service with finite time and capacity and with a responsibility to operate and develop the company's central Enterprise Data Warehouse, the Data Team must focus its time and energy on initiatives that will yield the greatest positive impact to the overall global organization towards improving customer results.
The Data Team uses a Value Calculator to quantify the business value of new initiatives (issue, epic, OKR, strategic project) to enable prioritization and ranking of the Data Team development queue. The Value Calculator provides a uniform and transparent mechanism for ranking and enables all work to be evaluated on equal terms. The value calculator approach is similar to the RICE Scoring Model for Product Managers and the Demand Metric Prioritization Model for Marketing.
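For readers unfamiliar with the RICE Scoring Model mentioned above, here is a minimal sketch of how such a score can rank a queue. It shows the standard RICE formula (reach × impact × confidence ÷ effort), not the Data Team's Value Calculator, whose inputs are not described here. The initiative names and numbers are hypothetical.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Initiative:
    name: str          # hypothetical initiative name
    reach: float       # people or teams affected per period
    impact: float      # relative impact per person or team
    confidence: float  # 0.0 to 1.0
    effort: float      # e.g. person-months; must be positive


def rice_score(item: Initiative) -> float:
    """Standard RICE score: (reach * impact * confidence) / effort."""
    if item.effort <= 0:
        raise ValueError("effort must be positive")
    return item.reach * item.impact * item.confidence / item.effort


def rank(items: list[Initiative]) -> list[Initiative]:
    """Return initiatives ordered from highest to lowest score."""
    return sorted(items, key=rice_score, reverse=True)


# Hypothetical queue, for illustration only.
queue = [
    Initiative("new data source", reach=40, impact=2, confidence=0.8, effort=4),
    Initiative("dashboard fix", reach=10, impact=1, confidence=1.0, effort=0.5),
]
for item in rank(queue):
    print(f"{item.name}: {rice_score(item):.1f}")
```

Scoring every item with the same function is what makes the ranking uniform: a small, cheap fix can outrank a larger initiative when its effort is low enough.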
Every day in Data brings a new challenge or opportunity. However, the Data Team strives to spend the majority of its time developing and operating the Enterprise Data Warehouse and related systems, keeping fresh data flowing through the system, regularly expanding the breadth of data available for analysis, and delivering high-impact strategic projects. Our standing priorities are listed below.
|Rank||Priority||Description||Allocation We Aspire To|
|1||Production Operations||Activities required to maintain efficient and reliable data services, including triage, bug fixes, and patching to meet established Service Level Objectives.||10-20%, though will fluctuate as driven by incident frequency and complexity|
|2||Data Team OKRs||The Data Team identifies 3-4 strategic-level OKRs per quarter, primarily focused on core infrastructure and data development that will be beneficial to the entire company, aligned with CEO and Finance OKRs.||60-75%, though this will fluctuate as driven by larger Functional Team OKRs|
|3||Other work||Other work, including Functional Team OKRs, as prioritized and ranked using the Value Calculator.||15-25%|
In rare situations, established SLOs do not meet turnaround needs, and in these cases the Data Team provides an expedited response capability. The Data Team will provide a date estimate if an expedited request cannot be handled per the expedited response SLO.
The calculator below is based on the following Value Calculator spreadsheet. Please select the values below to define the value of new work.
The Data Team OKRs aspire to align with Business Operations OKRs, Finance Division OKRs, and CEO OKRs, thereby aligning with the OKRs of the Divisions we support. Due to the nature of the technical and data infrastructure work required to develop and operate an Enterprise Data Warehouse, this will not always be the case.
At the beginning of a FQ, the Data Team will outline all actions that are required to succeed with our KRs and in helping other teams measure the success of their KRs. The best way to do that is via a team brain dump session in which everyone lays out all the steps they anticipate for each of the relevant actions. This is a great time for the team to raise any blockers or concerns they foresee. These should be recorded for future reference.
These OKRs drive ~60% of the work that the central data team does in a given quarter. The remaining time is divided between urgent issues that come up and ad hoc/exploratory analyses. Specialty data analysts (who have the title "Data Analyst, Specialty") should have a similar breakdown of planned work to responsive work, but their priorities are set by their specialty manager.
Examples of OKR alignment in action include:
The data team currently works in two-week intervals, called milestones. Milestones start on Tuesdays and end on Mondays. This discourages last-minute merging on Fridays and allows the team to have milestone planning meetings at the top of the milestone.
Milestones may be three weeks long if they cover a major holiday or if the majority of the team is on vacation or at Contribute. As work is assigned to a person and a milestone, it gets a weight assigned to it.
Milestone planning should take into consideration:
The milestone planning is owned by the Manager, Data.
The timeline for milestone planning is as follows:
|Day||Current Milestone||Next Milestone|
|0 - 1st Wednesday||Milestone Start|||
|7 - 2nd Tuesday||Midpoint. Any issues that are at risk of slipping from the milestone must be raised by the assignee.|||
|10 - 2nd Friday||The last day to submit MRs for review. MRs must include documentation and testing to be ready to merge. No MRs are to be merged on Fridays.||Milestone is roughly final. Milestone Planner distributes issues to team members, with the appropriate considerations and preferences.|
|13 - 2nd Monday||Last day of Milestone. Ready MRs can be merged.|||
|14 - 2nd Tuesday||Meeting Day. All unfinished issues either need to be removed from milestones or rolled to the next.||Scheduled DE meeting with a tactical discussion of the work to be completed next Milestone. Stakeholders and submitters are updated with what will or won't be added to the next milestone.|
The short-term goal of this process is to improve our ability to plan and estimate work through better understanding of our velocity. In order to successfully evaluate how we're performing against the plan, any issues not raised at the T+7 mark should not be moved until the next milestone begins.
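The day offsets in the timeline above can be turned into calendar dates with a few lines of Python. This is a minimal sketch that assumes day 0 is the Tuesday on which a two-week milestone starts. The function name, checkpoint labels, and example date are illustrative only.

```python
from __future__ import annotations

from datetime import date, timedelta

# Day offsets taken from the timeline table. Day 0 is assumed to be the
# Tuesday on which the milestone starts.
CHECKPOINTS = {
    "midpoint": 7,
    "last day to submit MRs for review": 10,
    "last day of milestone": 13,
    "meeting day": 14,
}


def milestone_checkpoints(start: date) -> dict[str, date]:
    """Map each checkpoint name to its calendar date."""
    if start.weekday() != 1:  # Monday is 0, so Tuesday is 1
        raise ValueError("this sketch assumes milestones start on a Tuesday")
    return {
        name: start + timedelta(days=offset)
        for name, offset in CHECKPOINTS.items()
    }


# Hypothetical start date: 2020-01-07 was a Tuesday.
schedule = milestone_checkpoints(date(2020, 1, 7))
for name, day in schedule.items():
    print(f"{day:%a %Y-%m-%d}  {name}")
```

With a Tuesday start, the MR submission deadline lands on a Friday and the last day of the milestone on a Monday, which matches the rule that nothing is merged on Fridays.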
The work of the data team generally falls into the following categories:
During the milestone planning process, we point issues. Then we pull into the milestone the issues expected to be completed in the timeframe. Points are a good measure of consistency, as the total points completed should average out milestone over milestone. Then issues are prioritized according to these categories.
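One way to picture this is to treat the average points completed in past milestones as the capacity for the next one. The sketch below is an assumption about how that could be computed, not the team's actual tooling. The history, issue titles, and the choice to skip issues that do not fit are all hypothetical.

```python
from __future__ import annotations

from statistics import mean


def velocity(completed_points: list[int]) -> float:
    """Average points completed per milestone over the given history."""
    if not completed_points:
        raise ValueError("need at least one completed milestone")
    return mean(completed_points)


def pull_into_milestone(
    candidates: list[tuple[str, int]], capacity: float
) -> list[str]:
    """Pull candidate issues, in the order given, while they fit the capacity.

    Issues that would exceed the remaining capacity are skipped
    (an assumption made for this sketch).
    """
    pulled: list[str] = []
    used = 0
    for title, points in candidates:
        if used + points > capacity:
            continue
        pulled.append(title)
        used += points
    return pulled


# Hypothetical history and candidate issues.
history = [21, 18, 24]
candidates = [("issue A", 8), ("issue B", 8), ("issue C", 8), ("issue D", 5)]
plan = pull_into_milestone(candidates, velocity(history))
print(plan)
```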
Issues are not assigned to individual members of the team, except where necessary, until someone is ready to work on them. Work is not assigned and then managed into a milestone. Every person works on the top priority issue for their job type. As that issue is completed, they can pick up the next highest priority issue. People will likely be working on no more than 2 issues at a time.
Given the power of the Ivy Lee method, this allows the team to collectively work on priorities as opposed to creating a backlog for any given person. As a tradeoff, this also means that every time a central analyst is introduced to a new data source their velocity may temporarily decrease as they come up to speed; the overall benefit to the organization that any analyst can pick up any issue will compensate for this, though. Learn how the product managers groom issues.
Data Engineers will work on Infrastructure issues.
Data Analysts, Central and sometimes Data Engineers work on general Analytics issues.
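The pull-based assignment described above can be sketched as a small function: a person takes the highest-priority unassigned issue in a category their role works on, and holds at most two issues at once. The `Issue` class, the `priority` field (lower number means higher priority), and the order in which a Data Engineer considers categories are assumptions made for illustration.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Optional

# Issue categories each role works on, per the text. Data Engineers only
# "sometimes" pick up Analytics issues, so listing it second is an assumption.
ROLE_CATEGORIES = {
    "Data Engineer": ["Infrastructure", "Analytics"],
    "Data Analyst, Central": ["Analytics"],
}
MAX_IN_PROGRESS = 2  # "no more than 2 issues at a time"


@dataclass
class Issue:
    title: str
    category: str
    priority: int  # lower number = higher priority (assumed convention)
    assignee: Optional[str] = None


def next_issue(issues: list[Issue], person: str, role: str) -> Optional[Issue]:
    """Assign and return the next issue for a person, or None if none applies."""
    in_progress = sum(1 for issue in issues if issue.assignee == person)
    if in_progress >= MAX_IN_PROGRESS:
        return None
    for category in ROLE_CATEGORIES[role]:
        open_issues = [
            issue for issue in issues
            if issue.assignee is None and issue.category == category
        ]
        if open_issues:
            picked = min(open_issues, key=lambda issue: issue.priority)
            picked.assignee = person
            return picked
    return None


# Hypothetical board.
board = [
    Issue("Upgrade loader", "Infrastructure", priority=2),
    Issue("Fix pipeline alert", "Infrastructure", priority=1),
    Issue("Churn dashboard", "Analytics", priority=1),
]
first = next_issue(board, "engineer_a", "Data Engineer")
print(first.title)
```

Because issues stay unassigned until someone pulls them, the board itself acts as the shared priority list rather than each person carrying a private backlog.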
There is a demo of what this proposal would look like in a board.
This approach has many benefits, including:
There are three general types of issues:
Not all issues will fall into one of these buckets but 85% should.
Some issues may need a discovery period to understand requirements, gather feedback, or explore the work that needs to be done. Discovery issues are usually 2 points.
Introducing a new data source requires a heavy lift of understanding that new data source, mapping field names to logic, documenting those, and understanding what issues are being delivered. Usually introducing a new data source is coupled with replicating an existing dashboard from the other data source. This helps verify that numbers are accurate and the original data source and the data team's analysis are using the same definitions.
This umbrella term helps capture:
It is the responsibility of the assignee to be clear on what the scope of their issue is. A well-defined issue has a clearly outlined problem statement. Complex or new issues may also include an outline (not an all-encompassing list) of what steps need to be taken. If an issue is not well-scoped when it is assigned, it is the responsibility of the assignee to understand how to scope that issue properly and approach the appropriate team members for guidance early in the milestone.
|Stage (Label)||Track||Responsible||Completion Criteria||Who Transitions Out|
||Validation||Data||Item has enough information to enter problem validation.||Data|
||Validation||Data, Business DRI||Item is validated and defined enough to propose a solution||Data|
||Validation||Data||Design work is complete enough for issue to be implemented||Data|
||Validation||Data, Business DRI||Sign off from business owners on proposed solution that is valuable, usable, viable and feasible||Business DRI|
||Planning||Data||Item has a numerical milestone label||Data|
||Planning||Data||Issue has a numerical milestone label||Data|
||Build||Data||A data team member has started to work on the issue||Data|
||Build||Data||Initial engineering work is complete and review process has started||Data|
||Build||Data||MR(s) are merged. Issues have had all conversations wrapped up.||Data|
||Build||Data, Business DRI||Work is demonstrable on production||N/A|
||Planning||Data, Business DRI||Work is no longer blocked||Data|
Issue pointing captures the complexity of an issue, not the time it takes to complete an issue. That is why pointing is independent of who the issue assignee is.
|Weight||Description|
|Null||Meta and Discussions that don't result in an MR|
|0||Should not be used.|
|1||The simplest possible change including documentation changes. We are confident there will be no side effects.|
|2||A simple change (minimal code changes), where we understand all of the requirements.|
|3||A typical change, with understood requirements but some complicating factors|
|5||A more complex change. Requirements are probably understood or there might be dependencies outside the data-team.|
|8||A complex change that will involve much of the codebase or will require lots of input from others to determine the requirements.|
|13||A significant change that has dependencies and we likely still don't understand all of the requirements. It's unlikely we would commit to this in a milestone, and the preference would be to further clarify requirements and/or break into smaller Issues.|
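The pointing scale above can be encoded as a simple check, for example in a script that reviews issues before milestone planning. This is a minimal sketch. The function name and the returned messages are hypothetical, and `None` stands in for the "Null" weight.

```python
from typing import Optional

# Weights from the pointing table. 0 is excluded because it "should not be used".
ALLOWED_WEIGHTS = {1, 2, 3, 5, 8, 13}
SPLIT_THRESHOLD = 13  # unlikely to be committed to in a single milestone


def check_weight(weight: Optional[int]) -> str:
    """Classify an issue weight according to the pointing table."""
    if weight is None:
        return "meta or discussion: no weight expected"
    if weight not in ALLOWED_WEIGHTS:
        raise ValueError(f"{weight} is not on the pointing scale")
    if weight >= SPLIT_THRESHOLD:
        return "clarify requirements or break into smaller issues"
    return "ok"


for weight in (None, 3, 13):
    print(weight, "->", check_weight(weight))
```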
Think of each of these groups of labels as ways of bucketing the work done. All issues should get the following classes of labels assigned to them:
Optional labels that are useful to communicate state or other priority
State (Red) (Won't Do, Blocked, Needs Consensus, etc.)
Inbound: For issues created by folks who are not on the data team; not for asks created by data team members on behalf of others
Ideally, your workflow should be as follows:
cc @user in a comment.
WIP: label, mark the branch for deletion, mark squash commits, and assign to the project's maintainer. Ensure that the attached issue is appropriately labeled and pointed.
We encourage everyone to record videos and post to GitLab Unfiltered. The handbook page on YouTube does an excellent job of telling why we should be doing this. If you're uploading a video for the data team, be sure to do the following extra steps:
data as a video tag