The Product Intelligence Group is part of the Analytics section. Our group focuses on providing GitLab's team with data-driven product insights to build a better GitLab. To do this, we build data collection and analytics tools within the GitLab product in a privacy-focused manner. Insights generated from Product Intelligence enable us to identify the best places to invest people and resources, which product categories mature faster, where our user experience can be improved, and how product changes impact the business. You can learn more about what we're building next on the Product Intelligence Direction page.
How we work:
If you have any questions, start by @-mentioning the product manager for the Product Intelligence Group or by creating an issue in our issue board.
Every week, the Product Intelligence team holds open office hours on Zoom for any questions that might arise. The call typically takes place on Thursdays for half an hour, alternating between the 8:00 UTC and 15:00 UTC timeslots. You can find the event in the GitLab Team Meetings calendar.
We're responsible for delivering a reliable Service Ping that runs every week on SaaS and Self-Managed instances. Our responsibility is the tooling and automation for metric collection that sets the company up for success in delivering Service Ping data to our data warehouse. Due to the number of metrics, we can't maintain the health of all of them or provide insights into their business logic.
If a broken metric is used as a performance indicator (marked with performance_indicator_type), we also inform the responsible team but will treat it as a Severity 1/Priority 1 issue and try to provide a fix.

The following people are permanent members of the Product Intelligence Group:
Person | Role |
---|---|
Sebastian Rehm | Manager, Fullstack Engineering, Analyze:Product Intelligence |
Jonas Larsen | Senior Backend Engineer, Analyze:Product Intelligence |
Michał Wielich | Backend Engineer, Analyze:Product Intelligence |
Mikołaj Wawrzyniak | Senior Backend Engineer, Analyze:Product Intelligence |
Niko Belokolodov | Senior Backend Engineer, Analyze:Product Intelligence |
Piotr Skorupa | Backend Engineer, Analyze:Product Intelligence |
Our team uses a hybrid of Scrum for our project management process. This process follows GitLab's monthly milestone release cycle.
We also track our backlog of issues in Sisense, including past due security and infradev issues, and total open System Usability Scale (SUS) impacting issues and bugs.
MR Type labels help us report what we're working on to industry analysts in a way that's consistent across the engineering department. A Sisense dashboard shows the trend of MR Types over time and a list of merged MRs.
For an overview of the capabilities of the analytics tooling the team develops, you can watch the video Product Intelligence 101 or look through the slides (internal).
Our team uses the following workflow stages defined in the Product Development Flow:
Label | Usage |
---|---|
~"workflow::validation backlog" | Applied by the Product Manager for incoming issues that have not been refined or prioritized. |
~"workflow::problem validation" | Applied by the Product Manager for issues where the PM is developing a thorough understanding of the problem. |
~"workflow::design" | Applied by the Product Manager or Designer (or Product Intelligence Engineer) to ideate and propose solutions. The proposed solutions should be reviewed by engineering to ensure technical feasibility. |
~"workflow::solution validation" | Applied by the Product Manager or Designer (or Product Intelligence Engineer) to validate a proposed solution through user interviews or usability testing. |
Label | Usage |
---|---|
~"workflow::planning breakdown" | Applied by the Product Manager for Engineers to begin breaking down issues and adding estimates. |
~"workflow::ready for development" | Applied by either Engineering or the Product Manager after an issue has been broken down and scheduled for development. |
~"workflow::in dev" | Applied by the Engineer after work (including documentation) has begun on the issue. An MR is typically linked to the issue at this point. |
~"workflow::in review" | Applied by the Engineer to indicate that all MRs required to close the issue are in review. |
~"workflow::verification" | Applied by the Engineer after the MRs in the issue have been merged, signaling that the issue needs to be verified in staging or production. |
~"workflow::complete" | Applied by the Engineer after all MRs have merged and the issue has been verified. At this step, the issue should also be closed. |
~"workflow::blocked" | Applied by any team member if the issue becomes blocked at any time during development. For example: a technical issue, an open question to the PM or PD, or a cross-group dependency. |
We use an epic roadmap to track epic progress on a quarterly basis. The epic roadmap is a live view of the Product Intelligence Direction page.
To keep things simple, we primarily use the gitlab.com/gitlab-org group for our roadmap. If epics are created in the gitlab.com/gitlab-com or gitlab.com/gitlab-services groups, we create placeholders for them in gitlab.com/gitlab-org so that all epics show up in a single roadmap view.
gitlab-org | gitlab-com | gitlab-services | all groups |
---|---|---|---|
gitlab-org Epic Roadmap | - | - |
We use issue boards to track issue progress on a daily basis. Issue boards are our single source of truth for the status of our work. Issue boards should be viewed at the highest group level for visibility into all nested projects in a group.
We prioritize our product roadmap in the Issue Board by Milestone. Issues appear on each list in order of priority, and the prioritization of our product roadmap is determined by our product managers.
Engineers can find and open the milestone board for Product Intelligence. Engineers should start at the top of the board and pick the first available, non-assigned issue labeled Ready for development. When picking an issue, the engineer should assign themselves as a signal that they are taking ownership of the issue.
If the next available issue is not a viable candidate (due to available capacity vs. issue weight, complexity, knowledge domain, etc.), the engineer may choose to skip it and pick the next issue in order of priority.
The following table is used as a guideline for scheduling work within the milestone:
Type | % of Milestone | Description |
---|---|---|
Deliverable | 70% | business priorities (compliance, IACV, efficiency initiatives) |
Tech debt | 10% | nominated by engineers prior to milestone start in Milestone Planning Issue |
Other | 20% | engineer picks, critical security/data/availability/regression, urgent business priorities |
If all work within a milestone is picked, engineers are free to choose what to work on. Acceptable options include:
We follow the iteration process outlined by the Engineering function.
We estimate issues async and aim to provide an initial estimate (weight) for all issues scheduled for an upcoming milestone.
We require a minimum of two estimations to weigh an issue. Reacting with a ➕ emoji to an estimation counts as agreeing with it (and thus contributes to the minimum count of estimations). If both estimations agree, the engineer who made the second estimation should add the agreed-upon weight to the issue. If there is disagreement, the second engineer should @-mention the first to resolve the conflict.
In planning and estimation, we value velocity over predictability. The main goal of our planning and estimation is to focus on the MVC, uncover blind spots, and help us achieve a baseline level of predictability without over-optimizing. We aim for 70% predictability instead of 90%.
We default spike issues to a weight of 8.
If an issue has many unknowns where it's unclear if it's a 1 or a 5, we will be cautious and estimate high (5).
If an initial estimate needs to be adjusted, we revise the estimate immediately and inform the Product Manager. The Product Manager and team will decide if a milestone commitment needs to be changed.
Issue estimation examples
The following is a guiding mental framework for engineers to consider when contributing to estimates on issues.
### Refinement / Weighing
**Ready for Development**: Yes/No
<!--
Yes/No
Is this issue sufficiently small, or could it be broken into smaller issues? If so, recommend how the issue could be broken up.
Is the issue clear and easy to understand?
-->
**Weight**: X
**Reasoning**:
<!--
Add some initial thoughts on how you might break down this issue. A bulleted list is fine.
This will likely require code changes similar to the following:
- replace the hexdriver with a sonic screwdriver
- rewrite backups to magnetic tape
- send up semaphore flags to warn others
Link to previous examples. Discuss prior art. Note examples of simplicity/complexity in the proposed designs.
-->
**Iteration MR/Issues Count**: Y
<!--
Are there any opportunities to split the issue into smaller issues?
- 1 MR to update the driver worker
- 1 MR to update docs regarding mag tape backups
Call out any potential caveats.
-->
**Documentation required**: Y/N
<!--
- Do we need to add or change documentation for the issue?
-->
To properly set expectations for product managers and other stakeholders, our team may decide to add a due date to an issue. Due dates are not meant to pressure our team but are instead used to communicate an expected delivery date.
We may also use due dates as a way to timebox our iterations. Instead of spending a month on shipping a feature, we may set a due date of a week to force ourselves to come up with a smaller iteration.
An OOO coverage process helps reduce the mental load of "remembering all the things" while preparing for being away from work. This process allows us to organize the tasks we need to complete before time off and set the team up for success.
Open a new issue in the Product Intelligence project with the out_of_office_coverage_template.
Our team mostly follows the Product Development Timeline as our group is dependent on the GitLab self-managed release cycle.
The specific application of this timeline to the Product Intelligence Milestone planning process is summarized below.
Phase | Time |
---|---|
Planning & Breakdown Phase | 4th - 17th of month N |
Development Phase | 18th of month N - 17th of month N+1 |
Timeline: 4th - 17th of month N
Tasks:
Timeline: 18th of month N - 17th of month N+1
Tasks:
Our milestone capacity tells us how many issue weights we can expect to complete in a given milestone. To estimate this, we calculate the average weight completed per engineer per day across the previous two milestones, then multiply it by the actual engineer working days available to us in the given milestone.
Previous Two Milestones:
Next Milestone:
In this example, the next milestone's capacity is 64 weights for the whole team. Keep in mind that neither estimation nor this calculation is an exact science. Capacity planning is supposed to help the EM and PM set realistic expectations around deliverables inside and outside the group. We do not expect to hit the exact number of predicted weights.
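As a sketch of the arithmetic only, with made-up figures chosen to reproduce the 64-weight example above (the actual milestone numbers are not shown here):

```python
# Hypothetical inputs: weight completed and engineer-days worked in the
# previous two milestones, and engineer-days available next milestone.
completed_weights = [60, 68]     # weight closed per past milestone
engineer_days_worked = [80, 80]  # engineer-days spent per past milestone

# Average weight completed per engineer per day.
daily_rate = sum(completed_weights) / sum(engineer_days_worked)  # 0.8

available_engineer_days = 80     # next milestone, after PTO, holidays, etc.
capacity = daily_rate * available_engineer_days

print(round(capacity))  # 64 weights for the whole team
```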
A milestone commitment is a list of issues our team aims to complete in the milestone. The product team follows our GitLab principle of planning ambitiously and therefore expects that we won't always be able to deliver everything we wanted in every milestone. After issues are broken down, estimated, and prioritized, the product manager will apply the ~Deliverable
label to applicable issues. Issues marked with the ~Deliverable
label represent the commitment we are intending to ship in that milestone.
Per the Next Prioritization initiative, we will review our team's performance in applying appropriate type labels to MRs. At the close of the milestone, on the Planning Issue, the EM or PM will post a link to this dashboard along with a summary of shipped work by type label (including null) to ensure we are observing the recommended work split of 60% feature, 30% maintenance, 10% bugs, and <=5% undefined.
In Product Intelligence, whether work falls under ~type::maintenance or ~type::feature is not always readily apparent. As a guide, we denote work which benefits the Product Intelligence team and technical processes as ~type::maintenance, whereas work which benefits GitLab customers or team members is considered ~type::feature.
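As an illustration only (this is not part of the official dashboard tooling), a few lines of Python can sanity-check the split from a list of merged MR type labels; the label names are real, the sample data is made up:

```python
from collections import Counter

# Hypothetical type labels on one milestone's merged MRs;
# None represents an MR that is missing a type label.
mr_type_labels = [
    "type::feature", "type::feature", "type::feature",
    "type::maintenance", "type::bug", None,
]

counts = Counter(label or "undefined" for label in mr_type_labels)
total = len(mr_type_labels)

# Compare against the recommended split: 60% feature, 30% maintenance,
# 10% bugs, and <=5% undefined.
for label, count in counts.most_common():
    print(f"{label}: {count / total:.0%}")
```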
To help our team be efficient, we explicitly define how our team uses epics and issues.
We aim to create issues in the same project as where the future merge request will live. And we aim to create epics at the topmost-level group that makes the most sense for its collection of child epics and issues. For example, if an experiment is being run in the CustomersDot, the epic should be created in the gitlab-org
group, and the issue should be created in the gitlab-org/customers-gitlab-com
project.
We emphasize creating the epic at the topmost-level group so that it will show up on our epic roadmap. And we emphasize creating the issue in the right project to avoid having to close and move it later in the development process.
We used to aim for a 1:1 ratio between issues and merge requests, mainly for the sake of status visibility at the issue board level. We have since moved to using epics and the epic roadmap for product management visibility, and we are comfortable with the amount of status updates received during our weekly sync meetings as well as through comments within issues themselves.
If an issue requires multiple merge requests, we no longer recommend splitting the issue itself up in order to maintain a 1:1 ratio of issues to MRs. The advantage is that an engineer is able to create an arbitrary number of MRs for a single issue and can move much more quickly through them. The trade-off is that doing so makes it more difficult to communicate the overall status of the issue itself. It is the engineer's responsibility to make sure that the status of each issue they are working on is effectively communicated to their Product Manager.
We group related issues together using parent epics and child epics, providing us with a better visual overview of our roadmap.
We use issue labels to keep us organized. Every issue has a set of required labels that the issue must be tagged with. Every issue also has a set of optional labels that are used as needed.
Required labels
- ~devops::analytics
- ~group::product intelligence
- A workflow label: ~"workflow::planning breakdown", ~"workflow::ready for development", ~"workflow::in dev", etc.
- A type label: ~"type::bug", ~"type::feature", ~"type::tooling", or ~"type::maintenance"
MR labels should mirror issue labels (which is automatically done when created from an issue):
Required labels
- ~section::analytics
- ~group::product intelligence
- A type label: ~"type::bug", ~"type::feature", ~"type::tooling", or ~"type::maintenance"
We tag each issue and MR with the planned milestone or the milestone at time of completion.
Our group holds synchronous meetings to gain additional clarity and alignment on our async discussions. We aim to record all of our meetings as our team members are spread across several timezones and often cannot attend at the scheduled time.
We like to share knowledge and learn! If your group would like someone from the Product Intelligence group to attend a sync call and provide a brief overview of our responsibilities and scope, please open an issue and apply the ~group::product intelligence
label (example issue).
In the same spirit, we want to learn more about the different teams at GitLab. We have an ambitious goal of hosting a guest speaker once monthly in our team meeting for a 15-30 minute session (whatever works for you). If you'd like to participate in sharing information with our team, please comment in our Slack channel #g_product_intelligence.
Here are some topics to help create your session material:
If you would like to propose a new knowledge session for a topic you want to learn more about, open an issue in Product Intelligence and provide the details. Issue 603 gives you a good example of how this is done.
Date | Topic | Speaker |
---|---|---|
2022-08-16 | Usage of Service Ping data | Jay Stemmer |
We maintain UsageData API endpoints under the service_ping
feature to track events, and because of this we must monitor our budget spend.
To investigate budget spend, see the overview and details Grafana dashboards for Product Intelligence. You can also check requests contributing to spending the budget in Kibana by filtering by the service_ping
feature. An example Kibana view can be found here.
Note that the budget spend is calculated proportionally by requests failing apdex or failing with an error, and not by how much the target is exceeded. For example, if we had an endpoint with a set goal of 1s request duration, then bringing the request duration from 10s to 5s would not improve the budget.
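To make the proportional calculation concrete, here is a minimal sketch (not GitLab's actual SLO tooling; the target and data are hypothetical):

```python
APDEX_TARGET_SECONDS = 1.0  # hypothetical request-duration goal

def budget_spend(durations, errors):
    """Fraction of requests that count against the budget.

    A request counts if it fails the apdex goal or returns an error;
    how far it exceeds the goal is irrelevant.
    """
    violations = sum(
        1
        for duration, errored in zip(durations, errors)
        if errored or duration > APDEX_TARGET_SECONDS
    )
    return violations / len(durations)

# A 5s request and a 10s request each count as one violation, so
# speeding the 10s request up to 5s would not improve the budget.
print(budget_spend([0.4, 5.0, 10.0], [False, False, False]))  # ~0.67
```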
Dbt, short for data build tool, is an open source project for managing data transformations in a data warehouse. The data warehouse and dbt are managed by the Data team. To reduce cycle time, increase understanding, and enable the Product Intelligence team to fully own collected metrics, the team should be empowered to develop and modify the data models that represent Product Intelligence metrics.
To get set up, @-mention a person in the role of Senior Manager, Data at GitLab (currently @dvanrooijen2). For example: database setup issue. Please follow the configuration guide provided by the Data team.
If you've been granted access to the Snowflake data warehouse via Okta SSO, you will not be able to use Docker to set up dbt. Instead, follow the venv workflow. You will also need to alter profiles.yml, replacing password: YOUR PASSWORD with authenticator: externalbrowser.
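For reference, here is a minimal sketch of what the relevant profiles.yml entry might look like, assuming a Snowflake target; the profile name, account, and user are placeholders, and only the authenticator line reflects the change described above:

```yaml
# Hypothetical excerpt of ~/.dbt/profiles.yml; all names are placeholders.
gitlab_snowflake:
  target: dev
  outputs:
    dev:
      type: snowflake
      account: your_account_id
      user: your_gitlab_email
      # password: YOUR PASSWORD       <- remove this line
      authenticator: externalbrowser  # authenticate through Okta SSO in the browser
```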
When contributing to the analytics repository, please follow the style and development guides.
This internal video contains a condensed introduction to dbt.
A high level overview of the contribution process is as follows:
1. Find the directory the model belongs in, for example: models/legacy/snowplow/month_partition/base.
2. Create or update the model file, for example: snowplow_gitlab_events_standard_context.sql.
3. Describe the model in the schema.yml file within the model's parent folder (the one from step #1).
4. Run the model locally with $ dbt run -m model_name. It runs models and outputs their SQL representation within the target directory at the location corresponding to the model location, for example: target/compiled/gitlab_snowflake/models/legacy/snowplow/month_partition/base/snowplow_gitlab_events_standard_context.sql.
5. In CI, run the ➕🐭specify_model job within the ⚙️ dbt Run stage. You have to specify the models to test by providing a proper selector with the DBT_MODELS job environment variable.
6. Ask the Data team (for example @mpeychet_) for a review.

All new members of the Product Intelligence team are provided an onboarding issue to help them ramp up on our analytics tooling. New team members should create their own onboarding issue in the gitlab-org/product-intelligence project using the engineer_onboarding template.
Resource | Description |
---|---|
Product Intelligence Guide | A guide to Product Intelligence |
Getting started with Product Intelligence | |
Service Ping Guide | An implementation guide for Service Ping |
Snowplow Guide | An implementation guide for Snowplow |
Metric Dictionary | A SSoT for all collected metrics and events |
Privacy Policy | Our privacy policy outlining what data we collect and how we handle it |
Product Intelligence Direction | The roadmap for Product Intelligence at GitLab |
Product Intelligence Development Process | The development process for the Product Intelligence group |
GitLab Performance Snowplow Dashboards | Performance dashboards for GitLab.com via Snowplow |
FAQ | A list of questions and answers related to Service Ping and Snowplow |