The Gitaly team is responsible for building and maintaining systems to ensure that the Git data storage tier of GitLab instances, and GitLab.com in particular, is reliable, secure and fast. For more information about Gitaly, see our Direction page and roadmap.
While GitLab is the primary consumer of the Gitaly project, Gitaly is a standalone product that can be used outside of GitLab. As such, we strive to maintain a functional boundary around Gitaly: the project provides an interface to manage Git data, but does not make business decisions about how that data is managed.
For example, Gitaly can provide a robust and efficient set of APIs to move Git repositories between storage solutions, but it would be up to the calling application to decide when such moves should occur.
Processes fully independent of business inputs (such as repository maintenance) should be fully contained within Gitaly as they provide substantial value to anyone using the Gitaly project.
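The boundary described above can be illustrated with a minimal Python sketch. Everything in it is hypothetical: `StorageService`, `move_repository`, `run_maintenance`, and `rebalance` are illustrative names and are not part of Gitaly's actual API. The storage-side class only offers the mechanism for moving a repository and handles maintenance on its own, while the calling application owns the policy that decides when a move happens.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class StorageService:
    """Hypothetical stand-in for the storage tier: mechanisms only, no business policy."""

    # Maps storage name -> set of repository paths held on that storage.
    storages: dict[str, set[str]] = field(default_factory=dict)

    def move_repository(self, repo: str, source: str, target: str) -> None:
        """Move a repository between storages; the caller decides when to call this."""
        if repo not in self.storages.get(source, set()):
            raise KeyError(f"{repo} is not on storage {source}")
        self.storages[source].remove(repo)
        self.storages.setdefault(target, set()).add(repo)

    def run_maintenance(self) -> list[str]:
        """Housekeeping needs no business input, so it lives inside the service.

        This sketch only returns the repositories it would process.
        """
        return sorted(repo for repos in self.storages.values() for repo in repos)


def rebalance(service: StorageService, max_repos_per_storage: int, spare: str) -> list[str]:
    """Caller-side policy (hypothetical): move repositories off storages over a limit."""
    moved = []
    # Iterate over a snapshot, because moving may create the spare storage entry.
    for name in list(service.storages):
        if name == spare:
            continue
        # Sort for a deterministic choice of which repositories to move.
        excess = sorted(service.storages[name])[max_repos_per_storage:]
        for repo in excess:
            service.move_repository(repo, source=name, target=spare)
            moved.append(repo)
    return moved


service = StorageService({"default": {"group/a.git", "group/b.git", "group/c.git"}})
moved = rebalance(service, max_repos_per_storage=2, spare="overflow")
```

In this sketch the "when to move" rule (a per-storage repository limit) lives entirely in `rebalance`, so a different consumer could apply a different rule without any change to the storage-side class.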
The following members of other functional teams are our stable counterparts:
| Name | Role |
|---|---|
| Mark Wood | Senior Product Manager, Systems:Gitaly |
| Costel Maxim | Senior Security Engineer, Application Security, Plan (Project Management, Product Planning, Certify), Create:Source Code, Growth, Fulfillment:Purchase, Fulfillment:Provision, Fulfillment:Utilization, Systems:Gitaly |
| Furhan Shabir | Senior Site Reliability Engineer, Systems:Gitaly |
| Gerardo Gutierrez | Senior Support Engineer, Systems:Gitaly |
| Igor Drozdov | Staff Backend Engineer, Create:Source Code, Systems:Gitaly API |
| John McDonnell | Senior Software Engineer in Test, Systems:Gitaly |
| Steve Azzopardi | Senior Site Reliability Engineer, Systems:Gitaly |
| Vasilii Iakliushin | Senior Backend Engineer, Create:Source Code, Systems:Gitaly API |
Gitaly team members do not carry pagers, but we live around the world and there's a good chance that someone is available during their working hours. There is no coverage for weekends; instead, we strive to empower incident responders to mitigate any circumstance.
These issues relate to ongoing production outages or similar. They interrupt the process we use to schedule work and get attention as soon as possible. Please only interrupt us sparingly, in these cases:
Getting attention on an urgent, interrupting issue
Mention @gl-gitaly (the whole team) on the issue.
To get the Gitaly team to work on something, it's best to create an issue on the Gitaly issue tracker and add the workflow::problem validation label, along with any other appropriate labels. Then, feel free to tag the relevant Product Manager and/or Engineering Manager as listed above.
For information requests and other quick one-offs, feel free to use #g_gitaly on Slack to get attention on the issue.
Issues with the Infradev label are typically Corrective Actions or other follow-up items that have strict SLO tracking. They will be scheduled through either of the above paths, by EM and/or PM polling these dashboards:
Mission: Provide a durable, performant, and reliable Git storage layer for GitLab.
| Name | Role |
|---|---|
| Andras Horvath | Manager, Engineering, Gitaly |
| James Fargher | Senior Backend Engineer, Gitaly |
| Justin Tobler | Backend Engineer, Gitaly |
| Karthik Nayak | Senior Backend Engineer, Gitaly |
| Patrick Steinhardt | Staff Backend Engineer, Gitaly |
| Pavlo Strokov | Senior Backend Engineer, Gitaly |
| Quang-Minh Nguyen | Senior Backend Engineer, Gitaly |
| Sami Hiltunen | Senior Backend Engineer, Gitaly |
| Will Chandler | Senior Backend Engineer, Gitaly |
Mission: Develop Git in accordance with the goals of the community and GitLab, and integrate it into our products.
| Name | Role |
|---|---|
| Ævar Arnfjörð Bjarmason | Git contractor at GitLab |
| Christian Couder | Senior Backend Engineer, Gitaly |
| John Cai | Interim Engineering Manager, Git |
| Patrick Steinhardt | Staff Backend Engineer, Gitaly |
| Toon Claes | Senior Backend Engineer, Gitaly |
We generally follow the Product Development Flow to schedule and track our work.
Work is executed in small chunks (2-3 days of work), each tracked as an issue. This allows for natural "checkpoints" for safe context switching. Triaging and scheduling are separate from executing the current work. All incoming work is tracked, and we are intentional about picking up new work.
Incoming work of all kinds (both projects and ad-hoc interrupts) passes through EM and PM for triage. There may be some engineering consultation here about feasibility, fit with the product's strategy and roadmap, etc. Some work gets scheduled and some goes to the backlog. If the effort is not deemed necessary or does not align with the roadmap, we close the issue with a comment explaining why it is not being pursued, so that the reasoning is available for future reference.
We aim to scope milestones such that we have a task list that is ambitious, but not overwhelming. We deliberately leave some capacity for incoming incidents. We want to avoid the feeling of a never-ending mountain of work in order to promote a healthy work/life balance. It is also important to stress that milestones are recommendations only and we work on a best-effort basis.
For issues with a strict SLO, we follow the process defined below.
We use the following workflow labels on the issues:
workflow::problem validation - A good spot to put features that we may or may not want to pursue. This is where Product can do user interviews, cost analysis, market fit, etc. to decide whether it's an opportunity we wish to pursue.
workflow::solution validation - Use this label for features or issues where Engineering needs to investigate or propose a solution going forward, or break the work down into smaller issues.
workflow::planning breakdown - Issues ready to be scheduled in the next few milestones (unblocked or soon unblocked, with a known solution). Leaders of long-running (pre-approved) projects use this to communicate with PM.
workflow::ready for development - Work that is scheduled for a milestone (either the current one, or one in the future).
workflow::in dev - Actively being worked on by the Engineering team.
workflow::in review - Work that is in review.
workflow::verification - Code is in production and pending verification by the DRI engineer.
workflow::complete - Changes are verified; the issue can be closed.
Issues scheduled for the milestone (with workflow::ready for development and a specific release milestone assigned) can be further prioritized by moving them up or down on the dashboard. Issues that we definitely want to prioritize for a release receive a Deliverable label and are moved to the top of the list. The Deliverable label signals to GitLab and our customers that we are committed to working on these issues.
Engineers ready to pick up more work do not necessarily need to assign themselves the topmost item; they should make an informed choice based on affinity (area of expertise, relative urgency, interest, etc.).
Everyone can file new issues as more work is discovered, and feed them into this process.
Note that P1/S1 work should be the only work that preempts this default flow. Do involve PM and EM if urgent work needs to be prioritized.
A weekly call is held between the product manager and engineering managers (of both the Cluster and Git teams). Everyone is welcome to join. These calls are used to discuss any roadblocks, concerns, status updates, deliverables, or other thoughts that impact the group.
OKR planning is done before every quarter, for the next 3 milestones. At that time, we must have a good idea of the work that needs to be done. The process is as follows:
EM+PM (with input from engineers and stakeholders): decide the scope we'll be working on.
EM: Tie the issues to KRs in Ally.io. This is mostly for reporting/communication of what we want to work on. Where possible, align with the larger organization's objectives.
PM: Once the scope of the quarter is clear, take the list of issues and assign one of the three milestones, along with workflow::planning breakdown (for large issues in need of breakdown) or workflow::ready for development.
Break down workflow::planning breakdown items and file smaller issues if needed, adding them to the same 3 milestones as reasonable. Raise exceptions as needed.
Issues with the Infradev label are typically Corrective Actions or other follow-up items that have strict SLO tracking. They will be scheduled through either of the above paths, by EM and/or PM polling these dashboards:
EM+PM: Poll the dashboards at least weekly. Triage and schedule these issues so that SLOs can be met. If needed, move the issue to the Gitaly tracker, or file a proxy issue there so that it shows up on work boards, and mark it as blocking. Drag issues to the top of the workflow::ready for development column.
EM+PM: If the issue is blocked or depends on ongoing work, add a Milestone that fits the SLO and the pending work (so that we don't forget it). Ensure that the blocking work gets scheduled first.
Engineers: Please prioritize picking up this work, and post frequent updates (at least weekly, even if there are no changes) in the original issue. Mark any blocking issues as such.
To maintain a constant flow of communication about planned changes, updates, and possible breaking changes, we use the #g_gitaly Slack channel. In the channel, we provide updates for all teams using the service, and also ask for feedback and insights about planned changes or improvements.
To support this proactive communication, there is also an individual counterpart on the consumer side who helps with research in the codebases and coordination with all the teams consuming Gitaly. The DRI on the consumer side is Igor Drozdov.
The Gitaly consumers are:
At the beginning of each release, the Gitaly EM will create a retrospective issue to collect discussion items during the release. In the first weekly Gitaly meeting after the 18th, that issue will be used to discuss what was brought up.
MR Type labels help us report what we're working on to industry analysts in a way that's consistent across the engineering department. The dashboard below shows the trend of MR Types over time and a list of merged MRs.
To complete team-specific onboarding, please file an issue here.
Maintainer rights are revoked. To remove the developer from the list of authorized approvers, remove them from the gl-gitaly GitLab.com group.