A world-class development team of software engineers and managers who make our customers happy when using our product(s). Our products should offer broad, rich features, high availability, high quality, fast performance, trustworthy security, and reliable operation.
The development department strives to deliver MRs fast. MR delivery is a reflection of:
The department also focuses on career development and process to make this a preferred destination for high performing software engineers.
We use data to make decisions. If data doesn't exist we use anecdotal information. If anecdotal information isn't available we use first principles.
FY23 begins much like FY22, with great ambition. During FY22, the Development department focused on reliability/security, team, SUS, and product. In reliability and security, we added Engineering Allocations and FCL processes to support the improvements. We have reduced the number of past due issues in both categories and continue to focus here to reduce them further. We grew the team by a small amount during FY22 while keeping up with attrition, and we have set ourselves up for further growth in FY23. We have begun to make further investments in SUS improvements via component migrations. Lastly, we have improved the product using normal product management and development processes. Customers want to see these improvements as well as improvements in reliability and security.
As set up at the end of FY22, FY23 will focus on growth, reliability/security, efficiency, SUS improvements, support of the top 12 cross-functional initiatives, and product.
This year the team will grow more than in FY22. We have hiring goals that look to increase the department by over 20%. We are expected to make hires such that we meet our hiring goals without going over. We have begun to focus here with additional support from our recruiting team and closer tracking of the pipelines. Each sub-department will set goals for their teams to hire based on planned headcount increases. This will increase our investment in existing and new areas.
As in late FY22, we will continue to focus on reliability and security challenges in scaling our product. We have focused on past due issues in both of these areas. Our current goal is to reduce these to 0 where possible, meaning we are keeping up with security and reliability challenges. For Q1, our goal will be to reduce past due infradev issues to below 2 and past due security issues to below 50. These targets will continue to be lowered in later quarters.
As part of FY23, we want to see our team grow and continue to improve our efficiency. We have seen a recent dip in our MR Rate for the organization. We will work on improving this and training our new hires on what it means to be iterative. Our goal is to improve from our current levels back to 8 MRs per development team member.
User experience is a continued focus area for FY23, building on the accomplishments of FY22. We support this effort both in product development and in our architecture. This includes continued conversion of Pajamas components in order to keep improving the performance experienced by users. We have moved to a cadence of completing X issues per quarter; for Q1 our goal is 300, and this will continue throughout the year. In parallel we will work on a SUS issues burndown, starting with the goal of resolving all S1s in Q1. We will continue this work in subsequent quarters, reducing issues across all severities.
The development team is one of the larger departments in the company, and in our role in product development we impact a large number of cross-organizational efforts. We are involved in 8 of the top 12 cross-functional initiatives for FY23. We will support these initiatives both through their direct OKR efforts and through normal product process support. Our goal is to balance these initiatives alongside normal prioritization so that they are all successful in the timeframes requested.
Lastly, we will continue our strong partnership with Product to make GitLab the best, most complete DevSecOps platform on the planet. While we continue adding features to the product we must also work to identify technical debt and bring it to the prioritization discussion. We expect that Engineering managers are already addressing technical debt that is group specific with their Product Manager.
As part of the feedback from our CultureAmp survey, we heard that we need to focus on more geo-diverse behaviors. As a company, we need to offset the focus on particular theatres that can happen when some geographies are more highly represented in the company. We have started an OKR initiative to address this and will consider it in the coming quarters of the year.
The development team is responsible for developing products in the following categories:
The following people are permanent members of the Development Department:
The following members of other functional teams are our stable counterparts:
Person | Role |
---|---|
This is the breakdown of our department by section and by stage.
This is the stack-up of our engineers, by level.
Aligned with the company-wide promotion cadence, Development utilizes a quarterly process to collect, validate, approve, and review all promotion proposals prior to them being added via the company-wide process. The goal of this quarterly promotion projection and review is to:
Development adheres to the company-wide quarterly timeline outlined here as our SSOT.
The Development Department has an additional formal step built into its promotion process beyond what the company currently adheres to: a peer review. Ahead of the commencement of the Calibration stage of our process, all promotion documents should be peer reviewed by a Senior Manager or Director. The peer review must be completed before the scheduled Calibration session.
FY'23 Calibration sessions:
Calibration session attendees are the following team members: Senior Managers, Directors, Sr. Directors, VP, and Development's aligned People Business Partner. Leaders are welcome to conduct Calibration sessions prior to the scheduled sessions above with their sub-departments as well (though this is not a requirement).
We use a peer review process to collaborate on proposed promotions.
Company-wide guidelines on the Talent Assessment can be found here. The company timeline for the process remains the SSOT; the guidelines below are meant to:
We will be reviewing outliers for Performance/Growth and anyone identified as Key Talent for calibration this cycle. Formal calibration will take place at the Senior Manager, Director, and VP levels. The thought process around who qualifies as an "outlier" for Performance and Growth is outlined here.
Calibration and assessment are two different steps in the process. The assessment phase is the process of assessing each team member to determine their Performance and Growth, whereas the calibration phase (occurring after initial assessments are made) is when management calibrates across the stage/org/level/etc. to discuss and align on assessments. Every team member should be assessed and have supporting points to justify those assessments - but to make calibration sessions more focused and scalable, we focus on outliers.
For Development specifically, we will calibrate anyone in Boxes 1, 2, 3, 7, 8, and 9, aligned with company-wide guidelines. It's important to note that while these will be our focus areas for calibration sessions, managers should feel free to raise any team member's assessment for discussion if they have any questions or concerns. Calibrating outliers is not a limitation, but rather a structural adjustment to ensure this process is scalable and focused.
Note: If individual teams want to calibrate every individual, they have the ability to do this/organize/structure separately, but the due dates remain in place across the department to ensure we have enough time to review and calibrate at the various levels in the company.
For team members who have assumed an Acting or an Interim role, we will assess team members aligned with their permanent positions (i.e. not the Acting or Interim position).
Because the Talent Assessment impacts compensation and Acting/Interim periods are not permanent, we would not want a team member's compensation impacted by a temporary position if they do not end up moving into the Acting/Interim role permanently.
In addition to the calibration session pre work on the Talent Assessment page, we ask that you review the following guidelines:
Performance and Growth assessments need to be completed and added to the session agenda doc at least 3 business days before the live calibration session. Each individual should have at least 3 supporting points for the assessment under each pillar (Performance and Growth) added to the agenda doc to help support the “why” behind the assessment.
Key Talent assessments need to be completed and added to the session agenda doc at least 3 business days before the live calibration session. If an individual is indicated as key talent, an explanation should be added to indicate how this individual qualifies against our key talent definition. Reminder that the bar for key talent is set high, and that key talent makes up roughly ~10% of the entire population. In Development, we will be assessing Key Talent from the Senior Manager level and above, aligned with guidelines, meaning that while everyone in the organization is eligible to be identified as Key Talent, Senior Managers+ will be assessing and making these initial nominations. The rationale behind this decision is that it is important to have a holistic view of all team members when determining who meets the key talent criteria, which is why we require a certain scope when assessing key talent in the organization.
Every session attendee should review the Performance/Growth assessments and Key Talent overviews for outliers asynchronously ahead of the session to be prepared for live discussion/calibration.
Important: Be sure to reference our Resources for different templates and material to help with the assessment and calibration processes. In particular, we ask that all managers leverage the Talent Assessment Calibration Spreadsheet template provided to ensure a consistent format for Performance/Growth assessment calibration and finalizing assessments so we have a format that is compatible with uploading directly to BambooHR to minimize error. Note that this sheet also includes a couple of columns for Key Talent assessment and nominations. Given that not all levels of management assess Key Talent, this will not apply for all people managers. If Key Talent is not assessed at your level, feel free to simply remove these columns.
We will calibrate by level within the Development department as a whole to ensure we have consistency and visibility across sub-departments.
Session Number | Attendees | Calibration Level Focus | Session Date | Timezone Alignment | Duration |
---|---|---|---|---|---|
Session 1 | Senior Managers+ (people managers only) + PBP | Intermediate level outliers | 2021-11-10 | EMEA/Americas | 1.5 hours |
Session 2 | Senior Managers+ (people managers only) + PBP | Intermediate level outliers | 2021-11-10 | APAC/Americas | 1 hour |
Session 3 | Senior Managers+ (people managers only) + PBP | Senior level outliers | 2021-11-12 | EMEA/Americas | 2 hours |
Session 4 | Senior Managers+ (people managers only) + PBP | Senior level outliers | 2021-11-15 | APAC/Americas | 2 hours |
Session 5 | Senior Managers+ (people managers only) + PBP | Senior level outliers | 2021-11-18 | EMEA/Americas | 1 hour |
Session 6 | Senior Managers+ (people managers only) + PBP | Staff/EM/Principal/Distinguished level outliers | 2021-11-18 | EMEA/Americas | 1.5 hours |
Session 7 | Senior Managers+ (people managers only) + PBP | Staff/EM/Principal/Distinguished level outliers | 2021-11-18 | APAC/Americas | 1.5 hours |
Session 8 | Director+ (people managers only) + PBP | Calibrate Senior Manager level outliers | 2021-11-19 | EMEA/Americas | 1 hour |
Session 9 | VP and PBP | Calibrate Director/Senior Director and Engineering Fellow level outliers | 2021-11-29 | Americas | 1.5 hours |
Below are level-specific calibration due dates for the Development Department.
Welcome to GitLab! We are excited for you to join us. Here are some curated resources to get you started:
To support GitLab's long-term product health and stability, teams are asked to plan their milestones with an appropriate mix of `type::feature`, `type::maintenance`, and `type::bug` work. Ratios may differ between teams, as well as within the same team over time. Factors that influence what ratio is appropriate for a given team include the age of the team, the area of the product they are working in, and the evolving needs of GitLab the business and GitLab the product. If your team does not have enough historical data to know its ratios, or you are unsure what an appropriate ratio might be, use a guideline of 60% feature, 30% maintenance, and 10% bugs.
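As a purely illustrative sketch (the weights below are hypothetical, not real data), a team could sanity-check a planned milestone against the 60/30/10 guideline like this:

```ruby
# Hypothetical sanity check of a milestone plan against a target work-type ratio.
target = { feature: 0.6, maintenance: 0.3, bug: 0.1 }

# Summed issue weights per type for the planned milestone (made-up numbers).
planned_weights = { feature: 26, maintenance: 10, bug: 4 }
total = planned_weights.values.sum.to_f

planned_weights.each do |type, weight|
  actual = weight / total
  puts format("type::%-12s %3.0f%% planned (target %3.0f%%)", type, actual * 100, target[type] * 100)
end
```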
For more details on these three work types, please see the section on work type classification.
Our backlog should be prioritized on an ongoing basis. Prioritization will be done via quad planning (collaboration between Product, Development, Quality, and UX), with a DRI responsible for the decisions for each work type:

- `type::feature` issues
- `type::maintenance` issues
- `type::bug` issues

The DRIs of these three core areas will work collaboratively to ensure the overall prioritization of the backlog is in alignment with section direction or any other necessary product and business needs. If a team is not assigned a Product Designer, then no UX counterpart is needed for prioritization purposes.
It is recommended that teams use a Cross-functional Prioritization Board like this example, which provides columns for `type::feature`, `type::maintenance`, and `type::bug` issues. Issues may be reordered by drag and drop.
Note: Each team is encouraged to create their own board as the example board above belongs to the Threat Insights team. Please do not modify this board unless you are a member of the Threat Insights team.
Drag and drop reordering is also supported in the issues list by sorting by `Manual` (example). You may find this view more effective when focusing on a specific type, or when working against a large backlog. When you adjust the order of issues in the Manual list view, it is automatically reflected in the board view, so the order is consistent between both views.
Notes:

- `UX` issues that aren't relevant to implementation issues.

The Product Manager is responsible for planning each milestone. Product Managers are also responsible for ensuring that their team's target ratios are maintained over time.
It is recommended to use your team's existing Cross-functional Prioritization board for milestone planning.
Add the milestone (example) to review the milestone plan. The board will show the number of issues and cumulative issue weights for `type::feature`, `type::maintenance`, and `type::bug` issues.
The primary goals of this review exercise are for teams to:

- Keep `undefined` MRs under 5% of all MRs merged for a given calendar month, where `undefined` MRs refers to any MRs without a `type::` label

These reviews will use cross-functional dashboards embedded on each team's handbook page that serve as the SSOT when reviewing `type::` labels of merged MRs.
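As a rough sketch of how a team could spot-check this number between reviews (the dashboards above remain the SSOT; the group ID, token, and date range are placeholders you must supply, and pagination plus exact merged-date filtering are omitted for brevity), the GitLab REST API can list merged MRs so those without a `type::` label can be counted:

```ruby
# Rough sketch: count merged MRs in a group that have no `type::` label.
require "net/http"
require "json"
require "uri"

token    = ENV.fetch("GITLAB_TOKEN")     # placeholder personal access token
group_id = ENV.fetch("GITLAB_GROUP_ID")  # placeholder group ID

uri = URI("https://gitlab.com/api/v4/groups/#{group_id}/merge_requests")
uri.query = URI.encode_www_form(
  state: "merged",
  created_after: "2023-01-01T00:00:00Z",  # approximation: filters by creation date
  created_before: "2023-02-01T00:00:00Z",
  per_page: 100                            # pagination omitted for brevity
)

request = Net::HTTP::Get.new(uri)
request["PRIVATE-TOKEN"] = token
response = Net::HTTP.start(uri.host, uri.port, use_ssl: true) { |http| http.request(request) }

mrs = JSON.parse(response.body)
undefined = mrs.reject { |mr| mr["labels"].any? { |label| label.start_with?("type::") } }
puts "#{undefined.size} of #{mrs.size} merged MRs are missing a type:: label"
```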
The cadence and attendees for reviews vary at each level.
Note that the review collaboration can be done in a way that's most effective for the team, either synchronously (e.g. scheduled recurring call) or asynchronously (e.g. issues), as long as the previous reviews are well documented (with historical tracking).
Who participates?
Questions to answer
(e.g. adding the missing `type::` label, or leveraging the `/copy_metadata` command on merged MRs; see the example below.)
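As a small illustration, either of the quick actions below, posted as a comment on the merged MR, resolves a missing type label (the `!1234` MR reference is hypothetical; `/copy_metadata` copies the labels and milestone from the referenced MR):

```
/label ~"type::maintenance"
/copy_metadata !1234
```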
Who participates?
Questions to answer
Who participates?
Questions to answer
Issues that impact code in another team's product stage should be approached collaboratively with the relevant Product and Engineering managers prior to work commencing, and reviewed by the engineers responsible for that stage.
We do this to ensure that the team responsible for that area of the code base is aware of the impact of any changes being made and can influence architecture, maintainability, and approach in a way that meets their stage's roadmap.
At times when cross-functional or cross-departmental architectural collaboration is needed, the GitLab Architecture Evolution Workflow should be followed.
At GitLab we value freedom and responsibility over rigidity. However, there are some technical decisions that will require approval before moving forward. Those scenarios are outlined in our required approvals section.
Development's headcount planning follows the Engineering headcount planning and long term profitability targets. Development headcount is a percentage of overall engineering headcount. For FY20, the headcount size is 271 or ~58% of overall engineering headcount.
We follow normal span of control both for our managers and directors of 4 to 10. Our sub-departments and teams match as closely as we can to the Product Hierarchy to best map 1:1 to Product Managers.
While we try to work as much as possible async, the Development department leadership does meet synchronously on a weekly cadence. This meeting coordinates initiatives, communicates relevant information, discusses more difficult decisions, and provides feedback on how we are progressing as an organization. As part of this meeting, we discuss our culture of reliability monthly. This agenda item spawned from an initiative we took up in August 2021: we want to keep the organization healthy by thinking about reliability in every part of our work.
This section applies to those who report to the VP of Development
The following is a non-exhaustive list of daily duties for engineering directors, though some items are only applicable at certain times.
The schedule below covers the period from 2021-12-20 to 2022-01-03. To reach the emergency contacts, please send Slack as well as text (when possible) messages for timely responses. The phone numbers can be found in Slack profiles.
Name | Time Zone | Available Dates |
---|---|---|
Chun Du | GMT-8 (Portland, OR) | December 20 - January 3 |
Craig Gomes | GMT-8 (Portland, OR) | December 20 - January 3 |
Sam Goldstein | GMT-8 (Portland, OR) | December 20 - January 3 |
Wayne Haber | GMT-5 (NYC) | January 3 |
Todd Stadelhofer | GMT-8 (Portland, OR) | December 27 - January 3 |
Christopher Lefelhocz | GMT-6 (San Antonio, TX) | 2021-12-20 - 2021-12-22, 2021-12-25 - 2021-12-29, 2022-01-02 |
Darva Satcher | GMT-5 | December 24,27,31 |
Phil Calder | GMT+13 (Wellington, New Zealand) | December 20 - December 23 |
Tim Zallmann | GMT+1 (Vienna, Austria) | December 20 - December 23 |
In general, OKRs flow top-down and align to the company and upper level organization goals.
For managers and directors, please refer to a good walk-through example of OKR format for developing team OKRs. Consider stubbing out OKRs early in the last month of the current quarter, and get the OKRs in shape (e.g. fleshing out details and making them SMART) no later than the end of the current quarter.
It is recommended to assess progress weekly.
Below are tips for developing individual OKRs:
Engineering Allocations require us to track goals with more diligence and thought. We need confidence that we're making correct decisions and executing well on these initiatives. As such, you will see us reviewing these more closely than other initiatives. We will meet on a cadence to review these initiatives and request additional reporting to support the process. Possible requests for additional data:
We will hold Engineering Allocation Checkpoints on a cadence. The recommended cadence is weekly.
We track Engineering Allocation roadmaps. To use this effectively, roadmaps must have correct dates on their epics and weights assigned to their issues. If a team does not normally use weights, then assign each issue a weight of 1 (all issues are equal).

Each team needs to demonstrate how their allocation is being used. This is done to verify we are not over/under investing for a given initiative. This can be done via assignment (people assigned to work) and/or issues assigned. We will track issues and MRs and see, as a percentage, how that compares to the overall team's work; a sketch of this calculation is shown below.
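As an illustrative sketch (all numbers are hypothetical; real data would come from the GitLab API or a board export), the allocation share can be computed from issue weights, with un-weighted issues counted as weight 1 per the guidance above:

```ruby
# Hypothetical check of how much of a team's milestone went to an allocation initiative.
issues = [
  { weight: 3,   allocation: true  },
  { weight: nil, allocation: true  },  # un-weighted issues default to weight 1
  { weight: 5,   allocation: false },
  { weight: 2,   allocation: false },
  { weight: nil, allocation: false }
]

weight_of = ->(issue) { issue[:weight] || 1 }

allocation_weight = issues.select { |i| i[:allocation] }.sum(&weight_of)
total_weight      = issues.sum(&weight_of)

puts format("Allocation share: %.0f%% of %d total weight", 100.0 * allocation_weight / total_weight, total_weight)
```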
The GitLab application is built on top of many shared services and components, such as the PostgreSQL database, Redis, Sidekiq, Prometheus, and so on. These services are tightly woven into each feature's Rails code base. Very often, there is a need to identify the DRI when demand arises, be it a feature request, incident escalation, technical debt, or bug fixes. Below is a guide to help people quickly locate the best parties who may assist on the subject matter.
There are a few ownership models to choose from, so that flexibility is maximized and each shared service and component can use what works best for it.
The shared services and components below are extracted from the GitLab product documentation.
Service or Component | Sub-Component | Ownership Model | DRI (Centralized Only) | Ownership Group (Centralized Only) | Additional Notes |
---|---|---|---|---|---|
Alertmanager | Centralized with Specific Team | @mendeni | Distribution | Distribution team is responsible for packaging and upgrading versions. Functional issues can be directed to the vendor. | |
Certmanager | Centralized with Specific Team | @mendeni | Distribution | Distribution team is responsible for packaging and upgrading versions. Functional issues can be directed to the vendor. | |
Consul | |||||
Container Registry | Centralized with Specific Team | @dcroft | Package | ||
Email - Inbound | |||||
Email - Outbound | |||||
GitLab K8S Agent | Centralized with Specific Team | @nicholasklick | Configure | ||
GitLab Pages | Centralized with Specific Team | David O'Regan @oregand | Editor | ||
GitLab Rails | Decentralized | DRI for each controller is determined by the feature category specified in the class. app/controllers and ee/app/controllers | |||
GitLab Shell | Centralized with Specific Team | @sean_carroll | Create:Source Code | Reference | |
HAproxy | Centralized with Specific Team | @sloyd | Infrastructure | ||
Jaeger | Centralized with Specific Team | @sloyd | Infrastructure:Observability | Observability team made the initial implementation/deployment. | |
LFS | Centralized with Specific Team | @sean_carroll | Create:Source Code | ||
Logrotate | Centralized with Specific Team | @mendeni | Distribution | Distribution team is responsible for packaging and upgrading versions. Functional issues can be directed to the vendor. | |
Mattermost | Centralized with Specific Team | @mendeni | Distribution | Distribution team is responsible for packaging and upgrading versions. Functional issues can be directed to the vendor. | |
MinIO | Decentralized | Some issues can be broken down into group-specific issues. Some issues may need more work identifying user or developer impact in order to find a DRI. | |||
NGINX | Centralized with Specific Team | @mendeni | Distribution | ||
Object Storage | Decentralized | Some issues can be broken down into group-specific issues. Some issues may need more work identifying user or developer impact in order to find a DRI. | |||
Patroni | General except Geo secondary clusters | Centralized with Specific Team | @mendeni | Distribution | |
Geo secondary standby clusters | Centralized with Specific Team | @nhxnguyen | Geo | ||
PgBouncer | Centralized with Specific Team | @mendeni | Distribution | ||
PostgreSQL | PostgreSQL Framework and Tooling | Centralized with Specific Team | @alexives | Database | Specific to the development portion of PostgreSQL, such as the fundamental architecture, testing utilities, and other productivity tooling |
GitLab Product Features | Decentralized | Examples like feature specific schema changes and/or performance tuning, etc. | |||
Prometheus | Decentralized | Each group maintains their own metrics. | |||
Puma | Centralized with Specific Team | @changzhengliu | Memory | ||
Redis | Decentralized | DRI is similar to Sidekiq which is determined by the feature category specified in the class. app/workers and ee/app/workers | |||
Sentry | Decentralized | DRI is similar to GitLab Rails which is determined by the feature category specified in the class. app/controllers and ee/app/controllers | |||
Sidekiq | Decentralized | DRI for each worker is determined by the feature category specified in the class. app/workers and ee/app/workers | |||
Workhorse | Centralized with Specific Team | @sean_carroll | Create:Source Code |
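For the decentralized rows above, the owning group is derived from the feature category declared in the class itself. Below is a minimal sketch of what such declarations look like; the class names and categories are illustrative only, and the real declarations live in app/controllers, ee/app/controllers, app/workers, and ee/app/workers within the GitLab Rails code base.

```ruby
# Illustrative only: how a feature category declaration maps a class to an owning group.

# A Sidekiq worker (app/workers/...) declares its feature category,
# which identifies the group that acts as DRI for issues with this worker.
class ExampleCleanupWorker
  include ApplicationWorker

  feature_category :source_code_management
end

# A Rails controller (app/controllers/...) does the same for request handling.
class ExampleProjectsController < ApplicationController
  feature_category :team_planning
end
```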
For a list of resources and information on our GitLab Learn channel for Development, consult this page.
In late June 2019, we moved from a monthly release cadence to a more continuous delivery model. This changed issues from being concentrated around the deployment to arriving in a more constant flow. With the adoption of continuous delivery, there is an organizational mismatch in cadence between changes that are regularly introduced in the environment and the monthly development cadence.
To reduce this, Infrastructure and Quality will engage Development via SaaS Infrastructure Weekly and Performance refinement, which surface critical issues from infrastructure and quality to be addressed in development.
Refinement will happen on a weekly basis and involve a member of infrastructure, quality, product management, and development.
Execution of a global prioritization can take many forms. This work is done with both Product and Engineering leadership engaged; either party can initiate a proposal in this area. The options available and when to use them are the following:
Because our teams are working in separate groups within a single application, there is a high potential for our changes to impact other groups or the application as a whole. We have to be cautious not to inadvertently impact not only overall system quality, but also availability, reliability, performance, and security.
An example would be a change to user authentication or login, which might impact seemingly unrelated services, such as project management or viewing an issue.
Far-reaching work is work that has wide-ranging, diffuse implications, and includes changes to areas which will:
If your group, product area, feature, or merge request fits within one of the descriptions above, you must seek to understand your impact and how to reduce it. When releasing far-reaching work, use a rollout plan. You might additionally need to consider creating a one-off process for those types of changes, such as:
Some areas have already been identified that meet the definition above, and may consider altered approaches in their work:
Area | Reason | Special workflows (if any) |
---|---|---|
Database migrations, tooling, complex queries, metrics | impact to entire application. The database is a critical component where any severe degradation or outage leads to an S1 incident. | Documentation |
Sidekiq changes (adding or removing workers, renaming queues, changing arguments, changing profile of work required) | impact to multiple services. Sidekiq shards run groups of workers based on their profile of work, e.g. memory-bound. If a worker fails poorly, it has the potential to halt all work on that shard. | Documentation |
Redis changes | impact to multiple services. Redis instances are responsible for sets of data that are not grouped by feature category. If one set of data is misconfigured, that Redis instance may fail. | |
Protected Branches, CODEOWNERS, MR Approvals, Gitaly interface | high percentage of traffic share | |
Package product areas | high percentage of traffic share | |
Gitaly product areas | high percentage of traffic share | |
Pipeline Execution product areas | high percentage of traffic share | Documentation |
Authentication and Authorization product areas | touch multiple areas of the application | Documentation |
Workspace product areas | touch multiple areas of the application | |
Compliance product areas | potentially have legal, security, or compliance consequences | |
Specific fulfillment product areas | potentially impact revenue | |
Runtime language updates | impacts to multiple services | Ruby Upgrade Guidelines |
Application framework updates | impacts to multiple services | Rails Upgrade Guidelines |
ClickHouse usage by Monitor:Observability group | | |
Note: books in this section can be expensed.
Interested in reading this as part of a group? We occasionally self-organize book clubs around these books and those listed on our Leadership page.