GitLab Values have guided us throughout the evolution of the company. Those values have been crucial in maintaining a positive and productive culture, helping us make decisions to make the company and the product better.
Our Engineering Principles are built on top of GitLab Values, and provide additional explanation of these in the context of the software engineering practice.
We always push ourselves to be iterative and make the minimal viable change that is on the direct path to achieving our goals.
For complex initiatives, we use the Architecture Design Workflow to define our goals and to describe our iteration plan for a given effort. How we are going to iterate on something may depend on our mid-to-long-term vision. Our goal is to make the iteration as efficient and as pragmatic as possible.
We also sometimes use an established "scooter heuristic": release the smallest product that our customers can use, and then iterate on it.
Image Credit: Henrik Kniberg from Crisp
Releasing minimal changes, as quickly as possible, is sometimes a good way to think about iteration, especially when intermediate steps provide value to our users and allow us to collect meaningful feedback. This has often been our default iteration approach.
This "scooter" iteration pattern, however, is not always the most efficient way to make progress on something. Sometimes we do need to have a tailored plan for efficient iteration, because if we want to build something more complex, our users may not be happy with a scooter or a paper airplane, and we will either not get any meaningful feedback or it will not be significant enough to be useful. In such cases, how we iterate should be a diligently planned strategy, designed to improve efficiency and to allow us to achieve our goals within a reasonable amount of time.
At GitLab we can use our Architecture Design Workflow to build a more pragmatic iteration plan aligned with our mid-to-long-term vision. This usually covers building a Minimal Viable Product too. In this case, however, what the MVP is going to look like depends on a thoughtful plan, described in writing after collecting feedback throughout the organization.
Below you can find a few patterns that can precede the "building the smallest thing in the next milestone" pattern, aimed at increasing iteration efficiency:
All these things attempt to improve our level of knowledge and reduce implementation risk. All of these steps are also iterations by themselves that can help us find hidden efficiencies to build better MVPs.
Efficiency is at the core of our engineering discipline at GitLab. This is also one of the most important company values.
How we work in Engineering can have a significant compounding effect over the mid-to-long term. If we constantly ask ourselves, "How can we implement this thing to become more efficient in the mid-to-long term?" while still thinking about pragmatic short-term goals, we can hope to see a lot of value being built over time, similar to how compound interest builds wealth. Overfocusing on short-term goals can lead to us missing out on this opportunity.
A short and incomplete list of things that can help us become more efficient:
Simplicity is a prerequisite for reliability. It is also at the core of our boring solutions value. We strive to choose a simple solution over an easy one, even when that means we need to put more work into building the simple solution.
Availability/Reliability, Quality, Security, and Performance are the pillars for building reliable software. Reliability is our contract with our customers that says you can count on us to deliver an available and dependable product. Everyone in the organization has a role to play.
Engineers, Product Managers, and Designers have the most direct influence over the reliability of the code through either planning, implementation, monitoring (e.g. Kibana, Sentry, Grafana and other GitLab.com monitoring tools), or prioritization of the work. Product and Engineering management monitors (e.g. Error Budgets) and measures the reliability of features and makes recommendations if necessary. Our focus on learning and development will also ensure that teams have the tools and training required to build reliable software. The Infrastructure, Application Security, Database and Quality teams are the Subject Matter Experts supporting product development teams.
Our velocity should be incremental in nature. It's derived from our MVC-based approach, which encourages "delivering the smallest possible solution that offers value to our users". This could be a small new feature, but also includes code improvements, bug fixes, etc.
To measure this, we use the Development Department Narrow MR Rate, which is also where we define the target. It is a goal for managers and not ICs. Historically, we have seen the Development Department Narrow MR Rate as high as 11.5.
For example, an MR rate of 11 translates to roughly one MR every 1½ business days with time for overhead. To attain this, Product Development Engineers are encouraged to:
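The arithmetic behind the MR rate example above can be sketched as follows. This is only an illustration under two assumptions that the text does not state: that the rate counts merge requests per engineer per month, and that a month has about 21 business days. The function names are hypothetical.

```python
# Assumption (not from the handbook text): roughly 21 business days per month.
BUSINESS_DAYS_PER_MONTH = 21


def business_days_per_mr(mr_rate, business_days=BUSINESS_DAYS_PER_MONTH):
    """Average spacing between merged MRs, assuming the rate is MRs per engineer per month."""
    return business_days / mr_rate


def overhead_days(mr_rate, days_spent_per_mr, business_days=BUSINESS_DAYS_PER_MONTH):
    """Business days left in the month for everything that is not MR work."""
    return business_days - mr_rate * days_spent_per_mr


if __name__ == "__main__":
    # An MR rate of 11 spaces MRs just under two business days apart...
    print(round(business_days_per_mr(11), 2))  # 1.91
    # ...so spending 1.5 days on each MR leaves 4.5 days a month for overhead.
    print(overhead_days(11, 1.5))  # 4.5
```

Under these assumptions, "one MR every 1½ business days with time for overhead" means about 16.5 days of MR work per month, with the remainder available for reviews, meetings, and other overhead.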
We optimize for shipping a high volume of user/customer value with each release. We do want to ship multiple major features in every monthly release of GitLab. However, we do not strive for predictability over velocity. As such, we eschew heavyweight processes like detailed story point estimation by the whole team in favor of lightweight measurements of throughput like the number of merge requests that were included or rough estimates by single team members.
There is variance in how much time an issue will take versus what you estimated. This variance causes unpredictability. If you want close to 100% predictability you have to take two measures:
Both measures reduce the overall velocity of shipping features. The way to prevent this is to accept that we don't want perfect predictability. Just like with our OKRs, which are so ambitious that we expect to reach about 70% of the goal, this is also fine for shipping planned features.
Note: This does not mean we place zero value on predictability. We just optimize for velocity first.
When changing an outdated part of our code (e.g. HAML views, jQuery modules), use discretion on whether to refactor or not. For long term maintainability, we are very interested in migrating old code to the consistent and preferred approach (e.g. Vue, GraphQL), but we're also interested in continuously shipping features that our users will love.
Aim to implement new modules or features with the preferred approach, but changing preexisting non-conforming parts is a gray area.
If the weight of refactoring and other constraints (such as time) risk threatening the availability of a feature, then strongly consider refactoring at another time. On the other hand, if the code in question has hurt availability or poses a threat to it, then strongly consider prioritizing refactoring. This is a balancing act, and if you're not sure where your change should go (or whether you should do some refactoring beforehand), reach out to another Engineer or Maintainer.
If it makes sense to refactor before implementing a new feature or a change, then please:
If it is decided not to refactor at this moment, then please:
Please see the Product Management section that governs how Product prioritizes work. It should also guide our technical decision making.
| 3* | Resilience, Reliability, Availability, and Performance |
| 7 | xMAU / ARR Drivers |
| 8 | All other items not covered above |
*indicates forced prioritization items with SLAs/SLOs
Any of the items with a "*" are considered issues driven by the attached SLO or SLA and are expected to be delivered within our stated policy. There are two items that fall into Forced Prioritization:
bug::vulnerability must be delivered according to the stated SLO
bug::availability with specific SLOs as well as
ci-decomposition::phase* that follow the stated
Any issues outside of these labels are to be prioritized using cross-functional prioritization. Auto-scheduling issues based on automation or triage policies are not forced prioritization. These issues can be renegotiated for milestone delivery and reassigned by the DRI.
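As a rough illustration of the rule above, the label check could be sketched like this. It is a minimal sketch, not GitLab's triage automation: the function names are hypothetical, the trailing `*` in `ci-decomposition::phase*` is assumed to be a wildcard, and the SLO qualifiers attached to each label are ignored.

```python
from fnmatch import fnmatchcase

# Label patterns named in the text above. Treating the trailing "*" as a
# wildcard is an assumption made for this sketch.
FORCED_PRIORITIZATION_PATTERNS = (
    "bug::vulnerability",
    "bug::availability",
    "ci-decomposition::phase*",
)


def is_forced_prioritization(labels):
    """Return True if any label on the issue matches a forced prioritization pattern."""
    return any(
        fnmatchcase(label, pattern)
        for label in labels
        for pattern in FORCED_PRIORITIZATION_PATTERNS
    )


def prioritization_track(labels):
    """Name the track an issue falls into, based only on its labels."""
    if is_forced_prioritization(labels):
        return "forced prioritization"
    return "cross-functional prioritization"


if __name__ == "__main__":
    # The labels other than the three patterns above are hypothetical examples.
    print(prioritization_track(["bug::vulnerability", "severity::1"]))
    print(prioritization_track(["ci-decomposition::phase2"]))
    print(prioritization_track(["type::feature"]))
```

Because the check looks only at labels, an issue that was auto-scheduled by automation or a triage policy but carries none of these labels still lands in cross-functional prioritization, which matches the distinction drawn above.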
Despite the high priority of velocity to our project and our company, there is one set of things we must prioritize over it: GitLab availability & security. Neither we, nor our customers, can run an Enterprise-grade service if we are willing to risk users' productivity and data.
Our hundreds of Engineers collectively make thousands of independent decisions each day that can impact GitLab.com and our users and customers there. They all need to keep availability and security in mind as we endeavor to be the most productive engineering organization in the world. We can only move as fast as GitLab.com is available and secured. Availability of self-managed GitLab instances is also extremely important to our success, and this needs to happen in partnership with our customers' admins (whereas we are the admins for GitLab.com).
We prioritize security more highly by attaching strict SLAs to the priority labels on security issues. This reflects a security-first mindset, as these issues take precedence in a given timeframe.
All team members are expected to follow documented processes. We develop and document processes (for example: Feature flag usage, Code Review Guidelines) through constant iteration and refinement. We find opportunities for improvement through analyzing metrics to identify trends, hosting retrospectives (e.g. Group Retrospectives, Iteration Retrospectives), performing Root Cause Analyses, and receiving feedback from team members. Team members are encouraged to identify opportunities to improve our processes and propose solutions. An example of this could be an MR or an issue describing these opportunities.
Following established processes ensures that we learn from our mistakes and efficiently deliver high-quality, highly performant, and secure software. We prefer to fail fast and learn quickly. Team members who are not software developers benefit from working more efficiently to deliver their results as well. Regardless of your discipline, processes are the guard rails that ensure we produce desirable and predictable results.
Everyone can contribute by proposing new processes and improving upon existing processes.
It is important to remember that quality is everyone's responsibility. Everything you merge to master should be production ready. Familiarize yourself with the definition of done.
Our releases page describes our two main release channels:
As the first of these is a monthly release, it's tempting to rush to get something into a monthly self-managed release. However, this is an anti-pattern. Most issues don't have strict due dates. Those that do are exceptions, and should be treated as such.
Due date pressure logically leads to a few outcomes:
Only the last two outcomes are acceptable as a general rule. Missing a 'due date' in the form of an assigned milestone is often OK as we put velocity above predictability, and missing the monthly self-managed release does not prevent code from reaching GitLab.com.
For these reasons, and others, we intentionally do not define a specific date for code to be merged in order to reach a self-managed monthly release. The earlier it is merged, the better. This also means that:
If it is essential that a merge request make it into a particular release, this must be communicated well in advance to the engineer and any reviewers, to ensure they're able to make that commitment. If a severe bug needs to be fixed with short notice, it is better to revert the change that introduced it than to rush, or even to delay the release until the fix is ready.
In general, there is no need to change any behavior close to the self-managed release.
We dogfood everything. Based on our product principles, it is the Engineering division's responsibility to dogfood features or do the required discovery work to provide feedback to Product. It is Product's responsibility to prioritize improvements or rebuild functionality in GitLab.
An easy antipattern to fall into is to resolve your problem outside of what the product offers. Dogfooding is not:
Follow the dogfooding process described in the Product Handbook when considering building a tool outside of GitLab.
We need to maintain code quality and standards. It's very important that you are familiar with the Development Guides in general, and the ones that relate to your group in particular:
Please remember that the only way to make code flexible is to make it as simple as possible:
A lot of programmers make the mistake of thinking the way you make code flexible is by predicting as many future uses as possible, but this paradoxically leads to *less* flexible code.— Nearby Cats (@BaseCase) January 16, 2019
The only way to achieve flexibility is to make things as simple and easy to change as you can.
Part of our engineering culture is to keep shipping so users and customers see significant new value added to GitLab.com or their self-managed instance. To support rapid development, we pragmatically choose the right technology. As each view is unique, we should equally respect our HAML and Vue codebases and make an educated choice per view as to which framework will enable the most consistency and maintainability.
When building complex applications, there are many factors to consider, such as the fully planned feature, to avoid situations where we build an MVC in HAML only to later need to rewrite it in Vue due to growing complexity.
To promote visual consistency and an accessible UI, we should always aim to use simple and reusable UI components provided by the GitLab UI component library in both Vue and HAML views. We implement GitLab UI components based on our Pajamas design system. Currently these are mostly in Vue; however, we provide adapters that allow us to use a few simple components in HAML as well.
If a GitLab UI component is not available in HAML due to its intrinsic complexity, this is a sign that you should implement your feature using Vue instead.
A complex component is a type of component that cannot be used easily in our HAML files. This might be due to built-in state management, CSS, or dynamic behavior that rapidly becomes a maintainability burden inside HAML. An example of such a component is our Table component.