The following page may contain information related to upcoming products, features and functionality. It is important to note that the information presented is for informational purposes only, so please do not rely on the information for purchasing or planning purposes. Just like with all projects, the items mentioned on the page are subject to change or delay, and the development, release, and timing of any products, features or functionality remain at the sole discretion of GitLab Inc.
Last updated: 2023-07-21
|Cell||Direction Page||A Cell is a set of infrastructure components that contains multiple top-level groups that belong to different Organizations.||Not Applicable|
|Organization||Direction Page||An Organization is the umbrella for one or multiple top-level groups.||Planned|
|Groups & Projects||Direction Page||Groups represent collections of users or projects.||Complete|
|User Profile||Direction Page||Users represent individuals using GitLab.||Not Applicable|
The Tenant Scale direction page belongs to the Data Stores stage within the Enablement section, and is maintained by Christina Lohr. The Tenant Scale Engineering team and stable counterparts can be found on the Engineering team page.
This strategy is a work in progress, and everyone can contribute. Please comment and contribute in the linked issues and epics. Sharing your feedback directly on GitLab.com is the best way to contribute to our strategy and vision.
If you would like support from the Tenant Scale team, please see the team's page detailing how to contact the Tenant Scale team.
GitLab.com, our SaaS offering, is growing rapidly. This growth requires that the underlying infrastructure components are able to scale to accommodate additional users. The GitLab.com production architecture highlights the different components and the reference architectures provide an overview for self-managed customers.
Scaling GitLab requires different strategies for the individual components. For example, web application nodes are stateless and can be scaled relatively easily by creating more individual servers. Stateful components are much harder to scale. As a single solution for the entire DevOps lifecycle, GitLab depends on a single data store that serves as the single source of truth for its data. For GitLab, this data store is mostly a single PostgreSQL database. Over time, GitLab has added additional databases for specific features, such as Gitaly Cluster, Geo and the Container Registry. Adding new data stores requires approval from the CEO and all engineering fellows to avoid unnecessary proliferation of data stores.
GitLab's database on GitLab.com is provisioned as a single logical database with a primary server and several physical read-only replicas. Given the continuing growth of GitLab.com, this PostgreSQL database needs to handle more and more transactions per second. Reading data can be accelerated by provisioning additional replicas. Writing new data, however, can't be easily scaled in the same way. There can only be one primary server and all writes have to go through it. In order to address this problem there are several possible solutions:
GitLab.com is approaching a point where buying bigger servers is no longer easily possible. For this reason, the Database Scalability Working Group was founded to define and implement strategies to scale GitLab's database.
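The read/write asymmetry described above can be illustrated with a small sketch. This is a toy model, not GitLab's load-balancing code: the class name, server names, and the "starts with SELECT" rule are all assumptions made for illustration. It shows why adding replicas spreads reads while every write still lands on the one primary.

```python
import itertools


class ReplicatedDatabase:
    """Toy model of one primary plus read-only replicas (hypothetical names)."""

    def __init__(self, primary, replicas):
        self.primary = primary
        self.replicas = list(replicas)
        # Round-robin over replicas; fall back to the primary if there are none.
        self._read_cycle = itertools.cycle(self.replicas or [primary])

    def route(self, statement):
        """Return the name of the server that should execute the statement."""
        # Simplification: treat anything starting with SELECT as a read.
        is_read = statement.lstrip().upper().startswith("SELECT")
        if is_read:
            return next(self._read_cycle)
        # Every write goes through the single primary, however many replicas exist.
        return self.primary


db = ReplicatedDatabase("primary", ["replica-1", "replica-2"])
print(db.route("SELECT * FROM projects"))       # served by a replica
print(db.route("INSERT INTO projects VALUES (1)"))  # served by the primary
```

Adding a third replica changes only the read path; the write path is unchanged, which is the bottleneck the strategies on this page aim to address.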
The Tenant Scale group is concerned with delivering application changes that allow GitLab and GitLab.com to scale to millions of users and implement the strategies defined in the Database Scalability Working Group.
In the future we expect that GitLab.com:
These outcomes are also defined as the exit criteria of the Database Scalability Working Group.
Following the successful rollout of a separate database for ci for GitLab.com, we are now working on bringing the same solution to self-managed installations.
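To make the idea of decomposition concrete, here is a minimal sketch of routing tables to one of two logical databases. It rests on stated assumptions: the `ci_` prefix rule, the function names, and the example table names are illustrative only and do not describe how GitLab actually classifies tables. The sketch also shows one practical consequence of splitting a database: a single query can no longer span tables that live in different databases.

```python
def database_for(table: str) -> str:
    """Pick the logical database for a table.

    Assumption for illustration: CI tables are recognisable by a "ci_" prefix.
    """
    return "ci" if table.startswith("ci_") else "main"


def single_database_for(tables):
    """Return the one database a query touches, or raise if it spans several."""
    tables = list(tables)
    if not tables:
        raise ValueError("a query must reference at least one table")
    databases = {database_for(table) for table in tables}
    if len(databases) > 1:
        raise ValueError(
            f"query spans databases {sorted(databases)}; split it into separate queries"
        )
    return databases.pop()


print(single_database_for(["ci_builds", "ci_pipelines"]))  # both on the ci database
print(single_database_for(["projects"]))                   # on the main database
```

A query over `["projects", "ci_builds"]` would raise here, which mirrors the kind of application change that decomposition requires before the split can be rolled out.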
Decomposition is only a first step to unlocking further scalability for GitLab. Decomposition is a vertical scaling strategy and it can only deliver a limited amount of scalability. In order to support further growth, GitLab needs a long-term horizontal scalability strategy. A Cells architecture allows for horizontal scalability and has other possible benefits, such as improved service availability. This architecture creates many mostly isolated GitLab instances, called Cells, that include all required services (database, web, Redis, Gitaly, Runners, Sidekiq, etc.). The number of Cells can grow alongside the growth of the business.
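The horizontal-scaling idea behind Cells can be sketched in a few lines. Everything here is hypothetical: the `CellRouter` class, the per-cell capacity limit, the first-fit placement rule, and the choice to place whole organizations in a cell are assumptions for illustration, not the design of the Cells architecture itself. The point is only that capacity grows by adding cells rather than by enlarging one instance.

```python
from dataclasses import dataclass, field


@dataclass
class Cell:
    name: str
    capacity: int  # illustrative limit on organizations per cell
    organizations: set = field(default_factory=set)


class CellRouter:
    """Places each organization in one cell and routes requests to it."""

    def __init__(self):
        self.cells = []
        self._placement = {}  # organization -> cell name

    def add_cell(self, name, capacity):
        self.cells.append(Cell(name, capacity))

    def assign(self, organization):
        # An organization stays in the cell it was first placed in.
        if organization in self._placement:
            return self._placement[organization]
        # First-fit placement: use the first cell with spare capacity.
        for cell in self.cells:
            if len(cell.organizations) < cell.capacity:
                cell.organizations.add(organization)
                self._placement[organization] = cell.name
                return cell.name
        raise RuntimeError("all cells are full; add another cell")

    def route(self, organization):
        """Return the cell that serves an already-placed organization."""
        return self._placement[organization]


router = CellRouter()
router.add_cell("cell-1", capacity=2)
router.assign("org-a")
router.assign("org-b")
router.add_cell("cell-2", capacity=2)  # scale out by adding a cell
print(router.assign("org-c"))
```

Because each organization is served by exactly one cell in this sketch, a failure in one cell would not affect organizations placed elsewhere, which is the availability benefit mentioned above.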
Sharding provides an alternative, but it is very hard to apply as a universal solution, and we would end up requiring a Cells approach either way. By transitioning from Decomposition to Cells, we don’t need to find a sharding solution, and we avoid a “worst of all worlds” scenario in which we have Decomposition, Sharding, and Cells at the same time.
In the 17.0 release, two databases will be the only supported configuration for all self-managed customers, meaning they will have transitioned to running decomposed logical databases within a single database cluster. All migrations will have completed with minimal interruption and all self-managed features, such as backups and restore, will work seamlessly when running multiple databases.
We plan to roll out the Organization MVC over the next year.
Our current focus areas and engineering investment are broken down by category below. Percentages represent how much engineering time, on average, is allocated to each category in a milestone.
Our current focus is on building out the concepts of Cells.
We are also working on enabling decomposition for all self-managed customers.
Our current focus is on building out the concepts of Organizations.
Our current focus is on consolidating groups and projects into one generic namespace container. The highest priority at the moment is to migrate basic project functionality to namespaces. This will enable us to make project functionality available at the group level. Building on these migration efforts, we are looking to provide functionality at the group level that was previously only available at the project level. The first iterations of this effort will make archiving and starring available at the group level, which are among the features most requested by our customers.
Another big pain point for our SaaS users is the inability to control their users and groups, a capability that self-managed users have through the admin panel. To overcome this and create feature parity, we will migrate administrative capabilities to the organization, group and project level so that group and project owners have more control.
To ensure we can make progress in the other categories, we are currently deprioritizing work on the User Profile. We are supporting but not actively working on improvements.
In order to benefit from Decomposition and realize the scalability improvements, we deployed Decomposition to GitLab.com.
We currently don't plan to implement any scalability solutions for GitLab.com that would negatively impact our self-managed customers. We want all customers to benefit from further scalability.
Federation and a SaaS-to-Self-Managed connector are out of scope. The Tenant Scale group is focused on solving the scalability challenge for the largest GitLab instances, rather than connecting disparate and potentially untrusted systems. If you'd like to follow these other feature requests, see:
We do plan to support a Cells architecture for self-managed deployments. This could address a narrow set of self-managed use cases that previously required independent instances, such as data residency.
This is a list of scaling solutions that others have implemented: