The following page may contain information related to upcoming products, features and functionality. It is important to note that the information presented is for informational purposes only, so please do not rely on the information for purchasing or planning purposes. Just like with all projects, the items mentioned on the page are subject to change or delay, and the development, release, and timing of any products, features or functionality remain at the sole discretion of GitLab Inc.
Last updated: 2023-03-10
Please reach out to Christina Lohr, Senior Product Manager for Tenant Scale (Email) if you'd like to provide feedback or ask any questions related to this product category.
This strategy is a work in progress, and everyone can contribute. Please comment and contribute in the linked issues and epics on this page. Sharing your feedback directly on GitLab.com is the best way to contribute to our strategy and vision.
GitLab.com, our SaaS offering, is growing rapidly. This growth requires that the underlying infrastructure components are able to scale to accommodate additional users. The GitLab.com production architecture highlights the different components and the reference architectures provide an overview for self-managed customers.
Scaling GitLab requires different strategies for the individual components. For example, web application nodes are stateless and can be scaled relatively easily by creating more individual servers. Stateful components are much harder to scale. As a single solution for the entire DevOps lifecycle, GitLab depends on a single data store which serves as the single source of truth for data. For GitLab, this data store is mostly a single PostgreSQL database. Over time, GitLab has added additional databases for specific features, such as Gitaly Cluster, Geo and the Container Registry. Adding new data stores requires approval from the CEO and all engineering fellows to avoid unnecessary proliferation of data stores.
GitLab's database on GitLab.com is provisioned as a single logical database with a primary server and several physical read-only replicas. Given the continuing growth of GitLab.com, this PostgreSQL database needs to handle more and more transactions per second. Reading data can be accelerated by provisioning additional replicas. Writing new data, however, can't be easily scaled in the same way: there can only be one primary server, and all writes have to go through it. In order to address this problem there are several possible solutions:
Buy more capable hardware - Bigger servers can handle more transactions. This is generally referred to as vertical scaling.
Define a horizontal scaling strategy
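The single-primary constraint described above can be illustrated with a small routing sketch. This is only an illustration under stated assumptions, not how GitLab routes queries: the `ReadWriteRouter` class, the server names, and the rule that a statement starting with `SELECT` counts as a read are all hypothetical simplifications.

```python
import itertools


class ReadWriteRouter:
    """Hypothetical router: reads rotate across replicas, writes go to the primary."""

    def __init__(self, primary, replicas):
        self.primary = primary
        # Round-robin over replicas; None when no replicas are provisioned.
        self._replicas = itertools.cycle(replicas) if replicas else None

    def route(self, sql):
        # Simplifying assumption: only statements starting with SELECT are reads.
        is_read = sql.lstrip().upper().startswith("SELECT")
        if is_read and self._replicas is not None:
            return next(self._replicas)
        # Every write, and every read when no replica exists, hits the one primary.
        return self.primary
```

Adding replicas spreads the read traffic, but every statement that is not a read still lands on the same primary, which is why write throughput does not grow with the number of replicas.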
GitLab.com is approaching a point where buying bigger servers is no longer easily possible. For this reason, the Database Scalability Working Group was founded to define and implement strategies to scale GitLab's database.
The Tenant Scale group is concerned with delivering application changes that allow GitLab and GitLab.com to scale to millions of users and implement the strategies defined in the Database Scalability Working Group.
In the future we expect that GitLab.com can
These outcomes are also defined as the exit criteria of the Database Scalability Working Group.
Following a number of proof of concept implementations, the Tenant Scale group is focusing on decomposing GitLab's database. This approach relies on moving all the tables associated with a feature into a separate logical database. We chose this approach because it is iterative and can be implemented in a shorter amount of time than sharding. When decomposing a certain feature, the team can focus on a smaller subset of tables while still solving some problems that are relevant to later strategies. We also gain confidence in operating multiple databases.
The Tenant Scale group will focus on CI tables first. The reason for choosing CI tables is that they account for ~36% of the overall DB size and roughly 50% of writes. Decomposing these CI tables would effectively allow us to reduce writes on the main database by 50% because this additional logical database can be moved to a physically different database cluster. Decomposition is also sometimes referred to as vertical sharding.
The Tenant Scale group is focusing on decomposing all tables (~50) that are connected to our CI feature. We've identified this feature because roughly 50% of writes can be attributed to CI. We are working to resolve all issues with the CI tables that are not easily fanned out to feature teams, mostly because the implementation is too technical and potentially dependent on other work in the group.
To make decomposition a success, the application needs strong support for multiple databases. Today, GitLab uses only one database (main); once CI tables are decomposed, the application needs to manage another database (ci).
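As a rough sketch of what managing two logical databases means for the application, the snippet below maps tables to a database name and detects when an operation would span both. The mapping, the table names, and the helper functions are hypothetical and chosen only for illustration; they do not describe GitLab's actual configuration.

```python
# Hypothetical table-to-database mapping; the table names are illustrative only.
TABLE_TO_DATABASE = {
    "ci_builds": "ci",
    "ci_pipelines": "ci",
    "projects": "main",
    "users": "main",
}


def database_for(table):
    """Return the logical database for a table, defaulting to main."""
    return TABLE_TO_DATABASE.get(table, "main")


def group_by_database(tables):
    """Group table names by the logical database that owns them."""
    grouped = {}
    for table in tables:
        grouped.setdefault(database_for(table), []).append(table)
    return grouped


def crosses_databases(tables):
    """True when the given tables live in more than one logical database."""
    return len({database_for(table) for table in tables}) > 1
```

A check like `crosses_databases(["ci_builds", "projects"])` flags operations such as joins or transactions that a single database could handle but that need a different approach once the tables are split.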
This means that the application needs to be able to run database-related tasks, such as migrations (normal, post, and background), on many databases, and we need to decide on approaches for these generic database features.
With two databases, we also need to handle cross-database modifications. For example, foreign keys between tables located on different databases, and the cascading deletes that rely on them, won't work, so we need to find alternative solutions via loose foreign keys.
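The general idea behind replacing a cross-database cascading delete can be sketched with two independent SQLite databases. This is a minimal illustration under assumptions, not GitLab's loose foreign keys implementation: the schema, the `deleted_records` table, and the `cleanup_orphans` function are hypothetical. A trigger records parent deletions in the first database, and a separate cleanup step later removes the orphaned child rows in the second.

```python
import sqlite3


def build_databases():
    """Create two separate in-memory databases standing in for main and ci."""
    main_db = sqlite3.connect(":memory:")
    ci_db = sqlite3.connect(":memory:")
    main_db.executescript(
        """
        CREATE TABLE projects (id INTEGER PRIMARY KEY);
        CREATE TABLE deleted_records (table_name TEXT, record_id INTEGER);
        -- Record every deleted parent row so it can be cleaned up later.
        CREATE TRIGGER track_project_delete AFTER DELETE ON projects
        BEGIN
            INSERT INTO deleted_records VALUES ('projects', OLD.id);
        END;
        """
    )
    # No real foreign key is possible here: projects lives in another database.
    ci_db.execute("CREATE TABLE pipelines (id INTEGER PRIMARY KEY, project_id INTEGER)")
    return main_db, ci_db


def cleanup_orphans(main_db, ci_db):
    """Delete child rows whose parent was removed; return the number deleted."""
    rows = main_db.execute(
        "SELECT rowid, record_id FROM deleted_records WHERE table_name = 'projects'"
    ).fetchall()
    removed = 0
    for rowid, project_id in rows:
        cursor = ci_db.execute("DELETE FROM pipelines WHERE project_id = ?", (project_id,))
        removed += cursor.rowcount
        # Mark the deletion as processed once the children are gone.
        main_db.execute("DELETE FROM deleted_records WHERE rowid = ?", (rowid,))
    ci_db.commit()
    main_db.commit()
    return removed
```

Unlike a real foreign key, this cleanup is asynchronous, so child rows can briefly outlive their parent until `cleanup_orphans` runs.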
In order to benefit from Decomposition and realize the scalability improvements, we need to deploy Decomposition in Staging and Production. Given the scope of this change, we've defined an iterative rollout strategy for decomposition that allows us to deploy changes in stages. The latest timeline can be seen here.
In Q1FY23 we expect GitLab.com will run on multiple databases that are decomposed by feature. We expect that at least two independent databases exist: main and ci. This will provide significant headroom and will allow the Tenant Scale group to transition towards validating proposals for scaling GitLab even further.
All self-managed customers will have transitioned to running decomposed logical databases within a single database cluster. All migrations will have completed with minimal interruption and all self-managed features, such as backups and restore, will work seamlessly when running multiple databases.
Decomposition is only a first step to unlocking further scalability for GitLab. Decomposition is a vertical scaling strategy and it can only deliver a limited amount of scalability. In order to support further growth GitLab needs a long term horizontal scalability strategy. A Cells architecture allows for horizontal scalability and has other possible benefits, such as improved service availability. This architecture creates many mostly isolated GitLab instances, called Cells, that include all required services (database, web, Redis, Gitaly, Runners, Sidekiq etc.). The number of Cells can grow alongside the growth of the business.
Sharding provides an alternative but is really hard as a universal solution. We'll end up requiring a Cells approach either way. By transitioning from Decomposition to Cells, we don’t need to find a sharding solution and avoid a “worst of all worlds” scenario where we have Decomposition, Sharding and Cells.
We currently don't plan to implement any scalability solutions for GitLab.com that would negatively impact our self-managed customers. We want all customers to benefit from further scalability.
Federation and a SaaS-to-Self-Managed connector are out of scope. The Tenant Scale group is focused on solving the scalability challenge for the largest GitLab instances, rather than connecting disparate and potentially untrusted systems. If you'd like to follow these other feature requests, see:
We do plan to support a Cells architecture for self-managed deployments, which could address a narrow set of self-managed use cases which previously required independent instances, like data residency.
This is a list of scaling solutions that others have implemented: