TL;DR: Upgrade to the new container registry (Beta) to unlock online garbage collection. This issue has all the information you need to get started.
When I joined the GitLab Package stage, the container registry already existed and was a critical feature for GitLab and GitLab's customers. But some fundamental problems needed to be addressed.
- The user interface was unusable due to missing functionality like sorting, filtering, and deleting container images.
- Operations that required listing the tags associated with an image were not performant at scale.
- There was no good way to delete container images programmatically.
- We had very little insight into user adoption.
- The storage costs for GitLab.com were tremendously high.
Of course, all of the above issues were related. The container registry was using a fork of the Distribution project, which had a lot of performance and usability issues when operating at the GitLab.com scale.
As a team, we decided that the first problem to tackle was the ever-growing cost of storage for GitLab.com. The legacy registry did not support online garbage collection. After calculating that it would take an absurd amount of downtime to run garbage collection in offline mode, we moved on to our next idea: optimize the existing offline garbage collector.
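To see why offline garbage collection implies downtime, it helps to look at the general shape of a mark-and-sweep collector. The sketch below is a simplified illustration, not the registry's actual code: the `mark` and `sweep` functions, the dictionary of manifests, and the digests are all hypothetical stand-ins. It shows how a push that lands between the two phases can leave a blob flagged for deletion even though a new image now depends on it, which is why the registry has to stop accepting writes while an offline collector runs.

```python
def mark(manifests):
    """Mark phase: collect every blob digest referenced by any manifest."""
    referenced = set()
    for layer_digests in manifests.values():
        referenced.update(layer_digests)
    return referenced


def sweep(blobs, referenced):
    """Sweep phase: return the blobs that the mark phase did not see referenced."""
    return {digest for digest in blobs if digest not in referenced}


# Hypothetical registry state: two tagged images and four stored blobs.
manifests = {
    "app:v1": ["sha256:aaa", "sha256:bbb"],
    "app:v2": ["sha256:bbb", "sha256:ccc"],
}
blobs = {"sha256:aaa", "sha256:bbb", "sha256:ccc", "sha256:ddd"}

referenced = mark(manifests)

# A push arrives after the mark phase and reuses a blob that looked orphaned.
manifests["app:v3"] = ["sha256:ddd"]

# The sweep still works from the stale mark result.
to_delete = sweep(blobs, referenced)
```

Here `to_delete` contains `sha256:ddd` even though `app:v3` now needs it. An online collector has to close this gap without blocking pushes.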
Optimizing the container registry code
We optimized the code for Google Cloud Storage (GCS) and Amazon S3, and saw a 90% reduction in the time it takes to run garbage collection. This benefited many GitLab customers with container registries smaller than 100 TB. Even with the performance improvements, we estimated a staggering 64 days to run garbage collection for GitLab.com.
In the end, we took the Distribution project as far as we could. We needed a container registry that supported more advanced use cases than push and pull. And we needed to drastically reduce the operating costs to make the feature sustainable for Free tier users. We decided to fork the Distribution project and build the next-generation container registry.
Solving the online garbage collection problem
Next, we dove headfirst into solving the online garbage collection problem for GitLab.com. Faced with petabytes of data and the requirement to stay within our error budgets, we designed and implemented an online migration of GitLab.com with zero degradation in service.
We completed the migration 12 months ago. The results?
- Garbage collection deletes terabytes of data from GitLab.com each day.
- Improved performance and reliability.
- We removed a lot of data from object storage and saved a lot of money.
Migrating to the next-generation container registry
Now we want to help GitLab self-managed customers migrate to the next-generation container registry. By upgrading, you will unlock support for online garbage collection, which can save you costly downtime or escalating storage costs. You can also expect to see performance and reliability improvements for the container registry API and UI.
Another benefit is that you get to give the team early feedback on what is and isn't working well for you. This feedback is valuable for both GitLab and your organization because it helps us ensure that the next set of features meets your needs.
The road ahead
New features are coming. Now that the registry leverages a metadata database for efficient queries, we can deliver significant UI and UX improvements that were impossible before. In 2024, we plan to add support for the following features:
- General availability of the container registry for self-managed customers
- Improved sorting and filtering in the container registry
- Improved UI for manifest/multi-arch container images
- Improved UI for container image attestation and signing
- Improved UI for storing Helm charts in the registry
- Support for protected repositories and immutable tags
Note: While the registry is in Beta for self-managed, we will be adding new features to GitLab.com that will not be available to self-managed customers until the registry is generally available. This is to ensure that we focus on migrating as many customers as possible, as efficiently as possible.
Get started today
We want to enable those features for self-managed customers, but we need your help. Please consider migrating to the next-generation container registry today. The best place to start is the feedback issue, which has links to documentation, helpful tips, and the attention of the Package team here at GitLab.
Disclaimer: This blog contains information related to upcoming products, features, and functionality. It is important to note that the information in this blog post is for informational purposes only. Please do not rely on this information for purchasing or planning purposes. As with all projects, the items mentioned in this blog and linked pages are subject to change or delay. The development, release, and timing of any products, features, or functionality remain at the sole discretion of GitLab.