
Stage Direction - Package

Package

Letter from the editor

TL;DR: For the next several milestones, we will focus the effort of the Package group on the container registry and dependency proxy.

To the GitLab Community and customers,

First, let me say welcome to Hugo Ortiz, a new member of the Package Group. Hugo is a new backend engineer and will help us to deliver on GitLab's vision to be your single source of truth for packages and dependencies.

In my last update, I discussed the need to focus on reliability. To do so, we delayed breaking ground on virtual registries. So, I thought I would use this update to discuss the outcome of our focused attention and to provide an update on virtual registries, what's next and why.

Amplifying SaaS reliability

In chess, there is a saying: "When you don't know what to do, consider improving your worst piece." As a player, that's often difficult to do since your tendency is to focus on advancing your primary tactics. I think the same principles apply to product development. It can be hard to identify your 'weakest piece' or to know when to prioritize addressing it. That's why as a team we've incorporated error budgets and risk maps into the product prioritization and development process. Doing so has allowed us to identify, prioritize, and release fixes for several significant performance issues in the past several months, including improving the performance of the Maven repository by 500% and reducing LCP for the container registry by 97%.

Using error budgets, risk maps, and improved processes has allowed us to find the right balance between innovation and reliability. In other words, although we will continue to identify and prioritize reliability, it will not be our singular focus moving forward.

For the next several months, the Package group will focus on testing a change to the container registry that will add support for online garbage collection and unblock several other features.

Finally, I know many of you are waiting for virtual registries prior to consolidating on GitLab for package management. I'm excited to say that starting in milestone 14.3, we will break ground by updating the dependency proxy to work with any container registry, not just Docker Hub. This will help you to reduce your reliance on external dependencies and make your builds faster. We will also add virtual registry support for npm. We'll start with proxying and caching packages from the public npm registry and then expand the feature to include any remote registry.

Thank you for reading!

Tim

What's next and why

Every day the container registry is used to publish and install hundreds of thousands of images. With the implementation of online garbage collection complete, we will focus on testing, deployment, and migration to the new registry. You can follow that work in GitLab-#5523.

For the Package Registry, we have a few key issues planned. I'm very excited to say that in milestone 14.3 we plan to roll out support for Debian packages.

We also have two key improvements planned for npm. GitLab-#338483 will make a significant performance improvement to ensure that dependencies are downloaded quickly and reliably. GitLab-#330929 will extract package.json metadata so that it can be made available in the API and the user interface.

For the Dependency proxy, we have several bugs and usability issues planned:

After that, we have several exciting features planned. The epic GitLab-#6061 proposes updating the Dependency Proxy to work with any external container registry, not just Docker Hub. This will help you to reduce build times and reduce your external dependencies.

GitLab-#231239 is the first step in unifying the Dependency Proxy and the Package Registry. The issue proposes turning the npm request forwarding feature from a simple forwarding mechanism into a proxy. This will allow GitLab to store and present package metadata and will eventually lead to GitLab-#24123, which will add support for caching.

Goal

The goal of the Package Group is to build a product that, within three years, is our customers' single source of truth for storing and distributing images and packages.

Do customers want this?

Yes. As the PM for the Package stage, I hear regularly from customers and prospects who would like to migrate off of JFrog's Artifactory. Their reasons for wanting to consolidate on GitLab are:

  1. Convenience (authentication, management, improved UX)
  2. Cost
  3. Lack of support (getting to meet with GitLab PMs is a big + for these folks)

Typically the needs of these customers can be predictably segmented by the size of their organization. For the sake of simplicity, let's classify their needs as enterprise and non-enterprise.

Non-enterprise organizations

Typically, they’d like to know whether we support format X and, if not, when we will support it. The unsupported formats that we hear about most often are:

(All of the above will be useful for ~Dogfooding as well)

If we support their requested format, these customers are often able to consolidate.

They are typically blocked by issues and bugs that are fairly straightforward to address. They are most likely to engage in issues or on Twitter. They may use a single project as their universal registry. They are concerned about inconsistent token support, storage costs, and management.

Enterprise organizations

We often hear from large, enterprise organizations that they'd like to consolidate on GitLab and move away from their existing vendor. But, our advice to these organizations is that they wait until the GitLab Package product matures. When comparing GitLab to Artifactory or Sonatype, there are several key missing features that must be considered.

Categories

If you'd like to learn more, the below information contains a summary, competitive info, and other helpful content for each product category associated with the Package stage.

Container Registry

The GitLab Container Registry is a secure and private registry for Docker images. It is built on open source software and completely integrated within GitLab. Use GitLab CI/CD to create and publish branch/release specific images. Use the GitLab API to manage the registry across groups and projects. Use the user interface to discover and manage your team's images. GitLab will provide a Lovable container registry experience by being the single location for the entire DevOps Lifecycle, not just a portion of it. We will provide many of the features expected of a container registry, but without the weight and complexity of a single-point solution.

Competitive Landscape

Open source container registries such as Docker Hub and Red Hat's Quay offer users a single location to build, analyze, and distribute their container images. Docker Hub recently introduced rate limits for pulls from Docker Hub.

The primary reason people don’t use Docker Hub is that they need a private registry, and one that lives alongside their source code and pipelines. They like to be able to use pre-defined environment variables for cataloging and discovering images. Often Docker Hub is used as the source of a base image for a test, but if you are building an app, you will likely customize an image to fit your application and save it to GitLab's private registry alongside your source code.

Artifactory and Nexus both offer support for building and deploying Docker images. Artifactory offers their container registry as part of their community edition as well.

Artifactory integrates with several different CI servers through dedicated plug-ins, including Jenkins and Azure DevOps, but does not yet support GitLab. However, you can still connect to your Artifactory repository from GitLab CI. Here is an example of how to deploy Maven projects to Artifactory with GitLab CI/CD.

GitHub has recently made their container registry generally available. Currently, the GitHub Container Registry only supports Docker image formats. During the beta, storage and bandwidth are free. After the beta, you can expect each tier to come with an included amount of storage and data transfers. Once you pass those limits, you will pay $0.25 USD per GB of storage and $0.50 USD per GB of data transfer. One concern worth raising is that we don't see a way to programmatically delete images. Given the cost of storing images, this could be a concern for organizations that heavily use GitHub's registry. Another limitation is that they only support authentication using your Personal Access Token. This is not ideal for organizations that would like to avoid using individual-level credentials. With the GitLab Container Registry, you may use a PAT, Deploy, or Job token to authenticate to the registry.
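To make the pricing above concrete, here is a minimal sketch of the overage arithmetic using the two per-GB rates quoted in the paragraph. The function name and the included allowances are hypothetical; the actual amounts included with each tier are not stated here, so they are left as parameters.

```python
# Per-GB overage rates quoted in the text above (USD).
STORAGE_RATE_USD_PER_GB = 0.25
TRANSFER_RATE_USD_PER_GB = 0.50


def overage_cost_usd(
    storage_gb: float,
    transfer_gb: float,
    included_storage_gb: float = 0.0,   # hypothetical tier allowance
    included_transfer_gb: float = 0.0,  # hypothetical tier allowance
) -> float:
    """Estimate the monthly overage once the included amounts are exceeded."""
    billable_storage = max(0.0, storage_gb - included_storage_gb)
    billable_transfer = max(0.0, transfer_gb - included_transfer_gb)
    return (
        billable_storage * STORAGE_RATE_USD_PER_GB
        + billable_transfer * TRANSFER_RATE_USD_PER_GB
    )


# Example: 100 GB stored and 40 GB transferred, with assumed allowances
# of 20 GB storage and 10 GB transfer.
estimate = overage_cost_usd(100, 40, included_storage_gb=20, included_transfer_gb=10)
```

With these assumed allowances, 80 GB of storage and 30 GB of transfer are billable, which is why the ability to delete unused images programmatically matters for heavy users.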

There are several nice features that they've included. One nice feature is that you can publish images to your namespace or your user account. We would like to create that same functionality via GitLab-#241027. Also, their user interface includes helpful metadata, such as how often it's downloaded and a readme.

Amazon offers a fully-featured registry and plans to add support for highly available, publicly hosted images.

Google Cloud offers a container registry that allows you to integrate with any CI/CD platform. The registry is free, although they do charge for storage and network egress. Google's registry includes container scanning and high availability.

JetBrains offers a container registry that allows you to add a project repository and publish images and tags using the Docker client or your JetBrains project. However, they do not currently have any documentation for administrative features, such as cleanup policies or garbage collection.

Digital Ocean offers a container registry that allows you to store and configure private Docker images. In addition, they support global load balancing and caching in multiple regions. One potential drawback is that each Digital Ocean account is limited to 1 registry, whereas with GitLab each Project can have its own registry.

Package Registry

Our goal is for you to rely on GitLab as a universal package manager, so that you can reduce costs and drive operational efficiencies. The backbone of this category is your ability to easily publish and install packages, no matter where they are hosted.

You can view the list of supported and planned formats in our documentation here.

Supported formats

The below table lists our supported and most frequently requested package manager formats. Artifactory and Nexus both support a longer list of formats, but we have not heard many requests from our customers for these formats. If you'd like to suggest we consider a new format, please open an issue here.

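| Format | GitLab | Artifactory | Nexus | GitHub | Azure Artifacts | AWS CodeArtifact | Google Artifact Registry |
| Composer | ☑️ | ✔️ | ✔️ | - | - | - | - |
| Conan | ☑️ | ✔️ | ☑️ | - | - | - | - |
| Debian | ☑️ | ✔️ | ✔️ | - | - | - | - |
| Gradle | ✔️ | ✔️ | ✔️ | ✔️ | ✔️ | ✔️ | ✔️ |
| Helm | ☑️ | ✔️ | ✔️ | ☑️ | ☑️ | ☑️ | ☑️ |
| Maven | ✔️ | ✔️ | ✔️ | ✔️ | ✔️ | ✔️ | ✔️ |
| npm | ✔️ | ✔️ | ✔️ | ✔️ | ✔️ | ✔️ | ✔️ |
| NuGet | ✔️ | ✔️ | ✔️ | ✔️ | ✔️ | - | - |
| PyPI | ✔️ | ✔️ | ✔️ | - | ✔️ | ✔️ | - |
| RPM | - | ✔️ | ✔️ | - | - | - | - |
| Ruby gems | ☑️ | ✔️ | ✔️ | ✔️ | - | - | - |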

☑️ indicates support is through community plugin or beta feature

Interested in contributing a new format? Please check out our suggested contributions.

Competitive Landscape
Universal package management tools

Artifactory and Nexus are the two leading universal package manager applications on the market. They both offer products that support the most common formats and additional security and compliance features. A critical gap between those two products and GitLab's Package offering is the ability to easily connect to and group external, remote registries. To date, GitLab has been focused on delivering Project and Group-level private package registries for the most commonly used formats. We plan on bridging this gap by expanding the Dependency Proxy to support remote and virtual registries.

Cloud providers

Azure and AWS both offer support for hosted and remote registries for a limited number of formats. Google has a product called Artifact Registry that is in Alpha and supports Java and Node. All of the cloud providers charge for cloud storage and network egress.

DevOps Platforms

GitHub offers a package management solution as well. They offer project-level package registries for a variety of formats. However, looking at GitHub's roadmap, they've deprioritized many features.

GitHub charges for storage and network transfers. GitHub does a nice job with search and reporting usage data on how many times a given package has been downloaded. They do not have anything on their roadmap about supporting remote and virtual registries, which would allow them to group registries behind a single URL and allow them to act as a universal package manager, like Artifactory or Nexus or GitLab.

JetBrains offers a Package Registry with support for npm and more planned formats. They have an ambitious and exciting roadmap for 2021, including adding support for Maven, Python and PHP. It's interesting to see that they'd like to support signing of packages and virtual registries, two features we are interested in adding at GitLab.

Dependency Proxy

Many projects depend on a growing number of packages that must be fetched from external sources with each build. This slows down build times and introduces availability issues into the supply chain. For organizations, this presents a critical problem. By providing a mechanism for storing and accessing external packages, we enable faster and more reliable builds.

Our vision for the Dependency Proxy is to offer a product that provides fast, reliable access to all of your dependencies, whether they are hosted on GitLab or with any other vendor. In addition, the Dependency Proxy will work hand-in-hand with the planned Dependency Firewall, which will help to prevent any unknown or unverified providers from introducing potential security vulnerabilities.

Currently, the Dependency Proxy allows you to proxy and cache images from Docker Hub. This can help you to speed up your pipelines and reduce your external dependencies. However, this is only the first step. In the coming milestones, we will expand the Dependency Proxy from a single, hardcoded endpoint to the one place where you can set up and manage all of your registries (both packages and images).
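The proxy-and-cache behavior described above can be sketched as a pull-through cache. This is an illustration of the general technique only, not GitLab's implementation; the class name, the `fetch_upstream` callable, and the in-memory dictionary are all assumptions made for the example.

```python
from typing import Callable, Dict


class PullThroughCache:
    """Serve artifacts from a local cache, going upstream only on a miss."""

    def __init__(self, fetch_upstream: Callable[[str], bytes]) -> None:
        # fetch_upstream is a hypothetical client for the external registry.
        self._fetch_upstream = fetch_upstream
        self._cache: Dict[str, bytes] = {}
        self.upstream_calls = 0

    def get(self, reference: str) -> bytes:
        if reference not in self._cache:
            # Cache miss: fetch once from upstream and remember the result.
            self.upstream_calls += 1
            self._cache[reference] = self._fetch_upstream(reference)
        return self._cache[reference]


def fake_upstream(reference: str) -> bytes:
    # Stand-in for a network call to an external registry.
    return f"contents of {reference}".encode()


proxy = PullThroughCache(fake_upstream)
first = proxy.get("alpine:3.14")
second = proxy.get("alpine:3.14")  # served from the cache, no upstream call
```

The second request never reaches the upstream, which is where the faster pipelines and reduced reliance on external services come from.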

There are a few important terms that are worth sharing:

Use cases

  1. Provide a single method of reaching upstream package management utilities, in the event they are not otherwise reachable.
  2. Cache images and packages for faster build times.
  3. Track which dependencies are utilized by which projects when pulled through the proxy.
  4. Audit logs in order to find out exactly what happened and with what code.
  5. Operate when fully cut off from the internet with local dependencies.

User flow

The below diagram demonstrates how you can use the Dependency Proxy to create a virtual registry which will look for and fetch dependencies from your hosted and remote registries. This will allow you to download all of your dependencies with a single URL, instead of having to remember which packages are hosted where.

[Diagram]

Note: The above diagram shows all of your dependencies being resolved through the Dependency Proxy. Usage of this feature is not required. You can easily use your hosted and remote registries without grouping them in a virtual registry.
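As a rough sketch of the virtual registry idea, the example below resolves a package by checking an ordered list of registries and returning the first match. The `resolve` function, the registry names, and the package names are hypothetical, and plain dictionaries stand in for real registries.

```python
from typing import Dict, List, Optional, Tuple

# A registry is modeled as a mapping from "name@version" to package contents.
Registry = Dict[str, bytes]


def resolve(
    package: str, registries: List[Tuple[str, Registry]]
) -> Optional[Tuple[str, bytes]]:
    """Return (registry name, contents) from the first registry that has the package."""
    for name, registry in registries:
        if package in registry:
            return name, registry[package]
    return None


# Hypothetical contents: one hosted registry and one remote registry.
hosted: Registry = {"my-lib@1.0.0": b"internal build"}
remote: Registry = {"left-pad@1.3.0": b"public build", "my-lib@1.0.0": b"shadowed"}

# The order defines priority: hosted packages win over remote ones.
virtual = [("hosted", hosted), ("remote", remote)]

internal = resolve("my-lib@1.0.0", virtual)
public = resolve("left-pad@1.3.0", virtual)
missing = resolve("unknown@0.0.1", virtual)
```

Checking the hosted registry first is a design choice made for this sketch: it keeps an internal package from being shadowed by a remote package with the same name.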

Competitive landscape

Artifactory is the leader in this category. They offer 'remote repositories' which serve as a caching repository for various package manager integrations. Utilizing the command line, API or a user interface, a user may create policies and control caching and proxying behavior. A Docker image or package may be requested from a remote repository on demand and if no content is available it will be fetched and cached according to the user's policies. In addition, they offer support for many of the major packaging formats in use today. For storage optimization, they offer checksum-based storage, deduplication, copying, moving and deletion of files.
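Checksum-based storage and deduplication can be illustrated in general terms with a content-addressed store, where files with identical contents share one stored blob. This sketch is a generic illustration and makes no claim about how Artifactory implements the feature; the class and the paths are hypothetical.

```python
import hashlib
from typing import Dict


class ChecksumStore:
    """Store each unique blob once, keyed by its SHA-256 digest."""

    def __init__(self) -> None:
        self._blobs: Dict[str, bytes] = {}  # digest -> content
        self._paths: Dict[str, str] = {}    # path -> digest

    def put(self, path: str, content: bytes) -> str:
        digest = hashlib.sha256(content).hexdigest()
        # Identical content maps to the same digest, so it is stored only once.
        self._blobs.setdefault(digest, content)
        self._paths[path] = digest
        return digest

    def get(self, path: str) -> bytes:
        return self._blobs[self._paths[path]]

    def blob_count(self) -> int:
        return len(self._blobs)


store = ChecksumStore()
digest_a = store.put("team-a/app-1.0.jar", b"same bytes")
digest_b = store.put("team-b/app-1.0.jar", b"same bytes")
```

Copying or moving a file in such a store only updates the path index, since the underlying blob is unchanged.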

The below tables outline our current capabilities compared to JFrog's Artifactory and Sonatype's Nexus.

| Container Registry | GitLab | Artifactory | Nexus |
| Local registries | ✔️ | ✔️ | ✔️ |
| Remote registries | Partial* | ✔️ | ✔️ |
| Virtual registries | Coming soon | ✔️ | ✔️ |

*The Dependency Proxy currently supports one hardcoded remote registry, which allows you to proxy and cache container images hosted on DockerHub.

| Package Registry | GitLab | Artifactory | Nexus |
| Local registries | ✔️ | ✔️ | ✔️ |
| Remote registries | Partial* | ✔️ | ✔️ |
| Virtual registries | Coming soon | ✔️ | ✔️ |

*By default, when an npm package is not found in the GitLab npm Registry, the request will be forwarded to npmjs.com. Check out this speed-run to see how it works.

Dependency Firewall

Many projects depend on packages that may come from unknown or unverified providers, introducing potential security vulnerabilities. GitLab already provides dependency scanning across a variety of languages to alert users of any known security vulnerabilities, but we currently do not allow organizations to prevent those vulnerabilities from being downloaded to begin with.

The goal of this category will be to leverage the dependency proxy, which proxies and caches dependencies, to give more control and visibility to security and compliance teams. We will do this by allowing users to create and maintain an approved/banned list of dependencies, providing more insight into the usage and impact of external dependencies and by ensuring the GitLab Security Dashboard is the single source of truth for all security related issues.

By preventing the introduction of security vulnerabilities further upstream, organizations can let their development teams work faster and more efficiently.
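The approved/banned list idea can be sketched as a simple policy check that a proxy might run before serving a dependency. This is a hypothetical illustration, since the category is still in planning; the `DependencyPolicy` class, its rules, and the package names are assumptions.

```python
from dataclasses import dataclass, field
from typing import Set


@dataclass
class DependencyPolicy:
    """Decide whether a dependency may be downloaded through the proxy."""

    approved: Set[str] = field(default_factory=set)
    banned: Set[str] = field(default_factory=set)

    def is_allowed(self, name: str) -> bool:
        # A banned entry always wins, even if the name is also approved.
        if name in self.banned:
            return False
        # Assumption for this sketch: an empty approved list means
        # "allow anything that is not banned".
        return not self.approved or name in self.approved


# Hypothetical package names.
strict = DependencyPolicy(approved={"trusted-lib"}, banned={"risky-lib"})
permissive = DependencyPolicy(banned={"risky-lib"})
```

In this sketch, `strict.is_allowed("unlisted-lib")` is false while `permissive.is_allowed("unlisted-lib")` is true, which shows the difference between an allow-list and a deny-list posture.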

Use cases
Competitive landscape

JFrog utilizes a combination of their Bintray and XRay products to proxy, cache and screen dependencies. They also provide dependency graphs across multiple languages and centralized dashboards for the review and remediation of vulnerabilities. It is a mature product that is generally well received by users. JFrog recently acquired Vdoo and plans to update XRay to include Vdoo’s extensive data and improved scanning across multiple dimensions, including configuration and applicability scanning.

GitHub's new package registry does a really nice job of creating visibility into the dependency graph for a given package, but they do not give users the ability to control which packages are used in a given group/project.

Helm Chart repository

Users or organizations that deploy complex pieces of software to Kubernetes-managed environments depend on a standardized way to automate provisioning those external environments. Helm is the package manager for Kubernetes and helps users define, manage, install, upgrade, and roll back even the most complex Kubernetes applications. Helm uses a package format called Charts to describe a set of Kubernetes resources.

Helm charts are easy to create, version, share and publish right within GitLab.

Use cases

  1. Public and private repositories for Helm charts
  2. Fine-grained access control
  3. Standardized workflow to version control and publish charts making use of GitLab's other services

Competitive Landscape

An important distinction between competitive products is that Helm 3 supports using an OCI container registry as a Helm repository. However, this requires that you use Helm 3 in experimental mode, which may introduce other risks.

* requires using Helm 3 in experimental mode.

Git LFS

Git LFS (Large File Storage) is a Git extension, which reduces the impact of large files in your repository by downloading the relevant versions of them lazily. Specifically, large files are downloaded during the checkout process rather than during cloning or fetching.

This page is maintained by the Product Manager for Package, Tim Rizzi (E-mail); however, the prioritization, design and implementation of features and bugs is owned by the Create:Source Code Group.
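Lazy downloading works because the repository itself stores a small text pointer in place of each large file. The sketch below parses such a pointer; the `parse_lfs_pointer` helper is hypothetical and the object ID in the example is made up, while the three keys follow the Git LFS pointer file format.

```python
from typing import Dict

POINTER_VERSION_LINE = "version https://git-lfs.github.com/spec/v1"


def parse_lfs_pointer(text: str) -> Dict[str, str]:
    """Parse a Git LFS pointer file into its key/value fields."""
    if not text.startswith(POINTER_VERSION_LINE):
        raise ValueError("not a Git LFS pointer file")
    fields: Dict[str, str] = {}
    for line in text.splitlines():
        if line:
            # Each line is "<key> <value>", separated by the first space.
            key, _, value = line.partition(" ")
            fields[key] = value
    return fields


# Example pointer with a made-up object ID (64 hex characters).
example_pointer = "\n".join(
    [
        POINTER_VERSION_LINE,
        "oid sha256:" + "ab" * 32,
        "size 12345",
        "",
    ]
)
pointer = parse_lfs_pointer(example_pointer)
```

Cloning and fetching only transfer these pointers; the object named by `oid` is downloaded when a checkout needs the actual file.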

Use cases
  1. Version large files—even those as large as a couple GB in size—with Git.
  2. Automatically detect LFS-tracked files and clone them via HTTP
  3. Download less data. This means faster cloning and fetching from repositories that deal with large files.
  4. Host more in your Git repositories. External file storage makes it easy to keep your repository at a manageable size.

Git is a trademark of Software Freedom Conservancy and our use of 'GitLab' is under license.