Published on: June 24, 2025
Discover what the bundle URI Git feature is, how it is integrated into Gitaly, configuration best practices, and how GitLab users can benefit from it.
Gitaly plays a vital role in the GitLab ecosystem — it is the server component that handles all Git operations. Every push and pull made to/from a repository is handled by Gitaly, which has direct access to the disk where the actual repositories are stored. As a result, when Gitaly is under heavy load, some operations like CI/CD pipelines and browsing a repository in the GitLab UI can become quite slow. This is particularly true when serving clones and fetches for large and busy monorepos, which can consume large amounts of CPU and memory.
Bundle URI takes significant load off of Gitaly servers during clones by allowing Git to pre-download a bundled repository from object storage before calling the Gitaly servers to fetch the remaining objects.
Here is a graph that shows the difference between clones without and with bundle URI.
This graph shows the results of a small test we ran on an isolated GitLab installation, with Gitaly running on a machine with 2 CPUs. We wanted to test bundle URI with a large repository, so we pushed the GitLab repository to the instance. We also generated a bundle beforehand.
The big CPU spike is from when we performed a single clone of the GitLab repository with bundle URI disabled. It's quite noticeable. A little later, we turned on bundle URI and launched three concurrent clones of the GitLab repository. Sure enough, turning on bundle URI provides a massive performance gain: the CPU usage of the three clones is indistinguishable from normal usage.
To enable bundle URI on your GitLab installation, there are a couple of things you need to configure.
Bundles need to be stored somewhere. The ideal place is a cloud storage bucket. Gitaly uses the gocloud.dev library to read from and write to cloud storage, so any cloud storage solution supported by this library can be used. Once you have a cloud bucket URL, add it to the Gitaly configuration:
[bundle_uri]
go_cloud_url = "<bucket-uri>"
It must be noted that Gitaly does not manage the lifecycle of the bundles stored in the bucket. To avoid cost issues, object lifecycle policies must be enabled on the bucket in order to delete unused or old objects.
There are two feature flags to enable:
gitaly_bundle_generation enables auto-generation of bundles.
gitaly_bundle_uri makes Gitaly advertise bundle URIs when they are available (either manually created or auto-generated) and allows the user to manually generate bundles.
These feature flags can be enabled for an entire GitLab installation or per repository. See the documentation on how to enable a GitLab feature behind a feature flag.
Gitaly offers two ways for users to use bundle URI: a manual way and an auto-generated way.
You can create a bundle manually by connecting over SSH to the Gitaly node that stores the repository you want to bundle, and running the following command:
sudo -u git -- /opt/gitlab/embedded/bin/gitaly bundle-uri \
    --config=<config-file> \
    --storage=<storage-name> \
    --repository=<relative-path>
This command creates a bundle for the given repository and stores it in the bucket configured above. When a subsequent git clone request reaches Gitaly for the same repository, the bundle URI mechanism described above comes into play.
Gitaly can also generate bundles automatically, using a heuristic to determine if it is currently handling frequent clones for the same repository.
The current heuristic keeps track of the number of times a git fetch request is issued for each repository. If the number of requests reaches a certain threshold in a given time interval, a bundle is automatically generated. Gitaly also keeps track of the last time it generated a bundle for a repository. When a new bundle should be regenerated, based on the threshold and interval, Gitaly looks at the last time a bundle was generated for the given repository. It will only generate a new bundle if the existing bundle is older than some maxBundleAge configuration. The old bundle is overwritten. There can only be one bundle per repository in cloud storage.
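To make the interplay between threshold, interval, and maxBundleAge concrete, here is a minimal Python sketch of the rule as described above. It is a toy model, not Gitaly's implementation: the class name, method name, the sliding-window bookkeeping, and the numeric values are all assumptions chosen for illustration.

```python
from collections import defaultdict, deque


class BundleHeuristic:
    """Toy model of the threshold/interval/max-age rule (not Gitaly's code)."""

    def __init__(self, threshold, interval, max_bundle_age):
        self.threshold = threshold            # fetches needed to trigger a bundle
        self.interval = interval              # length of the time window, in seconds
        self.max_bundle_age = max_bundle_age  # minimum age before regenerating
        self.fetches = defaultdict(deque)     # repository -> recent fetch timestamps
        self.last_generated = {}              # repository -> time of the last bundle

    def record_fetch(self, repo, now):
        """Record one fetch; return True if a bundle should be (re)generated."""
        window = self.fetches[repo]
        window.append(now)
        # Forget fetches that fell out of the interval.
        while window and now - window[0] > self.interval:
            window.popleft()
        if len(window) < self.threshold:
            return False
        last = self.last_generated.get(repo)
        if last is not None and now - last < self.max_bundle_age:
            return False  # the existing bundle is still fresh enough
        self.last_generated[repo] = now  # the new bundle replaces the old one
        return True


# Hypothetical settings: 3 fetches within 60 seconds, bundles kept for 600 seconds.
heuristic = BundleHeuristic(threshold=3, interval=60, max_bundle_age=600)
timestamps = (0, 10, 20, 30, 700, 710, 720)
decisions = [heuristic.record_fetch("group/project.git", t) for t in timestamps]
print(decisions)  # [False, False, True, False, False, False, True]
```

The third fetch crosses the threshold and triggers the first bundle. The fourth fetch is also above the threshold, but the bundle is only 10 seconds old, so nothing is regenerated. Only once the bundle is older than the maximum age and the threshold is reached again does a new bundle replace it.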
When a bundle exists for a repository, it can be used by the git clone command.
To clone a repository from your terminal, make sure your Git configuration enables bundle URI. The configuration can be set like so:
git config --global transfer.bundleuri true
To verify that bundle URI is used during a clone, you can run the git clone command with GIT_TRACE=1 and see if your bundle is being downloaded:
➜ GIT_TRACE=1 git clone https://gitlab.com/gitlab-org/gitaly
...
14:31:42.374912 run-command.c:667 trace: run_command: git-remote-https '<bundle-uri>'
...
One scenario where using bundle URI would be beneficial is during a CI/CD pipeline, where each job needs a copy of the repository in order to run. Cloning a repository during a CI/CD pipeline is the same as cloning a repository from your terminal, except that the Git client in this case is the GitLab Runner. Thus, we need to configure the GitLab Runner in such a way that it can use bundle URI.
1. Update the helper-image
The first thing to do to configure the GitLab Runner is to override the helper-image that your GitLab Runner instances use. The helper-image is the image used to clone the repository before the job starts. To use bundle URI, the image needs the following:
Git version 2.49.0 or later
GitLab Runner helper version 18.1.0 or later
The helper-images can be found here. Select an image that corresponds to the OS distribution and the architecture you use for your GitLab Runner instances, and verify that the image satisfies the requirements.
At the time of writing, the alpine-edge-<arch>-v18.1.0* tag meets all requirements.
You can validate the image meets all requirements with:
docker run -it <image:tag>
$ git version ## must be 2.49.0 or newer
$ gitlab-runner-helper -v ## must be 18.1.0 or newer
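If you check many images, comparing version strings by eye (or as plain text) is error-prone: as text, "2.9.5" sorts after "2.49.0". The following Python sketch shows one way to compare versions numerically. The function names are hypothetical, and the sketch only assumes that the command output contains a dotted version number, as `git version 2.49.0` does.

```python
import re


def parse_version(text):
    """Extract the first X.Y or X.Y.Z version found in text as a tuple of ints."""
    match = re.search(r"(\d+)\.(\d+)(?:\.(\d+))?", text)
    if match is None:
        raise ValueError(f"no version found in {text!r}")
    # A missing patch component is treated as 0.
    return tuple(int(part or 0) for part in match.groups())


def meets_minimum(text, minimum):
    """Return True if the version found in text is at least `minimum`."""
    return parse_version(text) >= minimum


print(meets_minimum("git version 2.49.0", (2, 49, 0)))  # True
print(meets_minimum("git version 2.9.5", (2, 49, 0)))   # False
```

You can feed it the output of the commands above and compare against (2, 49, 0) for Git and (18, 1, 0) for the GitLab Runner helper.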
If you do not find an image that meets the requirements, you can also use the helper-image as a base image and install the requirements yourself in a custom-built image that you can host on GitLab Container Registry.
Once you have found the image you need, you must configure your GitLab Runner instances to use it by updating your config.toml file:
[[runners]]
(...)
executor = "docker"
[runners.docker]
(...)
helper_image = "image:tag" ## <-- put the image name and tag here
Once the configuration is changed, you must restart the runners for the new configuration to take effect.
2. Turn on the feature flag
Next, you must enable the FF_USE_GIT_NATIVE_CLONE GitLab Runner feature flag in your .gitlab-ci.yml file. To do that, add it as a variable and set it to true:
variables:
  FF_USE_GIT_NATIVE_CLONE: "true"
The GIT_STRATEGY must also be set to clone, as Git bundle URI only works with clone commands.
When a user clones a repository with the git clone command, a process called git-fetch-pack is launched on the client's machine. This process communicates with the remote repository's server (over HTTP/S, SSH, etc.) and asks it to start a git-upload-pack process. Those two processes then exchange information using the Git protocol (note that bundle URI is only supported with Git protocol v2). The capabilities both processes support and the references and objects the client needs are among the information exchanged. Once the Git server has determined which objects to send to the client, it must package them into a packfile, which, depending on the size of the data it must process, can consume a good amount of resources.
Where does bundle URI fit into this interaction? If bundle URI is advertised as a capability from the upload-pack process and the client supports bundle URI, the Git client will ask the server if it knows about any bundle URIs. The server sends those URIs back and the client downloads those bundles.
Here is a diagram that shows those interactions:
As such, Git bundle URI is a mechanism by which, during a git clone, a Git server can advertise the URI of a bundle for the repository being cloned by the Git client. When that is the case, the Git client can clone the repository from the bundle and request from the Git server only the missing references or objects that were not part of the bundle. This mechanism really helps to alleviate pressure from the Git server.
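The effect on the server can be illustrated with a small Python model of the exchange described above. This is a simplified sketch, not the Git protocol: objects are represented as plain strings, and the function name, its parameters, and the bundle URL are made up for the example.

```python
def clone(server_objects, bundles, server_advertises=True, client_supports=True):
    """Simulate one clone; return (objects_from_bundles, objects_packed_by_server)."""
    from_bundles = set()
    if server_advertises and client_supports:
        # The client asks for the bundle URIs and downloads every bundle first.
        for bundle_objects in bundles.values():
            from_bundles |= bundle_objects
    # The regular fetch only has to cover what the bundles did not provide.
    packed_by_server = server_objects - from_bundles
    return from_bundles, packed_by_server


repo = {"c1", "c2", "c3", "c4", "c5"}
# Hypothetical bundle, generated before commit c5 was pushed.
bundles = {"https://storage.example.com/repo.bundle": {"c1", "c2", "c3", "c4"}}

_, packed_with = clone(repo, bundles)
_, packed_without = clone(repo, bundles, client_supports=False)
print(sorted(packed_with))     # ['c5']
print(sorted(packed_without))  # ['c1', 'c2', 'c3', 'c4', 'c5']
```

With bundle URI, the server only packs the single object that is newer than the bundle. Without it, the server packs the whole repository for every clone.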
GitLab also has a feature called the pack-objects cache, which works slightly differently from bundle URI. When the server packs objects together into a so-called packfile, the pack-objects cache keeps that file. When another client needs the same set of objects, the server doesn't need to repack them and can send the same packfile again.
The pack-objects cache is only beneficial when many clients request the exact same set of objects. In a repository that changes quickly, it might not give any improvement. With bundle URI, it doesn't matter if the bundle is slightly out of date, because the client can request the missing objects after downloading the bundle and apply those changes on top. Also, bundle URI in Gitaly stores the bundles on external storage, whereas the pack-objects cache stores packfiles on the Gitaly node, so using the latter doesn't reduce network and I/O load on the Gitaly server.
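The "exact same set of objects" condition is easy to see in a toy Python model. This is only an illustration of the idea, not how Gitaly keys or stores its cache: the class, its attributes, and the object names are assumptions.

```python
class PackObjectsCache:
    """Toy cache that reuses a packfile only for an identical set of objects."""

    def __init__(self):
        self.packfiles = {}   # frozenset of wanted objects -> cached "packfile"
        self.packs_built = 0  # how often the server had to pack objects

    def serve(self, wanted):
        key = frozenset(wanted)
        if key not in self.packfiles:
            # Cache miss: the server has to build a new packfile.
            self.packs_built += 1
            self.packfiles[key] = tuple(sorted(key))
        return self.packfiles[key]


cache = PackObjectsCache()
for _ in range(3):
    cache.serve({"c1", "c2", "c3"})  # three identical clones
builds_for_identical_clones = cache.packs_built

cache.serve({"c1", "c2", "c3", "c4"})  # a clone after a new commit was pushed
builds_after_new_commit = cache.packs_built

print(builds_for_identical_clones, builds_after_new_commit)  # 1 2
```

Three identical clones cost one pack operation, but a single new commit changes the requested set and forces a full repack. A bundle generated before that commit would still cover c1 to c3, leaving only c4 for the server to pack.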
You can try the bundle URI feature in one of the following ways:
Download a free, 60-day trial version of GitLab Ultimate.
If you already run a self-hosted GitLab installation, upgrade to 18.1.
If you can't upgrade to 18.1 at this time, download GitLab to a local machine.