A visual guide to GitLab CI/CD caching

Matthieu Fronton ·
Sep 12, 2022 · 4 min read

If you've ever worked with GitLab CI/CD, you have probably needed, at some point, to use a cache to share content between jobs. The decentralized nature of GitLab CI/CD is a strength, but it can confuse even the best of us when we try to wire everything together. For instance, we need to know the difference between artifacts and cache, and where and how to place each piece of configuration.

This visual guide will help with both challenges.

Cache vs. artifacts

The concepts may seem to overlap because both are about sharing content between jobs, but they are fundamentally different.

Here is a simple sentence to remember if you struggle to choose between cache and artifacts:

Cache is here to speed up your job but it may not exist, so don't rely on it.
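
As a rough illustration of that rule, the sketch below declares both in a single job. The job name, the build script, and the dist/ folder are hypothetical; only the cache and artifacts keywords come from GitLab CI/CD. Later jobs can count on the artifacts being there, while the cached folder may or may not be restored.

```yaml
build:
  stage: build
  script:
    # Hypothetical build script; it must still work when the cache is empty.
    - ./build.sh
  cache:
    # Best effort: restored if available, otherwise the job continues without it.
    paths:
      - relative/path/to/another_folder/
  artifacts:
    # Uploaded to GitLab when the job succeeds; jobs in later stages
    # download them by default.
    paths:
      - dist/
```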

This article will focus on cache.

Initial setup

We'll start with a simple representation of the GitLab CI/CD pipeline model and ignore (for now) that jobs can be executed on any runner and host. This will help you grasp the basics.

Let's say you have:

Initial setup

Local cache: Docker volume

If you want a local cache between all your jobs running on the same runner, use the cache statement in your .gitlab-ci.yml:

default:
  cache:
    paths:
      - relative/path/to/folder/*.ext
      - relative/path/to/another_folder/
      - relative/path/to/file

local / container / all branches / all jobs

Using the predefined variable CI_COMMIT_REF_NAME as the cache key, you can ensure the cache is tied to a specific branch:

default:
  cache:
    key: $CI_COMMIT_REF_NAME
    paths:
      - relative/path/to/folder/*.ext
      - relative/path/to/another_folder/
      - relative/path/to/file

local / container / one branch / all jobs

Using the predefined variable CI_JOB_NAME as the cache key, you can ensure the cache is tied to a specific job:
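
A minimal sketch of that configuration, reusing the placeholder paths from the previous example with only the key changed:

```yaml
default:
  cache:
    # One cache per job name, shared by every branch
    key: $CI_JOB_NAME
    paths:
      - relative/path/to/folder/*.ext
      - relative/path/to/another_folder/
      - relative/path/to/file
```

The two variables can also be combined in one key, for example "$CI_JOB_NAME-$CI_COMMIT_REF_NAME", if each job should keep a separate cache per branch.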

local / container / all branches / one job

Local cache: Bind mount

If you don't want to use a volume for caching (for example, to make debugging or disk cleanup easier), you can configure a bind mount for Docker volumes while registering the runner. With this setup, you do not need the cache statement in your .gitlab-ci.yml:

#!/bin/bash

gitlab-runner register                             \
  --name="Bind-Mount Runner"                       \
  --docker-volumes="/host/path:/container/path:rw" \
...

local / one runner / one host / all branches / all jobs
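
With a runner registered this way, a job can read and write the mounted directory directly. The sketch below assumes the /container/path mount point from the registration command above; the job name and the downloads subfolder are hypothetical.

```yaml
warm-up:
  script:
    # /container/path is bind-mounted from /host/path, so files written here
    # stay on the host after the job container is removed.
    - mkdir -p /container/path/downloads
    - ls -la /container/path/downloads
  # No cache keyword is needed for this directory.
```

In practice, the job would also have to be routed to that specific runner, for example with tags.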

In fact, this setup even allows you to share a cache between jobs running on the same host without requiring you to set up a distributed cache (which we'll talk about later):

#!/bin/bash

gitlab-runner register                             \
  --name="Bind-Mount Runner X"                     \
  --docker-volumes="/host/path:/container/path:rw" \
...

gitlab-runner register                                 \
  --name="Bind-Mount Runner Y"                         \
  --docker-volumes="/host/path:/container/alt/path:rw" \
...

local / multiple runners / one host / all branches / all jobs

Distributed cache

If you want to have a shared cache between all your jobs running on multiple runners and hosts, use the [runners.cache] section in your config.toml:

[[runners]]
  name = "Distributed-Cache Runner"
...
  [runners.cache]
    Type = "s3"
    Path = "bucket/path/prefix"
    Shared = true
    [runners.cache.s3]
      ServerAddress = "s3.amazonaws.com"
      AccessKey = "<changeme>"
      SecretKey = "<changeme>"
      BucketName = "foobar"
      BucketLocation = "us-east-1"

remote / multiple runners / multiple hosts / all branches / all jobs

Using the predefined variable CI_COMMIT_REF_NAME as the cache key, you can ensure the cache is tied to a specific branch across multiple runners and hosts:
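
On the .gitlab-ci.yml side, this could look the same as the local per-branch example, since the remote storage comes from the runner's config.toml rather than from the pipeline file. A minimal sketch, reusing the placeholder paths from earlier:

```yaml
default:
  cache:
    # One cache per branch. With [runners.cache] configured and Shared = true,
    # the archive is stored in the S3 bucket, so runners on other hosts
    # can fetch it too.
    key: $CI_COMMIT_REF_NAME
    paths:
      - relative/path/to/folder/*.ext
      - relative/path/to/another_folder/
      - relative/path/to/file
```

If branch names contain characters such as /, the predefined CI_COMMIT_REF_SLUG variable may be a safer choice for the key.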

remote / multiple runners / multiple hosts / one branch / all jobs

Real-life setup

The simplified assumptions above helped you build your understanding of the concepts and possibilities.

In real life, you'll face more complex wiring and we hope this article will help you as a visual cheatsheet along with the reference documentation.

Just to give you a sneak peek, here is an exercise for you:

Real-life test assignment

Happy caching, folks!

Cover image by Alina Grubnyak on Unsplash
