
Object Storage Working Group


Attributes

| Property | Value |
|---|---|
| Date Created | November 3, 2021 |
| Target End Date | January 31, 2022 |
| Slack | #wg_object-storage (only accessible from within the company) |
| Google Doc | Object Storage Working Group Meeting Agenda (only accessible from within the company) |

Charter

GitLab stores three classes of user data: database records, Git repositories, and user-uploaded files.

Both the user experience and the contributor experience with our file storage have room for significant improvement.

The working group will reduce technical debt accrued over the past few years, namely by removing CarrierWave and by no longer duplicating object storage clients in both Go and Ruby.

The working group is tasked with architecting a simplified Object Storage process and implementing the new solution.

Business goal

Improve SaaS scalability, reliability, and development speed by making sure object storage is available for every type of upload.

Improve feature adoption for self-managed customers, providing a single bucket configuration that works out of the box.

Object storage is a key feature in GitLab that affects engineering groups across all sections. The outcome of the working group should also make it easier for engineers to contribute to the final solution.

Scope and definitions

Object storage is a fundamental component of GitLab, providing the underlying implementation for shared, distributed, highly available (HA) file storage.

Over time, we have built support for object storage across the application, solving specific problems over a multitude of iterations. This has led to increased complexity across the board, from development (new features and bug fixes) to installation.

Definitions

CarrierWave
A gem that provides a simple and extremely flexible way to upload files from Ruby applications. This was the boring solution when first implemented. However, it no longer fits our use case, because we upload files from Workhorse and had to [patch CarrierWave's internals](https://gitlab.com/gitlab-org/gitlab/-/issues/285597#note_452696638) to support Direct Upload.
Direct upload
A technology we developed to intercept file uploads with Workhorse and handle the expensive upload operation in Workhorse, where it's cheaper. See our [uploads development documentation](https://docs.gitlab.com/ee/development/uploads.html#) for more details.
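
The direct upload idea described above can be illustrated with a minimal sketch. This is not Workhorse code (Workhorse is written in Go) and not GitLab's actual implementation; every name here (`InMemoryObjectStore`, `UploadMetadata`, `intercept_upload`, `application_finalize`, the `tmp/uploads/` key prefix) is hypothetical and chosen only for illustration. It shows the general shape: a proxy consumes the request body, writes it to storage, and hands the application only lightweight metadata.

```python
import hashlib
import io
from dataclasses import dataclass


class InMemoryObjectStore:
    """Hypothetical stand-in for an object storage bucket."""

    def __init__(self):
        self.objects = {}

    def put(self, key: str, data: bytes) -> None:
        self.objects[key] = data


@dataclass
class UploadMetadata:
    """What the proxy forwards to the application instead of the file body."""

    filename: str
    remote_key: str
    size: int
    sha256: str


def intercept_upload(filename, body, store, chunk_size=64 * 1024):
    """Read the upload in chunks, store it, and return metadata only.

    Assumption: a real proxy would stream chunks straight to the bucket;
    this sketch buffers them in memory to stay short.
    """
    digest = hashlib.sha256()
    buffer = bytearray()
    while chunk := body.read(chunk_size):
        digest.update(chunk)
        buffer.extend(chunk)
    key = f"tmp/uploads/{digest.hexdigest()}"  # hypothetical key layout
    store.put(key, bytes(buffer))
    return UploadMetadata(filename, key, len(buffer), digest.hexdigest())


def application_finalize(meta: UploadMetadata) -> dict:
    """Hypothetical application handler: it never touches the file bytes."""
    return {"name": meta.filename, "size": meta.size, "stored_at": meta.remote_key}


if __name__ == "__main__":
    store = InMemoryObjectStore()
    meta = intercept_upload("artifact.zip", io.BytesIO(b"example payload"), store)
    print(application_finalize(meta))
```

The point of the split is that the expensive work (reading and persisting the body) stays in the proxy, while the application only records where the object ended up.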

Kickoff video

Exit criteria

Roles and responsibilities

The functional leads will be responsible for:

Ideally, the functional lead is someone who is an IC working in the affected groups, but anyone capable of representing a group, department, or sub-department in the fashion mentioned above is welcome.

| Working Group Role | Person | Stakeholder Dept. | Title |
|---|---|---|---|
| Executive Sponsor | Marin Jankovski @marin | Infrastructure | Director of Infrastructure, Platform |
| Facilitator | Alessio Caiazza @nolith | Infrastructure | Staff Backend Engineer |
| Functional Lead | Grzegorz Bizon @grzesiek | Ops, Verify | Staff Backend Engineer |
| Functional Lead | Jason Plum @WarheadsSE | Distribution | Staff Backend Engineer |
| Functional Lead | Matthias Käppler @mkaeppler | Memory | Senior Backend Engineer |
| Functional Lead | Łukasz Korbasiewicz @lkorbasiewicz | Support | Support Engineer |
| Member | Vladimir Shushlin @vshushlin | Release group | Senior Backend Engineer |
| Member | Erick Bajao @iamricecake | Verify | Senior Backend Engineer |
| Member | Jaime Martinez @jaime | Package | Backend Engineer |
| Member | David Fernandez @10io | Package | Senior Backend Engineer |
| Member | Tiger Watson @tigerwnz | Configure | Senior Backend Engineer |
| Member | Vitor Meireles De Sousa @vdesousa | AppSec | Senior Application Security Engineer |
| Member | Patrick Bajao @patrickbajao | Workhorse | Senior Backend Engineer |
| Member | Catalin Irimie @cat | Geo | Senior Backend Engineer |
| Member | Chad Woolley @cwoolley-gitlab | Editor (Pages) | Senior Backend (Fullstack) Engineer |
| Member | Sofia Vistas @svistas | Quality | Senior Software Engineer in Test |

Company efforts on uploads

At GitLab we work in iterations. Direct upload was developed incrementally by several teams, which added new features over the course of several milestones.

To demonstrate the number of teams and milestones involved, the timeline of the Object Storage development, from feature development to tech debt and security fixes, is outlined:
