As of May 2019, share-01, an NFS file server meant for sharing files
between different parts of the fleet, is a single point of failure (SPOF) for GitLab.com.
If the server were taken down, accidentally or for maintenance, GitLab.com would incur
downtime until it was up and running again. Most of the application features that used it
(uploads, LFS objects, artifacts) have been migrated to use cloud storage
instead. However, a few features still need a similar
migration before we can stop using it.
The following mount points are mounted on each client (web, api, git, sidekiq) in our fleet. Only the first 3 are no longer used for sharing purposes and can be unmounted safely:
The following features still use share-01. For each one, we would implement
a solution that allows it to use cloud storage.
Attachments to personal snippets are only uploaded to the shared file server,
unlike attachments to personal snippet comments, which are uploaded to cloud
storage. Progress on this is being tracked in an issue.
This feature will not necessarily break in the absence of a shared file server, but Gitaly would then generate an archive every time one is requested, which can put some Gitaly nodes under stress for large and popular repositories. An effort to improve NGINX caching and/or serve immutable objects (e.g. a repository archive at a certain SHA1, a raw blob) from a CDN is being tracked in an issue.
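To illustrate why an archive at a fixed SHA1 is a good candidate for NGINX or CDN caching, here is a minimal sketch of how a service could choose cache headers based on the requested ref. The function names are hypothetical and are not part of GitLab or Gitaly. The `public` directive assumes a publicly readable repository; private content would need a different policy.

```python
import re

# Hypothetical helpers for illustration only; not GitLab or Gitaly APIs.
FULL_SHA1 = re.compile(r"[0-9a-f]{40}")


def is_immutable_ref(ref: str) -> bool:
    """A full 40-character SHA1 always names the same commit,
    so an archive generated from it never changes."""
    return FULL_SHA1.fullmatch(ref) is not None


def archive_cache_headers(ref: str) -> dict[str, str]:
    """Pick Cache-Control headers for a repository archive response."""
    if is_immutable_ref(ref):
        # Safe to cache for a long time at a proxy or CDN
        # (assumes a public repository).
        return {"Cache-Control": "public, max-age=31536000, immutable"}
    # Branch and tag names can move, so caches must revalidate.
    return {"Cache-Control": "no-cache"}
```

With headers like these, repeated requests for the same SHA1 could be answered from the cache instead of asking Gitaly to regenerate the archive, while requests by branch or tag name would still be revalidated.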
As this is an application change, it will be tested using application specs and further verified by QA tests.
As there are no infrastructure changes involved in this design, it is expected to work the same for GitLab.com and self-managed installations.
With availability as the main motivation behind this design, there are highly available alternatives that we could use instead of a self-managed file server. Cloud Storage FUSE and Cloud Filestore are both provided by GCP and would allow us to avoid changing the application right away while still having high-availability guarantees. Such solutions have disadvantages, however: some self-managed installations may not have access to such products in their hosting environment, and using them may complicate our move towards Kubernetes.