Production Architecture

Our core infrastructure is currently hosted on several cloud providers, each serving a different function. This document does not cover servers that are not integral to the public-facing operations of GitLab.com.


Current Architecture

Source

Service Architecture

Source

Database Architecture

Source, GitLab internal use only

Storage Architecture

Source, GitLab internal use only

Monitoring Architecture

Source, GitLab internal use only

Network Architecture

Source, GitLab internal use only

Our network infrastructure consists of a network for each class of server, as defined in the Current Architecture diagram. Each network uses a similar ruleset, as described above.

We currently peer our ops network with the other networks. Most of our monitoring infrastructure lives inside this network, and we allow InfluxDB and Prometheus data to flow into it in order to populate our metrics systems.

For alert management, we peer all of our networks together so that we have a cluster of alertmanagers, ensuring alerts are delivered even if an individual environment fails.

No application or customer data flows through these network peers.
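
To make the peering model above concrete, here is a minimal Python sketch of the allowed flows; the network names and traffic classes are assumptions for illustration only, not our actual peering or firewall configuration.

    # Hypothetical model of the peering rules described above; the names are
    # illustrative only, not our real network or firewall configuration.
    ALLOWED_FLOWS = {
        # monitoring data flows from each environment into the ops network
        ("production", "ops"): {"prometheus", "influxdb"},
        ("staging", "ops"): {"prometheus", "influxdb"},
        # alertmanager clustering traffic is allowed between all peered networks
        ("production", "staging"): {"alertmanager"},
        ("staging", "production"): {"alertmanager"},
        ("ops", "production"): {"alertmanager"},
        ("ops", "staging"): {"alertmanager"},
    }

    def flow_allowed(src, dst, traffic):
        """Return True if a traffic class may cross the peering from src to dst."""
        return traffic in ALLOWED_FLOWS.get((src, dst), set())

    # Application and customer data never crosses a peering link.
    assert flow_allowed("production", "ops", "prometheus")
    assert not flow_allowed("production", "ops", "postgres")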

Host Naming Standards

Hostnames

A hostname shall be constructed from the service offered by that node, followed by a dash and a two-digit incrementing number.

e.g.: sidekiq-NN, git-NN, web-NN

Service-specific identifiers, when they denote a difference in build or function, will be added as a -specific suffix and precede the two-digit number.

e.g.: sidekiq-realtime-01
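
A minimal Python sketch of the convention above, assuming lowercase alphabetic service names and an exactly two-digit suffix; it is illustrative only, not a tool we actually run.

    import re

    # Matches "<service>[-<specific>]-NN" as described above,
    # e.g. "git-01", "web-12", "sidekiq-realtime-01".
    HOSTNAME_RE = re.compile(r"^(?P<service>[a-z]+)(?:-(?P<specific>[a-z]+))?-(?P<number>\d{2})$")

    def parse_hostname(hostname):
        """Return the hostname components, or None if the name does not follow the convention."""
        match = HOSTNAME_RE.match(hostname)
        return match.groupdict() if match else None

    print(parse_hostname("sidekiq-realtime-01"))
    # {'service': 'sidekiq', 'specific': 'realtime', 'number': '01'}
    print(parse_hostname("web-3"))  # None: the number must be two digits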

Service Tiers

Following the hostname shall be the service tier that the node belongs to:

Environments

Following the service tier shall be the environment:

TLD Zones

For DNS names, all services providing GitLab as a service shall be in the gitlab.com domain; ancillary services supporting GitLab (e.g. Chef, ChatOps, VPN, logging, monitoring) shall be in the gitlab.net domain.
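
A hedged Python sketch of the zone split; the set of ancillary services below is taken only from the examples above and is neither exhaustive nor authoritative.

    # Illustrative only: the ancillary set comes from the examples given above.
    ANCILLARY_SERVICES = {"chef", "chatops", "vpn", "logging", "monitoring"}

    def dns_zone(service):
        """GitLab-as-a-service hosts live under gitlab.com; support services under gitlab.net."""
        return "gitlab.net" if service in ANCILLARY_SERVICES else "gitlab.com"

    print(dns_zone("web"))         # gitlab.com
    print(dns_zone("monitoring"))  # gitlab.net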

Internal Networking Scheme

We make heavy use of VPCs. You can see how we configure these for each of our environments and servers in our Terraform repo.

Remote Access

Access is granted only to those who need access to production. At this point in time we use bastion hosts. Instructions for requesting and using the bastion hosts can be found in our runbooks.

Secrets Management

GitLab uses two different secrets management approaches: GKMS for machines inside of Google Cloud Platform (GCP), and Chef encrypted data bags for all other host secrets.

GKMS Secrets

Secrets are divided up based upon the Chef role that requires them (e.g. load balancers, Sidekiq, storage) and are arranged in JSON files. The JSON files are encrypted and stored in Google Cloud Storage (GCS), with access restricted to the environment consuming the keys (i.e. production servers only have access to the production GCS bucket). The JSON files are encrypted with keys managed by the GKMS service.

Node Secret Execution

When a node performs a Chef run, it pulls the encrypted JSON file out of GCS and makes a request to the GKMS service, as the node, for the key to decrypt the object. Since the node has permission, the JSON file is decrypted and read into the memory of the current Chef process, making it available for Chef parsing, where the secrets are applied to templates and scripts. Keys are automatically rotated every 90 days.
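
The real implementation lives in our Chef tooling; the following is only a rough Python sketch of that flow using the Google Cloud client libraries, with hypothetical bucket, object, and key names.

    import json
    from google.cloud import kms, storage

    # Hypothetical names for illustration; the real bucket, object, and key
    # names are environment-specific and not shown here.
    BUCKET = "example-gprd-secrets"
    OBJECT = "gitlab-base.json.enc"
    KEY_NAME = ("projects/example-project/locations/global/"
                "keyRings/example-ring/cryptoKeys/example-key")

    def fetch_secrets():
        """Pull the encrypted JSON blob from GCS and ask GKMS to decrypt it.

        The node's own service account is used implicitly, so the call only
        succeeds in environments that have been granted access to the key.
        """
        ciphertext = storage.Client().bucket(BUCKET).blob(OBJECT).download_as_bytes()
        response = kms.KeyManagementServiceClient().decrypt(
            request={"name": KEY_NAME, "ciphertext": ciphertext}
        )
        # The plaintext stays in memory for the duration of the Chef run only.
        return json.loads(response.plaintext)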

Chef Encrypted Data Bags

Secrets are again divided up based upon the Chef role that requires them and are arranged in JSON-structured files. These files are then encrypted and signed with the individual Chef administrator keys and with the keys of the client nodes that need access.

Node Secret Execution

During a Chef run, the client node requests the encrypted data bag from the Chef server, uses its own private key to decrypt the contents, and then applies them to the configuration templates and scripts. Keys are manually rotated roughly every 90 days or whenever we make a change to the Chef administrators, whichever comes first.
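
Chef performs this in Ruby; the Python sketch below only illustrates the envelope-encryption idea (one shared data key, wrapped separately for each authorized administrator or node key) and is not the actual Chef data bag format; signing is omitted.

    from cryptography.fernet import Fernet
    from cryptography.hazmat.primitives import hashes
    from cryptography.hazmat.primitives.asymmetric import padding

    OAEP = padding.OAEP(mgf=padding.MGF1(algorithm=hashes.SHA256()),
                        algorithm=hashes.SHA256(), label=None)

    def encrypt_for_nodes(secret_json, public_keys):
        """Encrypt the item once with a random data key, then wrap that key
        for every authorized administrator or client node public key."""
        data_key = Fernet.generate_key()
        return {
            "data": Fernet(data_key).encrypt(secret_json),
            "wrapped_keys": {name: pub.encrypt(data_key, OAEP)
                             for name, pub in public_keys.items()},
        }

    def decrypt_as_node(item, name, private_key):
        """A node unwraps the data key with its own private key, then decrypts the item."""
        data_key = private_key.decrypt(item["wrapped_keys"][name], OAEP)
        return Fernet(data_key).decrypt(item["data"])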

Azure

Azure is where we have lingering infrastructure. The remaining servers exist there for a wide variety of reasons.

Digital Ocean

Digital Ocean houses several servers that do not need to directly interact with our main infrastructure.

AWS

We host our DNS with Route 53, and we have several EC2 instances for various purposes. The servers you will interact with most are listed below:

Monitoring

See how GitLab.com is doing; for more information, visit the monitoring handbook.

Technology at GitLab

We use a lot of cool (but boring) technologies here at GitLab. Below is a non-exhaustive list of the tech we use.

Proposed Cloud Native Architecture

We are working on running GitLab.com on Kubernetes by containerizing all the different services and components that are necessary to run GitLab-EE at GitLab.com scale.

This is the proposed architecture for moving from what we run today on static VMs to a world managed by container orchestration.

Pods Definition

Source, GitLab internal use only