Learn how we're building a new toolkit to help with performance testing and deploying GitLab at scale.
Last year I wrote about how the Quality Engineering Enablement team was building up the performance testing of GitLab with the GitLab Performance Tool (GPT). At the time, the biggest challenge wasn't so much the testing itself but rather setting up the right large-scale GitLab environments to test against.
As with any server application, deploying GitLab at scale is challenging. That's why we built another toolkit that automates the deployment of GitLab at scale: the GitLab Environment Toolkit (GET).
Originally known internally as the "Performance Environment Builder" (PEB), GET grew alongside GPT as we continued to expand our performance testing efforts. Over time we built a toolkit that was quite capable in its own right of deploying GitLab at scale, which is why it started to gain attention internally from other teams and then even from some customers. Soon we realized we had built something worth sharing.
The Quality Engineering Enablement team has been working hard over the last few months to polish the toolkit for broader use, and we're happy to share that the first version of GET, v1.0.0, has been released!
GET is a collection of well-known open source provisioning and configuration tools with a simple focused purpose - to deploy GitLab Omnibus and GitLab Helm Charts at scale, as defined by our Reference Architectures and Cloud Native Hybrid Reference Architectures. Built with Terraform and Ansible, GET supports the provisioning and configuring of machines and other related infrastructure and contains the following features:
We're just getting started with GET, and continue to add more support for features and different environment setups. Now that GET v1.0.0 has been released, we're at a good place for customers to start trialing and evaluating GET. We do ask that you take into consideration the continuing expansion of capabilities, as well as limitations of the current version.
Read on to learn about the philosophy of GET and how it works.
Our team has past experience with provisioning and configuration tools, so we've learned what does and does not work, which is why we try to stick to the following goals:
Next we look at how GET works at a high level, starting with provisioning with Terraform.
The first step in building an environment is to provision the machines and/or Kubernetes clusters that run GitLab. We handle this step with the well-known provisioning tool Terraform.
We've created Terraform modules in GET for each of the three major cloud providers (GCP, AWS, and Azure). They provision machines for you according to the appropriate reference architectures, along with the necessary supporting infrastructure such as firewalls and load balancers. We designed these modules to be as simple as possible and to require only minimal configuration.
For more information on the entire Terraform configuration, check out our docs. One of the main config files is environment.tf, which defines how each component's nodes should be set up. Below is an example of how it is configured on GCP for a 10k reference architecture environment:
module "gitlab_ref_arch_gcp" {
  source = "../../modules/gitlab_ref_arch_gcp"

  prefix  = var.prefix
  project = var.project

  object_storage_buckets = ["artifacts", "backups", "dependency-proxy", "lfs", "mr-diffs", "packages", "terraform-state", "uploads"]

  # 10k
  consul_node_count    = 3
  consul_machine_type  = "n1-highcpu-2"
  elastic_node_count   = 3
  elastic_machine_type = "n1-highcpu-16"
  gitaly_node_count    = 3
  gitaly_machine_type  = "n1-standard-16"
  praefect_node_count    = 3
  praefect_machine_type  = "n1-highcpu-2"
  praefect_postgres_node_count   = 1
  praefect_postgres_machine_type = "n1-highcpu-2"
  gitlab_nfs_node_count   = 1
  gitlab_nfs_machine_type = "n1-highcpu-4"
  gitlab_rails_node_count   = 3
  gitlab_rails_machine_type = "n1-highcpu-32"
  haproxy_external_node_count   = 1
  haproxy_external_machine_type = "n1-highcpu-2"
  haproxy_external_external_ips = [var.external_ip]
  haproxy_internal_node_count   = 1
  haproxy_internal_machine_type = "n1-highcpu-2"
  monitor_node_count   = 1
  monitor_machine_type = "n1-highcpu-4"
  pgbouncer_node_count   = 3
  pgbouncer_machine_type = "n1-highcpu-2"
  postgres_node_count   = 3
  postgres_machine_type = "n1-standard-4"
  redis_cache_node_count   = 3
  redis_cache_machine_type = "n1-standard-4"
  redis_sentinel_cache_node_count   = 3
  redis_sentinel_cache_machine_type = "n1-standard-1"
  redis_persistent_node_count   = 3
  redis_persistent_machine_type = "n1-standard-4"
  redis_sentinel_persistent_node_count   = 3
  redis_sentinel_persistent_machine_type = "n1-standard-1"
  sidekiq_node_count   = 4
  sidekiq_machine_type = "n1-standard-4"
}

output "gitlab_ref_arch_gcp" {
  value = module.gitlab_ref_arch_gcp
}
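The module block above reads var.prefix, var.project, and var.external_ip, so those input variables have to be declared somewhere in the same Terraform directory. Here is a minimal sketch of what such declarations could look like. The file name variables.tf and the descriptions are assumptions for illustration, not copied from GET:

```hcl
# variables.tf (hypothetical file name; the descriptions are illustrative)

variable "prefix" {
  type        = string
  description = "Name prefix shared by every resource in this environment"
}

variable "project" {
  type        = string
  description = "ID of the GCP project to provision into"
}

variable "external_ip" {
  type        = string
  description = "Static IP address to attach to the external HAProxy node"
}
```

Because no defaults are set in this sketch, Terraform would expect the values to come from a terraform.tfvars file, -var flags, or an interactive prompt.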
With this environment file and two other small config files in place, Terraform can be run normally and work its magic. Below is a snippet of the output you'll see with GCP:
Once it's done, you should have a full set of machines for GitLab that can be configured with Ansible, which is what we'll look at next.
The next step in setting up the environment is configuration with Ansible. In a nutshell, this tool connects to each machine via SSH and runs tasks to configure GitLab.
As with Terraform, we've created multiple roles and playbooks in GET that are designed to configure each component on its intended machine. Through Terraform, we apply labels to each machine, and Ansible then reads those labels through its dynamic inventory to determine the purpose of each machine.
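To make the labelling idea concrete, here is a rough sketch of a single GCP instance that carries labels a dynamic inventory could later group on. It is not GET's actual module code: the label keys gitlab_node_prefix and gitlab_node_type, the resource name, the zone, the image, and the network are all assumptions chosen for illustration.

```hcl
# Illustrative only: GET's real modules create these instances for you.
locals {
  prefix = "example-env" # stands in for var.prefix from environment.tf
}

resource "google_compute_instance" "gitaly_example" {
  name         = "${local.prefix}-gitaly-1"
  machine_type = "n1-standard-16"
  zone         = "us-east1-c" # assumed zone

  # Labels record which environment the machine belongs to and what role
  # it plays. The key names here are hypothetical.
  labels = {
    gitlab_node_prefix = local.prefix
    gitlab_node_type   = "gitaly"
  }

  boot_disk {
    initialize_params {
      image = "debian-cloud/debian-11" # assumed image
    }
  }

  network_interface {
    network = "default"
  }
}
```

An inventory plugin that filters on the prefix label and groups hosts by the type label can then hand each playbook the right set of machines without any host list being maintained by hand.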
As we did with Terraform, we'll highlight one of the main config files here; a detailed breakdown of the full configuration process is available in the GET Ansible docs. The file is vars.yml, an inventory variable file for your environment that contains the settings Ansible needs to perform the setup, along with key GitLab config:
all:
  vars:
    # Ansible Settings
    ansible_user: "<ssh_username>"
    ansible_ssh_private_key_file: "<private_ssh_key_path>"

    # Cloud Settings
    cloud_provider: "gcp"
    gcp_project: "<gcp_project_id>"
    gcp_service_account_host_file: "<gcp_service_account_host_file_path>"

    # General Settings
    prefix: "<environment_prefix>"
    external_url: "<external_url>"
    gitlab_license_file: "<gitlab_license_file_path>"

    # Object Storage Settings
    gitlab_object_storage_artifacts_bucket: "{{ prefix }}-artifacts"
    gitlab_object_storage_backups_bucket: "{{ prefix }}-backups"
    gitlab_object_storage_dependency_proxy_bucket: "{{ prefix }}-dependency-proxy"
    gitlab_object_storage_external_diffs_bucket: "{{ prefix }}-mr-diffs"
    gitlab_object_storage_lfs_bucket: "{{ prefix }}-lfs"
    gitlab_object_storage_packages_bucket: "{{ prefix }}-packages"
    gitlab_object_storage_terraform_state_bucket: "{{ prefix }}-terraform-state"
    gitlab_object_storage_uploads_bucket: "{{ prefix }}-uploads"

    # Passwords / Secrets - Can also be set as Environment Variables via ansible.builtin.env
    gitlab_root_password: "<gitlab_root_password>"
    grafana_password: "<grafana_password>"
    postgres_password: "<postgres_password>"
    consul_database_password: "<consul_database_password>"
    gitaly_token: "<gitaly_token>"
    pgbouncer_password: "<pgbouncer_password>"
    redis_password: "<redis_password>"
    praefect_external_token: "<praefect_external_token>"
    praefect_internal_token: "<praefect_internal_token>"
    praefect_postgres_password: "<praefect_postgres_password>"
With the variable file and the environment inventory configured, Ansible can be run normally. Here is a snippet of the output you'll see with GCP:
Once Ansible is done, you should have a fully running GitLab environment at scale!
We have a lot planned for GET so it can support more features when setting up GitLab, such as SSL support, cloud native hybrid architectures on other cloud providers, object storage customization, and much more. We know deploying production-ready server applications is hard and has many potential requirements depending on the customer, and we hope to eventually support all recommended setups.
Check out the GET development board and our issue list to see what is in progress. Share feedback and suggestions by adding to our issue lists. We're keen to hear what's important to customers.
Cover image by Jean Vella on Unsplash.
