
Troubleshooting GitLab Cloud Native chart deployments

What is KubeSOS

KubeSOS is a tool that uses kubectl and helm to retrieve GitLab cluster configuration and logs from GitLab Cloud Native chart deployments. The collected information is archived into a tar.gz file that can be shared with the Support team to help troubleshoot GitLab deployments.

Requirements

KubeSOS requires kubectl and helm to be installed and configured with access to the cluster where GitLab is deployed.

Usage

You can either download the script:

wget https://gitlab.com/gitlab-com/support/toolbox/kubesos/-/raw/master/kubeSOS.sh

chmod +x kubeSOS.sh
./kubeSOS.sh [flags]

Flag   Description          Required   Default
-n     namespace            No         "default"
-r     helm chart release   No         "gitlab"
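
For example, to collect data from a release named gitlab that was deployed into the gitlab namespace (names shown here for illustration only):

./kubeSOS.sh -n gitlab -r gitlab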

Or use curl:

curl https://gitlab.com/gitlab-com/support/toolbox/kubesos/raw/master/kubeSOS.sh | bash -s -- [flags]

Data will be archived to kubesos-<timestamp>.tar.gz

Extracting the archive

Use the tar Linux utility to extract the data into a folder:

tar -zxvf kubesos-<timestamp>.tar.gz
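
To list the contents of the archive without extracting it:

tar -tzf kubesos-<timestamp>.tar.gz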

Troubleshoot a GitLab installation

There are two main areas to check when troubleshooting a cloud native application like GitLab:

Cluster setup

We recommend a cluster with 8 vCPU and 30 GB of RAM, so one of the first things to check is whether the nodes have enough resources. Use Unix commands such as top and free to confirm this.
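
If the metrics server is installed in the cluster, per-node resource usage can also be checked with kubectl, and kubectl describe node shows each node's allocatable capacity (the node name is a placeholder):

kubectl top nodes
kubectl describe node <node-name>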

Check that the nodes are registered correctly: verify that all of the nodes you expect to see are present and that they are all in the Ready state.

kubectl get nodes

To get detailed information about the overall health of your cluster, use the following command:

kubectl cluster-info dump

To delve deeper into cluster troubleshooting, see the Kubernetes Troubleshoot Clusters documentation, which explains which logs to examine.
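
Because the dump is large, it can be easier to write it to a directory of files and search it there:

kubectl cluster-info dump --output-directory=/path/to/cluster-state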

GitLab Requirements

In order to deploy GitLab on Kubernetes, ensure the setup meets the chart's minimum kubectl and Helm versions.

Going back to the generated KubeSOS output, confirm this by checking:

kubectl-check

Shows the version of kubectl installed:

% more kubectl-check

helm-version

Shows the Helm version in use:

% more helm-version
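
The same versions can also be checked directly on the workstation that ran the collection:

kubectl version --client
helm version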

Debugging Pods

Check the current state of the pods in the get_pods file. All pods should be in the Running or Completed state.

% more get_pods
NAME                                                   READY   STATUS      RESTARTS   AGE
gitlab-gitaly-0                                        1/1     Running     0          12m
gitlab-gitlab-exporter-586ccff5fb-6g67t                1/1     Running     0          12m
gitlab-gitlab-grafana-app-6bf7db585b-75fgt             2/2     Running     0          12m
gitlab-gitlab-shell-7547c6457-54xm8                    1/1     Running     0          12m
gitlab-gitlab-shell-7547c6457-lmfb2                    1/1     Running     0          12m
gitlab-migrations-76-gq7pq                             0/1     Completed   0          12m
gitlab-nginx-ingress-controller-786f5f5ddd-ktwh5       1/1     Running     0          12m
gitlab-nginx-ingress-controller-786f5f5ddd-nqk2w       1/1     Running     0          12m
gitlab-nginx-ingress-default-backend-7ff88b95f-rv6jt   1/1     Running     0          12m
gitlab-prometheus-server-5b47c879b4-g9xvb              2/2     Running     0          12m
gitlab-redis-master-0                                  2/2     Running     0          12m
gitlab-registry-7f874b748d-bxtnh                       1/1     Running     0          12m
gitlab-registry-7f874b748d-rs246                       1/1     Running     0          12m
gitlab-sidekiq-all-in-1-v1-95cbfd5f7-546sq             1/1     Running     0          12m
gitlab-task-runner-6f7dcc8897-psscr                    1/1     Running     0          12m
gitlab-webservice-default-659fdddb9b-cmrbd             2/2     Running     0          12m
gitlab-webservice-default-659fdddb9b-hhctp             2/2     Running     0          12m

Any pod in Pending status indicates a possible problem, which you can confirm by checking the recent events in the describe_pods file. A pod stuck in Pending cannot be scheduled onto a node, often because the cluster lacks resources such as CPU or memory. More on this in the Kubernetes Debugging Pods documentation.
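
To investigate a Pending pod directly against the cluster, the following commands are useful (the pod name and namespace are placeholders):

kubectl describe pod <pod-name> -n <namespace>
kubectl get events -n <namespace> --sort-by=.metadata.creationTimestamp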

Services

For services, the main thing to confirm is that the load balancer has been assigned an external IP and is not stuck in a pending state.

% more get_services | grep gitlab-nginx-ingress-controller | grep LoadBalancer

gitlab-nginx-ingress-controller LoadBalancer   172.20.34.155   36.0.0.25   80:32220/TCP,443:30038/TCP,22:30963/TCP   45d

Or, in AWS:

% more get_services | grep gitlab-nginx-ingress-controller | grep LoadBalancer

gitlab-nginx-ingress-controller LoadBalancer   172.20.164.155   a48.eu-west-5.elb.amazonaws.com   80:32220/TCP,443:30038/TCP,22:30963/TCP   45d

Further checks involve confirming that all of the services have been assigned endpoints:

% more endpoints
NAME                                      ENDPOINTS                                           AGE
gitlab-cert-manager                       10.16.4.130:9402                                    23h
gitlab-gitaly                             10.16.4.164:8075                                    23h
gitlab-gitlab-exporter                    10.16.4.155:9168                                    23h
gitlab-gitlab-pages                       10.16.4.156:8090                                    23h
gitlab-gitlab-shell                       10.16.2.36:2222,10.16.4.162:2222                    23h
gitlab-grafana-app                        10.16.4.131:3000                                    23h
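
The same information can be pulled live from the cluster (namespace assumed to be gitlab):

kubectl get svc -n gitlab
kubectl get endpoints -n gitlab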

Ingress

Ingress exposes HTTP and HTTPS routes from outside the cluster to services within the cluster, with traffic routing controlled by rules defined on the Ingress resource. Confirm that the hosts and address are configured correctly and that the Ingress has been assigned an IP address.

% kubectl get ingress
% more describe_ingress
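
It is also worth confirming that the configured GitLab hostname resolves to the Ingress address; gitlab.example.com below is a placeholder for your own host:

nslookup gitlab.example.com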

Deployments

To quickly confirm which applications are set up, check the following file:

% more get_deployments
NAME                                   READY   UP-TO-DATE   AVAILABLE   AGE
gitlab-cainjector                      1/1     1            1           9d
gitlab-cert-manager                    1/1     1            1           9d
gitlab-gitlab-exporter                 1/1     1            1           9d
gitlab-gitlab-pages                    1/1     1            1           9d
gitlab-gitlab-runner                   1/1     1            1           9d
gitlab-gitlab-shell                    2/2     2            2           9d
gitlab-grafana-app                     1/1     1            1           9d
gitlab-kas                             2/2     2            2           9d
gitlab-minio                           1/1     1            1           9d
gitlab-nginx-ingress-controller        2/2     2            2           9d
gitlab-nginx-ingress-default-backend   1/1     1            1           9d
gitlab-prometheus-server               1/1     1            1           9d
gitlab-registry                        2/2     2            2           9d
gitlab-sidekiq-all-in-1-v1             1/1     1            1           9d
gitlab-task-runner                     1/1     1            1           9d
gitlab-webservice-default              2/2     2            2           9d

If any of the deployments are not ready, use the describe_deployments file to check the reason for the failure. It is also worth checking for errors in the describe_pods file.

% more describe_deployments
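
Against a live cluster, the rollout state of an individual deployment can be checked directly; the deployment name below is taken from the list above and the namespace is assumed:

kubectl rollout status deployment/gitlab-webservice-default -n gitlab
kubectl describe deployment gitlab-webservice-default -n gitlab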

Persistent Volumes and Claims

GitLab uses persistent volumes to store data, so if any of the pods are in Pending status, check that the volumes exist and that their status is Bound. Confirm the amount of space allocated to each volume and, if required, allocate more.

% more get_pvc
NAME                            STATUS   VOLUME        CAPACITY ACCESS MODES   STORAGECLASS   AGE
data-gitlab-postgresql-0        Bound    pvc-44c3643e-  8Gi        RWO         standard       9d
gitlab-minio                    Bound    pvc-8a739402-  10Gi       RWO         standard       9d
gitlab-prometheus-server        Bound    pvc-29fc7b9d-  8Gi        RWO         standard       9d
redis-data-gitlab-redis-master  Bound    pvc-b9b67a9d-  8Gi        RWO         standard       9d
repo-data-gitlab-gitaly-0       Bound    pvc-af7ca188-  50Gi       RWO         standard       9d
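
If a volume is running out of space and the storage class supports volume expansion (allowVolumeExpansion: true), the claim can be resized by patching its storage request; the claim name, namespace, and new size below are examples only:

kubectl patch pvc repo-data-gitlab-gitaly-0 -n gitlab -p '{"spec":{"resources":{"requests":{"storage":"100Gi"}}}}'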

User supplied values

The user_supplied_values.yaml file lists all of the user-supplied values that were set while installing GitLab. This is helpful for confirming that the supplied values are correct, since they override the chart defaults. The all_values.yaml file contains all of the values that were used to set up GitLab.
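
The same values can be retrieved directly from the cluster with Helm (release name and namespace assumed to be gitlab):

helm get values gitlab -n gitlab
helm get values gitlab -n gitlab --all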

Application logs

Finally, KubeSOS collects all of the application logs, which can be used to debug specific application issues.

% ls -alrt *.log
-rw-r--r--  1 staff  staff   100356 Jun 20 14:45 cainjector.log
-rw-r--r--  1 staff  staff    33000 Jun 20 14:45 cert-manager.log
-rw-r--r--  1 staff  staff  1274320 Jun 20 14:45 gitaly.log
-rw-r--r--  1 staff  staff    56873 Jun 20 14:45 gitlab-exporter.log
-rw-r--r--  1 staff  staff     1606 Jun 20 14:45 gitlab-gitlab-runner.log
-rw-r--r--  1 staff  staff      595 Jun 20 14:45 gitlab-pages.log
-rw-r--r--  1 staff  staff     4394 Jun 20 14:45 gitlab-shell.log
-rw-r--r--  1 staff  staff     1462 Jun 20 14:45 kas.log
-rw-r--r--  1 staff  staff        0 Jun 20 14:45 minio.log
-rw-r--r--  1 staff  staff  1203696 Jun 20 14:45 nginx-ingress.log
-rw-r--r--  1 staff  staff     3456 Jun 20 14:45 postgresql.log
-rw-r--r--  1 staff  staff    10789 Jun 20 14:45 prometheus.log
-rw-r--r--  1 staff  staff     4757 Jun 20 14:45 redis.log
-rw-r--r--  1 staff  staff     6158 Jun 20 14:45 registry.log
-rw-r--r--  1 staff  staff  1166489 Jun 20 14:45 sidekiq.log
-rw-r--r--  1 staff  staff     1955 Jun 20 14:45 task-runner.log
-rw-r--r--  1 staff  staff  4750701 Jun 20 14:45 webservice.log
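
To follow a component live instead of reading the collected file, use kubectl logs; the pod name below is taken from the get_pods output above and the namespace is assumed:

kubectl logs gitlab-sidekiq-all-in-1-v1-95cbfd5f7-546sq -n gitlab --tail=100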