Published on: June 5, 2015

How GitLab uses Unicorn and unicorn-worker-killer

We just wrote some new documentation on how GitLab uses Unicorn and unicorn-worker-killer, available on doc.gitlab.com but also included below. We would love to hear from the community if you have other questions so we can improve this documentation resource!

Update 19:29 CEST: made link to doc.gitlab.com more specific.

Understanding Unicorn and unicorn-worker-killer

Unicorn

GitLab uses Unicorn, a pre-forking Ruby web server, to handle web requests (web browsers and Git HTTP clients). Unicorn is a daemon written in Ruby and C that can load and run a Ruby on Rails application; in our case the Rails application is GitLab Community Edition or GitLab Enterprise Edition.

Unicorn has a multi-process architecture to make better use of available CPU cores (processes can run on different cores) and to have stronger fault tolerance (most failures stay isolated in only one process and cannot take down GitLab entirely). On startup, the Unicorn 'master' process loads a clean Ruby environment with the GitLab application code, and then spawns 'workers' which inherit this clean initial environment. The 'master' never handles any requests; that is left to the workers. The operating system network stack queues incoming requests and distributes them among the workers.
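
To make the model concrete, here is a minimal sketch of a pre-forking Unicorn configuration file. The values and paths below are illustrative only; they are not GitLab's shipped defaults.

```ruby
# unicorn.rb -- a minimal sketch of a pre-forking Unicorn configuration.
# Values and paths are illustrative, not GitLab's shipped defaults.

worker_processes 3        # how many worker processes the master forks

# Load the Rails application in the master before forking, so every worker
# inherits the same clean initial environment.
preload_app true

# The OS network stack queues connections on this socket and hands them to
# whichever worker calls accept() next.
listen '/var/run/unicorn.socket', backlog: 1024

# A worker that spends longer than this on one request is killed by the
# master (SIGKILL) and replaced.
timeout 60
```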

In a perfect world, the master would spawn its pool of workers once, and the workers would handle incoming web requests one after another until the end of time. In reality, worker processes can crash or time out: if the master notices that a worker takes too long to handle a request, it will terminate the worker process with SIGKILL ('kill -9'). No matter how the worker process ended, the master process will replace it with a new 'clean' process. Unicorn is designed to be able to replace 'crashed' workers without dropping user requests.

This is what a Unicorn worker timeout looks like in unicorn_stderr.log. The master process has PID 56227 below.


[2015-06-05T10:58:08.660325 #56227] ERROR -- : worker=10 PID:53009 timeout (61s > 60s), killing
[2015-06-05T10:58:08.699360 #56227] ERROR -- : reaped #<Process::Status: pid 53009 SIGKILL (signal 9)> worker=10
[2015-06-05T10:58:08.708141 #62538]  INFO -- : worker=10 spawned pid=62538
[2015-06-05T10:58:08.708824 #62538]  INFO -- : worker=10 ready

Tunables

The main tunables for Unicorn are the number of worker processes and the request timeout after which the Unicorn master terminates a worker process. See the [omnibus-gitlab Unicorn settings documentation](https://gitlab.com/gitlab-org/omnibus-gitlab/blob/master/doc/settings/unicorn.md) if you want to adjust these settings.
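
On omnibus-gitlab installations these settings are controlled from `/etc/gitlab/gitlab.rb`. As a rough sketch (the values below are examples, not recommendations; see the documentation linked above for sizing guidance):

```ruby
# /etc/gitlab/gitlab.rb -- example values only, not recommendations.
unicorn['worker_processes'] = 3   # number of Unicorn worker processes
unicorn['worker_timeout'] = 60    # seconds before the master kills a busy worker
```

After editing the file, apply the change with `sudo gitlab-ctl reconfigure`.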

unicorn-worker-killer

GitLab has memory leaks. These memory leaks manifest themselves in long-running processes, such as Unicorn workers. (The Unicorn master process is not known to leak memory, probably because it does not handle user requests.)

To make these memory leaks manageable, GitLab comes with the unicorn-worker-killer gem. This gem monkey-patches the Unicorn workers to do a memory self-check after every 16 requests. If the memory of the Unicorn worker exceeds a pre-set limit then the worker process exits. The Unicorn master then automatically replaces the worker process.
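
For illustration, this is roughly how the gem is enabled in a Rack `config.ru` file, using the limits described in this post. The exact wiring inside GitLab may differ; treat this as a sketch of the gem's documented usage rather than GitLab's actual file.

```ruby
# config.ru -- sketch of enabling unicorn-worker-killer; the limits are the
# ones described in this post, not necessarily GitLab's exact configuration.
require 'unicorn/worker_killer'

min = 200 * (1024**2)  # 200 MB
max = 250 * (1024**2)  # 250 MB

# Each worker picks a random memory threshold between min and max (so workers
# do not all restart at once) and checks its own memory use every 16 requests,
# the gem's default check cycle. When the threshold is exceeded the worker
# exits between requests and the Unicorn master spawns a replacement.
use Unicorn::WorkerKiller::Oom, min, max

# Boot the Rails application as usual.
require ::File.expand_path('../config/environment', __FILE__)
run Gitlab::Application
```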

This is a robust way to handle memory leaks: Unicorn is designed to handle workers that 'crash' so no user requests will be dropped. The unicorn-worker-killer gem is designed to only terminate a worker process _in between requests_, so no user requests are affected.

This is what a Unicorn worker memory restart looks like in unicorn_stderr.log. You see that worker 4 (PID 125918) is inspecting itself and decides to exit. The threshold memory value was 254802235 bytes, about 250 MB. With GitLab this threshold is a random value between 200 and 250 MB. The master process (PID 117565) then reaps the worker process and spawns a new 'worker 4' with PID 127549.

[2015-06-05T12:07:41.828374 #125918]  WARN -- : #<Unicorn::HttpServer:0x00000002734770>: worker (pid: 125918) exceeds memory limit (256413696 bytes > 254802235 bytes)
[2015-06-05T12:07:41.828472 #125918]  WARN -- : Unicorn::WorkerKiller send SIGQUIT (pid: 125918) alive: 23 sec (trial 1)
[2015-06-05T12:07:42.025916 #117565]  INFO -- : reaped #<Process::Status: pid 125918 exit 0> worker=4
[2015-06-05T12:07:42.034527 #127549]  INFO -- : worker=4 spawned pid=127549
[2015-06-05T12:07:42.035217 #127549]  INFO -- : worker=4 ready

One other thing that stands out in the log snippet above, taken from GitLab.com, is that 'worker 4' was serving requests for only 23 seconds. This is a normal value for our current GitLab.com setup and traffic.

The high frequency of Unicorn memory restarts on some GitLab sites can be a source of confusion for administrators. Usually they are a [red herring](http://en.wikipedia.org/wiki/Red_herring).

