

A meta issue to track the various issues listed here is on the infrastructure tracker.

Speed index


Performance of GitLab and GitLab.com is ultimately about the user experience. As described in the product management handbook, "faster applications are better applications".

Our target is: an average Speed Index of less than 2 seconds for GitLab.com.

The Speed Index is "the average time at which visible parts of the page are displayed".

There are many other performance metrics that can be useful in analyzing and prioritizing work; some of those are discussed in the sections below. But the user-experienced Speed Index is the target for the site as a whole, and should be what everything ties back to in the end.
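To make the definition concrete: the Speed Index is the area above the page's visual-completeness curve over time (lower is better). The following sketch approximates it from hypothetical frame samples; the sample data is illustrative, not a real measurement:

```ruby
# Speed Index ≈ integral over time of (1 - visual completeness).
# `frames` is a sorted list of [timestamp_ms, visual_completeness 0.0..1.0]
# pairs; the sample data below is hypothetical, not a real measurement.
def speed_index(frames)
  frames.each_cons(2).sum do |(t0, vc0), (t1, _)|
    (1.0 - vc0) * (t1 - t0)
  end
end

frames = [[0, 0.0], [500, 0.4], [1200, 0.9], [2000, 1.0]]
puts speed_index(frames).round # prints 1000 (ms) for these samples
```

A page that paints most of its visible content early scores well even if "Fully Loaded" comes much later, which is why Speed Index is a better proxy for perceived performance than total load time.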

In everything that follows, times are measured from a single geo-location (in Europe) using "Cable" connectivity for that location (5/1 Mbps).

Past and Current Performance

The URLs listed in the table below form the basis for measuring performance improvements - these are heavy use cases. The times indicate time passed from web request to "the average time at which visible parts of the page are displayed" (per the definition of Speed Index). Since the "user" of these URLs is a controlled entity in this case, this represents an external measure of the "Speed Index".

Type 2018-04 2019-09 2020-02 Now*
Issue List: GitLab FOSS Issue List 2872 1197 - N/A
Issue List: GitLab Issue List     1581
Issue: GitLab FOSS #4058 2414 1332 1954
Issue Boards: GitLab FOSS repo boards 3295 1773 - N/A
Issue Boards: GitLab repo boards     2619
Merge request: GitLab FOSS !9546 27644 2450 1937
Pipelines: GitLab FOSS pipelines 1965 4098 - N/A
Pipelines: GitLab pipelines     4289
Pipeline: GitLab FOSS pipeline 9360254 4131 2672 2546
Project: GitLab FOSS project 3909 1863 - N/A
Project: GitLab project     1533
Repository: GitLab FOSS Repository 3149 1571 - N/A
Repository: GitLab Repository     1867
Single File: GitLab FOSS Single File Repository 2000 1292 - N/A
Single File: GitLab Single File Repository     2012
Explore: GitLab explore 2346 1354 1336
Snippet: GitLab Snippet 1662597 1681 1082 1378

*To access the sitespeed grafana dashboards you need to be logged into your Google account

Note: Since this table spans the time before and after the move to a single codebase, we kept GitLab FOSS pages next to the corresponding GitLab ones to enable comparisons, even though they are not exactly the same project.

All Sitespeed Dashboards

Sitespeed - Site summary

Sitespeed - Page summary

Sitespeed - Page timing summaries

If you activate the runs toggle, you will see annotations with links to all full reports. Currently we run measurements every 2 hours.


Web Request

All items that start with the tachometer symbol represent a step in the flow that we measure. Wherever possible, the tachometer icon links to the relevant dashboard in our monitoring. Each step in the listing below links back to its corresponding entry in the goals table.

Consider the scenario of a user opening their browser and navigating to their GitLab.com dashboard by typing the address; here is what happens:

  1. User request
    1. User enters the address in their browser and hits Enter
    2. Lookup IP in DNS (not measured)
      • Browser looks up IP address in DNS server
      • DNS request goes out and comes back (typically ~10-20 ms, [data?]; often it is already cached, in which case it is faster).
      • For more details on the steps from browser to application, enjoy reading
    3. Browser to Azure LB (not measured)
      • Now that the browser knows the IP address, it sends the web request to Azure's load balancer (LB).
  2. Backend processes
    1. Azure LB to HAProxy (not measured)
      • Azure's load balancer determines where to route the packet (request), and sends the request to our Frontend Load Balancer(s) (also referred to as HAProxy).
    2. HAProxy SSL with browser (not measured)
      • HAProxy (load balancer) does SSL negotiation with the browser
    3. HAProxy to NGINX (not measured)
      • HAProxy forwards the request to NGINX in one of our front end workers. In this case, since we are tracking a web request, it would be the NGINX box in the "Web" box in the production-architecture diagram; but alternatively the request can come in via API or a git command from the command line, hence the API, and git "boxes" in that diagram.
      • Since all of our servers are in ONE Azure VNET, the overhead of SSL handshake and teardown between HAProxy and NGINX should be close to negligible.
    4. NGINX buffers request (not measured)
      • NGINX gathers all network packets related to the request ("request buffering"). The request may be split into multiple packets by the intervening network, for more on that, read up on MTUs.
      • In other flows, this won't be true. Specifically, request buffering is switched off for LFS.
    5. NGINX to Workhorse (not measured)
      • NGINX forwards the full request to Workhorse (in one combined request).
    6. Workhorse distributes request
      • Workhorse splits the request into parts to forward to:
      • Unicorn. Time spent waiting for Unicorn to pick up a request is HTTP queue time.
      • Gitaly [not in this scenario, but not measured in any case]
      • NFS (git clone through HTTP) [not in this scenario, but not measured in any case]
      • Redis (long polling) [not in this scenario, but not measured in any case]
    7. Unicorn calls services
      • Unicorn (often just called "Rails", or the "application server") translates the request into a Rails controller request; in this case RootController#index. The round-trip time it takes for a request to start in Unicorn and leave Unicorn is what we call Transaction Timings. RailsController requests are sent to (and data is received from):
      • PostgreSQL (SQL timings),
      • NFS (git timings),
      • Redis (cache timings).
      • In this example, the controller addresses all three.
      • There are usually multiple SQL (or file, or cache, etc.) calls for a given controller request. These add to the overall timing, especially since they are sequential. For example, in this scenario, there are 29 SQL calls (search for Load) when this particular user hits the dashboard. The number of SQL calls will depend on how many projects the person has, how much may already be in cache, etc.
      • Rails tackles the steps within a controller request sequentially. In other words, if it needs to make calls out to the database and to git, it is not set up to do those in parallel, but rather has to wait for the response to the first step before proceeding to the next step.
      • In the Rails stack, middleware typically adds to the number of round trips to Redis, NFS, and PostgreSQL, per controller call, in addition to the timings of Rails controllers. Middleware is used for {session state, user identity, endpoint authorization, rate limiting, logging, etc} while the controllers typically have at least one round trip for each of {retrieve settings, cache check, build model views, cache store, etc.}. Each such roundtrip is estimated to take < 10 ms.
    8. Unicorn constructs Views
      • The construction of views can take a long time (view timings). In some controllers, data is gathered first after which a view is constructed. In other controllers, data is gathered from within a View, so that the view timing in those cases includes the time it took to call NFS, PostgreSQL, Redis, etc. And in many cases, both are done.
      • A particular view in Rails will often be constructed from multiple partial views. These will be used from a template file, specified by the controller action, that is, itself, generally included within a layout template. Partials can include other partials. This is done for good code organization and reuse. As an example, when the particular user from the example above loads the page, there are 56 nested / partial views rendered (search for View::).
      • Partial views may be cached via various Rails techniques, such as Fragment Caching. In addition, GitLab has a Markdown cache stored in the database that is used to speed up the conversion of Markdown to HTML.
      • Perceived performance, in the form of First Paint, can be affected by how much of the content of a view is rendered by the backend vs. sending a "minimal" HTML blob to the user and relying on JavaScript / AJAX / etc. to fetch additional elements that take the page from First Paint to "Fully Loaded". See the section about the frontend for more on this.
    9. Unicorn makes HTML (not measured)
      • Once the Views are built, Unicorn completes making the "HTML blob" that is then returned to the browser.
      • Some of these blobs are expensive to compute, and are sometimes hard-coded to be sent from Unicorn to Redis (i.e. to cache) once rendered.
    10. HTML to Browser (not measured)
  3. Render Page
    1. First Byte
      • The time when the browser receives the first byte. In addition to everything in the backend, this also depends on network speed. In the dashboard linked to by the tachometer above, First Byte is measured from a Digital Ocean box in the US with relatively little network lag thus representing an estimate of internal First Byte. Past performance on first byte is recorded elsewhere on this page.
      • For any page, you can use your browser's "inspect" tool to look at "TTFB" (time to first byte).
      • First Byte - External is measured for a hand selected number of URLs using SiteSpeed
    2. Speed Index
      • Browser parses the HTML blob and sends out further requests to fetch assets such as JavaScript bundles, CSS, images, and webfonts.
      • The timing of this step depends (amongst other things) on the number and the size of assets, as well as network speed. For each static asset, there is a round-trip of:
        • for cached assets: browser → NGINX → NGINX confirms the cached asset is still valid → browser
        • for non-cached or expired cached assets: browser → Workhorse → Workhorse grabs the asset from its local cache → browser
        • for a page that is served through GitLab Pages: browser → Pages daemon (an independent service in the architecture) → browser
      • Stylesheets can block page rendering by default, which can lead to unnecessary delays in page rendering.
      • Starting in 9.5, scripts no longer block rendering, as they are loaded with defer="true"; they are parsed and executed in the order they are called, but only after the HTML + CSS has been rendered.
      • Enough meaningful content is rendered on screen to calculate the "Speed Index".
    3. Fully Loaded
      • When the scripts are loaded, JavaScript compiles and evaluates them within the page.
      • On some pages, we use AJAX to allow for async loading. The AJAX call can be triggered by all kinds of things, for example a frontend element (a button) or the DOMContentLoaded event. The new call is for a new URL, and such requests are routed through either the Web or API workers, invoke their respective Rails controllers on the backend, and return the requested files (HTML, JSON, etc.). For example, the calendar and activity feeds on a username page are two separate AJAX calls, triggered by DOMContentLoaded. (The DOMContentLoaded event "marks the point when both the DOM is ready and there are no stylesheets that are blocking JavaScript execution", taken from an article about the critical rendering path.) The alternative to using AJAX would be to include the full Rails code to generate the calendar and activity feed within the same controller that is called by the URL, which would lead to a slower First Paint since it simply involves more calls to the database etc.
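The sequential nature of the backend steps above means per-call latencies add up. A back-of-the-envelope model, using assumed (not measured) per-round-trip costs, illustrates why the number of calls matters as much as the speed of each call:

```ruby
# Illustrative model only: assumed per-round-trip costs in milliseconds.
ROUND_TRIP_MS = { sql: 5, redis: 1, nfs: 8 }.freeze

# Total latency when calls run sequentially (as a Rails controller does here).
def sequential_cost(calls)
  calls.sum { |service, count| ROUND_TRIP_MS.fetch(service) * count }
end

# e.g. a request issuing 29 SQL loads and a handful of cache checks:
puts sequential_cost(sql: 29, redis: 6) # prints 151 (ms) under these assumptions
```

Under this model, halving the per-query cost and halving the number of queries give a similar payoff, which is why both slow queries and N+1 query patterns are performance targets.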
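The render-once, serve-from-cache idea behind the Markdown cache and fragment caching can be shown with a minimal sketch. This is not GitLab's actual implementation (the real caches live in Redis and the database, and the real renderer is far more involved); the helper names are illustrative:

```ruby
require 'digest'

# Stand-in for the Redis / database-backed cache GitLab actually uses.
CACHE = {}

# Render Markdown to HTML once, then serve the stored result on later calls.
def cached_render(markdown)
  key = Digest::SHA256.hexdigest(markdown)
  CACHE[key] ||= expensive_markdown_to_html(markdown)
end

# Placeholder for the real Markdown renderer.
def expensive_markdown_to_html(markdown)
  "<p>#{markdown.strip}</p>"
end

puts cached_render("hello world") # rendered and stored
puts cached_render("hello world") # served from cache; renderer not called again
```

Keying the cache on a digest of the source means any edit to the Markdown naturally produces a fresh render, while unchanged fragments skip the expensive conversion.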

Git Commit Push

First read about the steps in a web request above, then pick up the thread here.

After pushing to a repository, e.g. from the web UI:

  1. In a web browser, make an edit to a repo file, type a commit message, and hit "Commit"
  2. NGINX receives the git commit and passes it to Workhorse
  3. Workhorse launches a git-receive-pack process (on the workhorse machine) to save the new commit to NFS
  4. On the workhorse machine, git-receive-pack fires a git hook to trigger GitLab Shell.
    • GitLab Shell accepts Git payloads pushed over SSH and acts upon them (e.g. by checking if you're authorized to perform the push, scheduling the data for processing, etc).
    • In this case, GitLab Shell provides the post-receive hook, and the git-receive-pack process passes along details of what was pushed to the repo to the post-receive hook. More specifically, it passes a list of three items: old revision, new revision, and ref (e.g. tag or branch) name.
  5. Workhorse then passes the post-receive hook to Redis, which is the Sidekiq queue.
    • Workhorse is informed whether the push succeeded or failed (it could fail due to the repo being unavailable, Redis being down, etc.)
  6. Sidekiq picks up the job from Redis and removes the job from the queue
  7. Sidekiq updates PostgreSQL
  8. Unicorn can now query PostgreSQL.
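The data handed to the post-receive hook in step 4 can be sketched as follows. The parsing helper and the sample revisions are illustrative, not GitLab Shell's actual code; only the line format (old revision, new revision, ref name) comes from the description above:

```ruby
# Each line on the hook's standard input has the form
# "<old revision> <new revision> <ref name>". Sample values are made up.
def parse_post_receive(lines)
  lines.map do |line|
    oldrev, newrev, ref = line.split(' ', 3)
    { oldrev: oldrev, newrev: newrev, ref: ref }
  end
end

updates = parse_post_receive(["aaaa1111 bbbb2222 refs/heads/main"])
p updates.first[:ref] # prints "refs/heads/main"
```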


Web Request

Consider the scenario of a user opening their browser, and surfing to their favorite URL on GitLab.com. The steps are described in the section on "web request". In this table, the steps are measured and goals for improvement are set.

Guide to this table:

Step # per request p99 Q2-17 p99 Now p99 Q3-17 goal Issue links and impact
USER REQUEST          
Lookup IP in DNS 1 ~10 ? ~10 Use a second DNS provider
Browser to Azure LB 1 ~10 ? ~10  
BACKEND PROCESSES         Extend monitoring horizon
Azure LB to HAProxy 1 ~2 ? ~2  
HAProxy SSL with Browser 1 ~10 ? ~10 Speed up SSL
HAProxy to NGINX 1 ~2 ? ~2  
NGINX buffers request 1 ~10 ? ~10  
NGINX to Workhorse 1 ~2 ? ~2  
Workhorse distributes request 1       Adding monitoring to workhorse
    Workhorse to Unicorn 1 18 10 Adding Unicorns
    Workhorse to Gitaly     ?    
    Workhorse to NFS     ?    
    Workhorse to Redis     ?    
Unicorn calls services 1 2500 1000 Allow more GitLab internals monitoring
    Unicorn Postgres   250 100 Speed up slow queries
    Unicorn NFS   460 200 Move to Gitaly - sample result
    Unicorn Redis   18    
Unicorn constructs Views   1500    
Unicorn makes HTML          
HTML to Browser          
    Unicorn to Workhorse 1 ~2 ? ~2  
    Workhorse to NGINX 1 ~2 ? ~2  
    NGINX to HAProxy 1 ~2 ? ~2 Compress HTML in NGINX
    HAProxy to Azure LB 1 ~2 ? ~2  
    Azure LB to Browser 1 ~20 ? ~20  
RENDER PAGE          
FIRST BYTE (see note 1)   1080 - 6347 1000  
SPEED INDEX (see note 2)   3230 - 14454 2000 Remove inline scripts, Defer script loading when possible, Lazy load images, Set up a CDN for faster asset loading, Use image resizing in CDN
Fully Loaded (see note)   6093 - 14003 not specified Enable webpack code splitting


Git Commit Push

Table to be built; merge requests welcome!


For any performance metric, the following modifiers can be applied:

First byte


Timing history for First Byte is listed in the table below (click on the tachometer icons for current timings). All times are in milliseconds.

Type End of Q4-17 Now
Issue: GitLab CE #4058 857
Merge request: GitLab CE !9546 18673
Pipeline: GitLab CE pipeline 9360254 1529
Repo: GitLab CE repo 1076


To go a little deeper and measure performance of the application and infrastructure without consideration for frontend and network aspects, we look at "transaction timings" as recorded by Unicorn. These timings can be seen on the Rails Controller dashboard for each URL that is accessed.

For instance, to get the transaction timing for the merge request referenced above first visit the merge request page, then visit the Rails Controller dashboard and scroll down to the Transaction Details table. We do not currently have time series graphs per URL nor do we have specific targets in terms of what this timing should be.

Availability and Performance Prioritization


Issues with the ~availability label directly impact the availability of GitLab.com. They are considered another category of ~bug.

We categorize these issues based on their impact on GitLab.com customers' business goals and day-to-day workflows.

The prioritization scheme adheres to our product prioritization where security and availability work are prioritized over feature velocity.

These severity assessments modify the standard severity labels (~S1, ~S2, ~S3, ~S4) by additionally taking into account the impact described below. The severity of these issues may change upon re-analysis of the impact to customers.

Severity Availability impact Reproducibility Time to resolve (TTR) Deployment target Minimum priority
~S1 Roadblock on GitLab.com, blocking customers' business goals and day-to-day workflows Consistently reproducible Within 48 hrs Hotfix to GitLab.com ~P1
~S2 Significant impact on GitLab.com and customers' day-to-day workflows. Customers have an acceptable workaround in place. Consistently reproducible Within 5 business days Next deployment window after resolution ~P1
~S3 Broad impact on GitLab.com, but minor inconvenience to customers' day-to-day workflows. No workaround needed. Inconsistently reproducible Within 30 days Next release after resolution ~P2
~S4 Minimal impact on GitLab.com; no known customers affected Inconsistently reproducible Within 60 days Next release after resolution ~P3

To call out specifics on what priorities can be set on an availability issue, please refer to the prioritization band table below.

Issue with the labels Allowed priorities Not-allowed priorities
~availability ~S1 ~P1 only ~P2, ~P3, and ~P4
~availability ~S2 ~P1 only ~P2, ~P3, and ~P4
~availability ~S3 ~P2 as baseline, ~P1 allowed ~P3, and ~P4
~availability ~S4 ~P3 as baseline, ~P2 and ~P1 allowed ~P4
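The prioritization band table above can be encoded as data so the constraint is mechanically checkable. This is an illustrative sketch, not an actual GitLab tool:

```ruby
# The band table above, encoded as severity => allowed priorities.
ALLOWED_PRIORITIES = {
  'S1' => %w[P1],
  'S2' => %w[P1],
  'S3' => %w[P1 P2],
  'S4' => %w[P1 P2 P3]
}.freeze

def valid_priority?(severity, priority)
  ALLOWED_PRIORITIES.fetch(severity, []).include?(priority)
end

puts valid_priority?('S1', 'P1') # prints true
puts valid_priority?('S3', 'P3') # prints false (not an allowed priority)
```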


To clarify the priority of issues that relate to GitLab.com's performance, you should add the ~performance label as well as a severity label. There are two factors that influence which severity label you should pick:

  1. How frequently something is used.
  2. How likely it is for something to cause an outage.

For strictly performance related work you can use the Controller Timings Overview Grafana dashboard. This dashboard categorises data into three different categories, each with their associated severity label:

  1. Frequently Used: ~S2
  2. Commonly Used: ~S3
  3. Rarely Used: ~S4

This means that if a controller (e.g. UsersController#show) is in the "Frequently Used" category you assign it the ~S2 label.
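For illustration, the category-to-severity mapping above expressed as data (a sketch, not part of GitLab's tooling):

```ruby
# Usage-frequency category => severity label, per the list above.
SEVERITY_BY_USAGE = {
  'Frequently Used' => 'S2',
  'Commonly Used'   => 'S3',
  'Rarely Used'     => 'S4'
}.freeze

puts SEVERITY_BY_USAGE.fetch('Frequently Used') # prints S2
```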

For database related timings you can also use the SQL Timings Overview. This is the dashboard primarily used by the Database Team to determine the AP label to use for database related performance work.

Database Performance

Some general notes about parameters that affect database performance, at a very crude level.