Published on: June 25, 2025
Discover how this new GitLab feature lets you find exact matches, use regex patterns, and see contextual results across terabytes of code.
TL;DR: What if you could find any line of code across 48 TB of repositories in milliseconds? GitLab's new Exact Code Search makes this possible, delivering pinpoint precision, powerful regex support, and contextual multi-line results that transform how teams work with large codebases.
Anyone who works with code knows the frustration of searching across repositories. Whether you're a developer debugging an issue, a DevOps engineer examining configurations, a security analyst searching for vulnerabilities, a technical writer updating documentation, or a manager reviewing implementation, you know exactly what you need, but traditional search tools often fail you.
These conventional tools return dozens of false positives, lack the context needed to understand results, and slow to a crawl as codebases grow. The result? Valuable time spent hunting for needles in haystacks instead of building, securing, or improving your software.
GitLab's code search functionality has historically been backed by Elasticsearch or OpenSearch. While these are excellent for searching issues, merge requests, comments, and other data containing natural language, they weren't specifically designed for code. After evaluating numerous options, we developed a better solution.
Enter GitLab's Exact Code Search, currently in beta testing and powered by Zoekt (pronounced "zookt", Dutch for "search"). Zoekt is an open-source code search engine originally created by Google and now maintained by Sourcegraph, specifically designed for fast, accurate code search at scale. We've enhanced it with GitLab-specific integrations, enterprise-scale improvements, and seamless permission system integration.
This feature revolutionizes how you find and understand code with three key capabilities:
1. Exact Match mode: Zero false positives
When toggled to Exact Match mode, the search engine returns only results that match your query exactly as entered, eliminating false positives. This precision is invaluable when:
2. Regular Expression mode: Powerful pattern matching
For complex search needs, Regular Expression mode allows you to craft sophisticated search patterns:
3. Multiple-line matches: See code in context
Instead of seeing just a single line with your matching term, you get the surrounding context that's crucial for understanding the code. This eliminates the need to click through to files for basic comprehension, significantly accelerating your workflow.
Let's see how these capabilities translate to real productivity gains in everyday development scenarios:
Before Exact Code Search: Copy an error message, search, wade through dozens of partial matches in comments and documentation, click through multiple files, and eventually find the actual code.
With Exact Code Search:
Impact: Reduce debugging time from minutes to seconds, eliminating the frustration of false positives.
Before Exact Code Search: Browse through directories, make educated guesses about file locations, open dozens of files, and slowly build a mental map of the codebase.
With Exact Code Search:
Impact: Build a mental map of code structure in minutes rather than hours, dramatically accelerating onboarding and cross-team collaboration.
Before Exact Code Search: Attempt to find all instances of a method, miss some occurrences, and introduce bugs through incomplete refactoring.
With Exact Code Search:
Impact: Eliminate the "missed instance" bugs that often plague refactoring efforts, improving code quality and reducing rework.
Security teams can:
Impact: Transform security audits from manual, error-prone processes to systematic, comprehensive reviews.
Search across your entire namespace or instance to:
Impact: Break down silos between projects and identify opportunities for code reuse and standardization.
Before diving into our scale achievements, let's explore what makes Zoekt fundamentally different from traditional search engines — and why it can find exact matches so incredibly fast.
Zoekt's speed comes from its use of positional trigrams — a technique that indexes every sequence of three characters along with their exact positions in files. This approach solves one of the biggest pain points developers have had with Elasticsearch-based code search: false positives.
Here's how it works:
Traditional full-text search engines like Elasticsearch tokenize code into words and lose positional information. When you search for getUserId(), they might return results containing user, get, and Id scattered throughout a file, leading to those frustrating false positives for GitLab users.
Zoekt's positional trigrams maintain exact character sequences and their positions. When you search for getUserId(), Zoekt looks for the exact trigrams get, etU, tUs, Use, ser, erI, rId, Id(, and d(), all in the correct sequence and position. This ensures that only exact matches are returned.
The result? Search queries that previously returned hundreds of irrelevant results now return only the precise matches you're looking for. This was one of our most requested features for good reason: developers were losing significant time sifting through false positives.
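To make the positional trigram idea concrete, here is a minimal Python sketch. It is not Zoekt's implementation or index format; the function names (`build_index`, `exact_search`) and the in-memory dictionary are assumptions chosen for illustration. It records every trigram with its offset, then accepts a match only when each trigram of the query appears at the expected consecutive offset.

```python
from collections import defaultdict


def trigrams(text):
    """Yield (offset, trigram) pairs for every 3-character window in text."""
    for i in range(len(text) - 2):
        yield i, text[i:i + 3]


def build_index(files):
    """Map each trigram to the set of (file name, offset) positions where it occurs."""
    index = defaultdict(set)
    for name, content in files.items():
        for offset, gram in trigrams(content):
            index[gram].add((name, offset))
    return index


def exact_search(index, query):
    """Return sorted (file name, offset) pairs where query occurs verbatim.

    Illustrative only: queries shorter than three characters are rejected
    because they contain no trigram to look up.
    """
    if len(query) < 3:
        raise ValueError("query must contain at least one trigram")
    grams = list(trigrams(query))
    first = grams[0][1]
    hits = []
    # Start from every position of the first trigram, then require each
    # later trigram to sit at the matching relative offset in the same file.
    for name, start in index.get(first, ()):
        if all((name, start + off) in index.get(gram, ()) for off, gram in grams[1:]):
            hits.append((name, start))
    return sorted(hits)


files = {
    "user.py": "def getUserId():\n    return user.id\n",
    "notes.md": "get the user Id from the session\n",
}
index = build_index(files)
print(exact_search(index, "getUserId()"))  # [('user.py', 4)]
```

The second file contains get, user, and Id as separate words, which is what a word-based engine might match, but it never contains the trigram etU next to get, so it is not returned.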
Zoekt excels at exact matches and is optimized for regular expression searches. The engine uses sophisticated algorithms to convert regex patterns into efficient trigram queries when possible, maintaining speed even for complex patterns across terabytes of code.
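The sketch below illustrates the general idea of turning a regex into a trigram prefilter. It is deliberately narrow and is not Zoekt's algorithm: it only understands patterns made of literal pieces joined by `.*`, and every name in it (`required_literals`, `may_match`, `regex_search`) is hypothetical. Files that lack a trigram of any required literal are skipped before the regex runs at all.

```python
import re

# Characters that make a piece "not a plain literal" for this sketch.
META = set(".^$*+?{}[]\\|()")


def required_literals(pattern):
    """Split a pattern like lit1.*lit2 into its literal pieces.

    Returns None when a piece contains other regex syntax, meaning this
    sketch cannot derive a prefilter and every file must be scanned.
    """
    pieces = [p for p in pattern.split(".*") if p]
    if any(ch in META for piece in pieces for ch in piece):
        return None
    return pieces


def trigram_set(text):
    """Return the set of all 3-character substrings of text."""
    return {text[i:i + 3] for i in range(len(text) - 2)}


def may_match(file_trigrams, literals):
    """A file can only match if it contains every trigram of every literal."""
    return all(trigram_set(lit) <= file_trigrams for lit in literals)


def regex_search(files, pattern):
    """Return (file name, line number) pairs whose line matches pattern."""
    literals = required_literals(pattern)
    regex = re.compile(pattern)
    results = []
    for name, content in files.items():
        # A real index would store the trigram set; here it is recomputed.
        if literals is not None and not may_match(trigram_set(content), literals):
            continue  # cheap rejection without running the regex
        for lineno, line in enumerate(content.splitlines(), start=1):
            if regex.search(line):
                results.append((name, lineno))
    return results


files = {
    "a.py": "def parse_config(path):\n    pass\n",
    "b.py": "config = parse(path)\n",
    "c.txt": "nothing here\n",
}
print(regex_search(files, "parse_.*path"))  # [('a.py', 1)]
```

The prefilter is only a necessary condition, so the regex still runs on every surviving file; the saving comes from never opening the files that cannot match.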
Exact Code Search is powerful and built to handle massive scale with impressive performance. This is not just a new UI feature — it's powered by a completely reimagined backend architecture.
On GitLab.com alone, our Exact Code Search infrastructure indexes and searches over 48 TB of code data while maintaining lightning-fast response times. That covers millions of repositories across thousands of namespaces. To put this in perspective, it is more code than the Linux kernel, Android, and Chromium projects combined, yet Exact Code Search can find a specific line anywhere in it within milliseconds.
Our innovative implementation features:
This self-configuring architecture dramatically simplifies scaling. When more capacity is needed, administrators can simply add more nodes without complex reconfiguration.
Behind the scenes, Exact Code Search operates as a distributed system with these key components:
Note: High availability is built into the architecture but not yet fully enabled. See Issue 514736 for updates.
Exact Code Search automatically integrates with GitLab's permission system:
While Zoekt provided the core search technology, it was originally designed as a minimal library for managing .zoekt index files, not as a distributed database or enterprise-scale service. Here are the key engineering challenges we overcame to make it work at GitLab's scale:
The problem: Zoekt was designed to work with local index files, not distributed across multiple nodes serving many concurrent users.
Our solution: We built a comprehensive orchestration layer that:
The problem: How do you efficiently manage terabytes of index data across multiple nodes while ensuring fast updates?
Our solution: We implemented:
gitlab-zoekt binary that can operate in both indexer and webserver modes
The problem: Zoekt had no concept of GitLab's complex permission system, in which users should only see results from projects they can access.
Our solution: We built native permission filtering directly into the search flow:
The problem: Managing a distributed search system shouldn't require a dedicated team.
Our solution:
Rolling out a completely new search backend to millions of users required careful planning. Here's how we minimized customer impact while ensuring reliability:
We started by enabling Exact Code Search only for the gitlab-org group, which holds our own internal repositories. This allowed us to:
Before expanding, we focused on ensuring the system could handle GitLab.com's scale:
We gradually expanded to customers interested in testing Exact Code Search:
gitlab-org/gitlab now index in ~10 seconds)
Today, over 99% of Premium and Ultimate licensed groups on GitLab.com have access to Exact Code Search. Users can:
Rolling this out gradually meant users didn't experience service disruptions, performance degradation, or feature gaps during the transition. We've already received positive feedback from users as they notice their results becoming more relevant and faster.
For a technical deep dive: Interested in the detailed architecture and implementation? Check out our comprehensive design document for in-depth technical details about how we built this distributed search system.
Getting started with Exact Code Search is simple because it's already enabled by default for Premium and Ultimate groups on GitLab.com (over 99% of eligible groups currently have access).
Whether using Exact Match or Regular Expression mode, you can refine your search with modifiers:
Query Example | What It Does |
---|---|
file:js | Searches only in files containing "js" in their name |
foo -bar | Finds "foo" but excludes results with "bar" |
lang:ruby | Searches only in Ruby files |
sym:process | Finds "process" in symbols (methods, classes, variables) |
Pro Tip: For the most efficient searches, start specific and then broaden if needed. Using file: and lang: filters dramatically increases relevance.
Stack multiple filters for precision:
is_expected file:rb -file:spec
This finds "is_expected" in Ruby files that don't have "spec" in their name.
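As a rough mental model of how stacked filters combine, the following Python sketch treats every token as a condition that must hold at the same time. It is an illustration only, not the query parser used by Exact Code Search: `parse_query` and `matches` are hypothetical helpers, only plain terms and `file:` filters (with optional `-` negation) are handled, and file filters are treated as simple substring checks on the path.

```python
def parse_query(query):
    """Split a query into required/excluded text terms and file-name filters."""
    parsed = {"terms": [], "not_terms": [], "files": [], "not_files": []}
    for token in query.split():
        negated = token.startswith("-")
        body = token[1:] if negated else token
        if body.startswith("file:"):
            key, value = "files", body[len("file:"):]
        else:
            key, value = "terms", body
        parsed["not_" + key if negated else key].append(value)
    return parsed


def matches(parsed, path, content):
    """Every positive condition must hold and no negated condition may hold."""
    return (
        all(f in path for f in parsed["files"])
        and not any(f in path for f in parsed["not_files"])
        and all(t in content for t in parsed["terms"])
        and not any(t in content for t in parsed["not_terms"])
    )


repo = {
    "app/models/user.rb": "it { is_expected.to be_valid }",
    "spec/models/user_spec.rb": "it { is_expected.to be_valid }",
    "app/assets/main.js": "is_expected",
}
parsed = parse_query("is_expected file:rb -file:spec")
print([path for path, content in repo.items() if matches(parsed, path, content)])
# ['app/models/user.rb']
```

The spec file is dropped by `-file:spec` and the JavaScript file by `file:rb`, even though all three files contain the search term.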
Use regular expressions for powerful patterns:
token.*=.*[\"']
Watch this search performed against the GitLab Zoekt repository.
This search helps find hardcoded secrets such as tokens and passwords, which are a security risk if they go unnoticed.
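To see what the pattern above does and does not catch, you can try it locally on a few sample lines. The snippet below uses Python's `re` module as a stand-in for the search engine's regex engine, which is an assumption made for illustration; the pattern itself uses only constructs common to most engines, and the sample lines are invented.

```python
import re

# Same pattern as in the search example: "token", later "=", later a quote.
pattern = re.compile(r"token.*=.*[\"']")

samples = [
    'api_token = "abc123"',             # hardcoded value
    "auth_token='s3cr3t'",              # hardcoded value, single quotes
    'token = os.environ["API_TOKEN"]',  # read from the environment
    "token = load_token()",             # no quote after the equals sign
    'password = "hunter2"',             # no "token" on the line
]

flagged = [line for line in samples if pattern.search(line)]
for line in flagged:
    print(line)
```

The first three lines are flagged. The third is not actually a hardcoded secret, which shows that a pattern like this narrows the search for a reviewer rather than replacing the review.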
For more detailed syntax information, check the Exact Code Search documentation.
Exact Code Search is currently in Beta for GitLab.com users with Premium and Ultimate licenses:
For self-managed instances, we offer several deployment methods:
gitlab-zoekt Helm chart
System requirements depend on your codebase size, but the architecture is designed to scale horizontally and/or vertically as your needs grow.
While Exact Code Search is already powerful, we're continuously improving it:
Stay tuned for updates as we move from Beta to General Availability.
GitLab's Exact Code Search represents a fundamental rethinking of code discovery. By delivering exact matches, powerful regex support, and contextual results, it solves the most frustrating aspects of code search:
The impact extends beyond individual productivity:
Exact Code Search isn't just a feature, it's a better way to understand and work with code. Stop searching and start finding.
We'd love to hear from you! Share your experiences, questions, or feedback about Exact Code Search in our feedback issue. Your input helps us prioritize improvements and new features.
Ready to experience smarter code search? Learn more in our documentation or try it now by performing a search in your Premium or Ultimate licensed namespaces or projects. Not a GitLab user yet? Try a free, 60-day trial of GitLab Ultimate with Duo!