The following page may contain information related to upcoming products, features, and functionality. It is important to note that the information presented is for informational purposes only; please do not rely on it for purchasing or planning purposes. As with all projects, the items mentioned on this page are subject to change or delay, and the development, release, and timing of any products, features, or functionality remain at the sole discretion of GitLab Inc.
| | |
| --- | --- |
| Stage | AI-powered |
| Group | AI Framework |
| Maturity | Available |
| Content Last Reviewed | 2025-06-11 |
The AI Research category identifies and explores AI/ML models to support use cases that other GitLab sections, stages, and groups are developing to enrich the DevSecOps experience for GitLab users.
We continuously evaluate AI/ML model vendors, open source models, and generative AI foundation models. Models showing promising results from our initial research undergo further testing via the AI Evaluation platform. We compare these models against those already supported in our AI Framework and actively powering GitLab Duo features, including self-hosted models.
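At a high level, such a comparison runs the candidate and the incumbent model over the same prompt set and compares aggregate quality scores. The sketch below illustrates that shape only; the scoring interface and model names are illustrative assumptions rather than GitLab's actual evaluation API.

```python
from statistics import mean
from typing import Callable

def compare_models(
    candidate: str,
    incumbent: str,
    prompts: list[str],
    score_fn: Callable[[str, str], float],
) -> dict[str, float]:
    """Run both models over the same prompt set and compare mean quality scores."""
    candidate_scores = [score_fn(candidate, p) for p in prompts]
    incumbent_scores = [score_fn(incumbent, p) for p in prompts]
    return {
        "candidate_mean": mean(candidate_scores),
        "incumbent_mean": mean(incumbent_scores),
        "delta": mean(candidate_scores) - mean(incumbent_scores),
    }

# Demo with a stub scorer; a real run would call the models and a quality metric.
print(compare_models(
    "candidate-model",
    "incumbent-model",
    prompts=["Explain how merge request approvals work."],
    score_fn=lambda model, prompt: 0.9 if model == "candidate-model" else 0.8,
))
```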
We evaluate models against comprehensive criteria to support our enterprise customers' needs.
GitLab has built an advanced model evaluation platform called the Prompt Library. This platform contains thousands of human-generated and synthetic prompts that we use to evaluate various AI models and model versions.
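A minimal sketch of how such a prompt library might be structured is shown below. The field names and record layout are illustrative assumptions, not the Prompt Library's actual schema.

```python
from dataclasses import dataclass

@dataclass
class PromptRecord:
    """One evaluation prompt paired with a reference (ground-truth) answer."""
    prompt_id: str
    use_case: str          # e.g. "code_suggestions" or "chat"
    source: str            # "human" or "synthetic"
    prompt: str
    reference_answer: str

def load_prompts(raw_records: list[dict]) -> list[PromptRecord]:
    """Build typed prompt records from raw dictionaries (e.g. parsed from JSONL)."""
    return [PromptRecord(**record) for record in raw_records]

# Example record; a real library holds thousands of these across many use cases.
library = load_prompts([{
    "prompt_id": "cs-0001",
    "use_case": "code_suggestions",
    "source": "human",
    "prompt": "Write a Python function that reverses a string.",
    "reference_answer": "def reverse(s):\n    return s[::-1]",
}])
```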
We use the Prompt Library to run large-scale testing of AI model output quality against both human-generated and synthetic benchmarks. Our evaluation framework combines multiple techniques, and teams across GitLab continuously develop and implement additional metrics and evaluation methods tailored to their specific use cases and requirements. While no single technique is perfect, combining several methods lets us comprehensively compare the quality of different models and versions.
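As one illustration of the kind of technique such a framework can combine, the sketch below scores a model answer against a reference answer using a simple bag-of-words cosine similarity. This is a deliberately simplified stand-in, not GitLab's actual metric implementation; production pipelines typically pair lexical measures with learned embeddings and LLM-based judges.

```python
import math
from collections import Counter

def cosine_similarity(text_a: str, text_b: str) -> float:
    """Bag-of-words cosine similarity between two texts, in [0, 1]."""
    vec_a, vec_b = Counter(text_a.lower().split()), Counter(text_b.lower().split())
    shared = set(vec_a) & set(vec_b)
    dot = sum(vec_a[t] * vec_b[t] for t in shared)
    norm_a = math.sqrt(sum(v * v for v in vec_a.values()))
    norm_b = math.sqrt(sum(v * v for v in vec_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

# Score a model answer against the reference answer for one prompt.
print(cosine_similarity(
    "merge requests need approval before merging",
    "approval is required before merging merge requests",
))
```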
This system enables GitLab to evaluate both new models and updated versions of existing models. We've successfully used this system to identify issues with model updates from our AI vendors. In some cases, it has detected model drift that vendors neither anticipated nor communicated to GitLab.
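A simplified illustration of how drift can surface: re-score the same prompt set against the updated model version and compare per-prompt results with the previously recorded baseline. The thresholds below are arbitrary placeholders, not GitLab's actual tolerances.

```python
def drifted(baseline: list[float], current: list[float],
            per_prompt_drop: float = 0.10, max_fraction: float = 0.20) -> bool:
    """Flag drift when too many prompts score noticeably worse than before."""
    regressed = sum(1 for b, c in zip(baseline, current) if b - c > per_prompt_drop)
    return regressed / len(baseline) > max_fraction

# Scores for the same prompts before and after a vendor-side model update.
baseline_scores = [0.82, 0.79, 0.85, 0.81, 0.88]
current_scores = [0.70, 0.78, 0.72, 0.80, 0.69]
if drifted(baseline_scores, current_scores):
    print("Possible model drift: a significant share of prompts regressed.")
```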
While our AI Evaluation suite currently provides point-in-time comparisons, we are developing automated testing capabilities. This will enable us to run the entire evaluation suite against models regularly, continuously detecting drift and model regressions.
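Conceptually, continuous detection is the same comparison run on a schedule. A scheduled pipeline job could invoke a check like the one below and fail when quality regresses; the threshold, baseline value, and entry point are assumptions for illustration only.

```python
import sys
from statistics import mean

def check_regression(current_scores: list[float],
                     baseline_mean: float,
                     tolerance: float = 0.03) -> None:
    """Exit non-zero so a scheduled pipeline run fails when quality regresses."""
    current_mean = mean(current_scores)
    if current_mean < baseline_mean - tolerance:
        print(f"Regression detected: mean {current_mean:.3f} vs baseline {baseline_mean:.3f}")
        sys.exit(1)
    print(f"OK: mean {current_mean:.3f} is within tolerance of baseline {baseline_mean:.3f}")

if __name__ == "__main__":
    check_regression([0.80, 0.83, 0.78], baseline_mean=0.81)
```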
The AI Research category will continue expanding its model research and evaluation capabilities.