The following page may contain information related to upcoming products, features and functionality. It is important to note that the information presented is for informational purposes only, so please do not rely on the information for purchasing or planning purposes. Just like with all projects, the items mentioned on the page are subject to change or delay, and the development, release, and timing of any products, features or functionality remain at the sole discretion of GitLab Inc.
The Data Science Section focuses on transforming data into useful insights and actions.
The Data Science section comprises two stages: ModelOps and AI-Powered.
GitLab's Data Science section was introduced in late 2021 and has grown and laid the foundation of data science at GitLab throughout 2022 and 2023 along with the introduction of GitLab Duo, our suite of AI-Powered capabilities. In 2024 we continued to invest heavily in Data Science use cases across the platform and build features to enable our customers to more effectively and efficiently build software using GitLab Duo. We're also developing new MLOps capabilities that help customers build ML/AI into their software with our ModelOps stage.
To learn more about GitLab's investment areas, please visit the Product Investments section of the GitLab Handbook.
This section aligns cross-functional teams and organizational structures across Product, Engineering, UX, and Technical Writing. This streamlines the management chain for individuals across functions and aligns each stage's product development focus areas and challenges. Both the ModelOps and AI-Powered stages share some unique properties that other GitLab sections and stages do not:
We've established a Data Science internal handbook PI page (internal link) which will be updated monthly as part of PI review meetings. We're still working to actively orchestrate all our performance indicator metrics.
With complex toolchains and new vendors emerging every day, the data science landscape is a lot of glue and duct tape holding many systems together. We want to consolidate these toolchains into the GitLab platform to reduce complexity, remove maintenance burden, and enable faster model development and exploration.
As examples, GitLab will provide:
Many data science teams struggle with a lack of repeatability because they cobble together environments on local machines. These environments rarely have source code management or CI. We want to bring the DevOps best practices of SCM and CI/CD to data science teams and make it easy for them to start with repeatable and stable environments.
As examples, GitLab will provide:
Model handoffs are only one part of the collaboration that data science teams need. We want to create seamless handoffs across the software development lifecycle of data science workloads, from connecting data to pipelines, to managing model code, to deploying to production. GitLab is already critical for modern software developers managing production applications. We'll bring the best of our existing DevOps platform to data scientists.
As examples, GitLab will provide:
Long gone are the days of static data. Data today is in motion. It's always being created, moved, and transformed, and it's always drifting. It's in the cloud and sometimes many clouds. Modern data science toolchains need to support cloud-native data in motion.
As examples, GitLab will provide:
We expect the Data Science section will provide multiple monetization strategies across all GitLab plans with features targeted for data science use cases and Insider Threat detection capabilities. These paid features will follow GitLab's pricing themes to determine how to package various features we develop.
AI-Powered GitLab Duo features will be priced as an additional add-on because of the material ongoing costs of delivering this functionality through paid API calls to AI vendors (like Google Vertex AI and Anthropic) that provide powerful, state-of-the-art Large Language Models (LLMs). Usage of GitLab Duo capabilities generates millions of LLM API requests and processes billions of input and output tokens from these AI vendors. Learn more about GitLab Duo.
Data Science aims to make GitLab smarter and more automated using ML. Features we develop will help organizations automate their portfolio management, improve their security posture, and detect Insider Threats.
As a general rule of thumb, features will fall in the Ultimate tier when they meet one or more of the following criteria:
Some examples include:
Features targeted at the Premium tier will focus on enabling data science use cases across existing GitLab features like source code management (SCM) and CI/CD, as well as helping protect precious intellectual property like source code hosted within GitLab. We want GitLab to support data science workloads natively, and much of the value of managing workloads is found in the Premium tier, which ModelOps will seek to enhance.
Although paid features are the primary focus, there are several reasons why features for unpaid tiers might be prioritized above paid features:
As a general rule of thumb, features will fall in the Core/Free tier when they meet one or more of the following criteria:
Some examples include:
GitLab uses the following categorization to identify who our DevSecOps application is built for. We list the personas we will support, in priority order:
To capitalize on the potential opportunities, the AI-Powered and ModelOps stages have features that make them useful to the following personas today:
As we execute our 3-year strategy, our medium-term (1-2 year) goal is to provide a single DevSecOps application that enables collaboration between developers, data teams, data scientists, and engineers across organizations.
Data Science workloads can be complicated and can leverage specialized hardware and development environments not common to traditional software development teams. The ModelOps stage is focused on the intersection of data scientists exploring models and feature development and the developers who must then deploy those data science features into production.
Personas
Data scientists have unique roles within organizations. They are more scientists than developers, following hypotheses and data to explore models and develop data science-powered features.
We aim to serve data scientists as they balance art and science within software engineering teams. Data scientists wear a lot of hats to get from a hypothesis to a data science feature that generates value. GitLab is not currently a tool of choice for data scientists, and we aim to change that by making it easy to configure, build, and execute data science feature development within GitLab.
Personas
The larger the organization, the harder it is for security teams to stay on top of everything happening in complex, ever-changing environments. As an organization's source code management and DevSecOps platform, GitLab holds a lot of sensitive, high-value data. We want to help security teams secure that data. This is a job to which automated data science features can be well suited, including monitoring high-value assets around the clock.
Personas
Last Reviewed: 2024-10-05
Last Updated: 2024-10-05