At GitLab, we assess the stability and robustness of a feature through practices such as staging environments, feature flags, and canary testing. We also use strategies such as A/B testing to assess how users react to feature variants.
However, our short release cycles require testing and benchmarking approaches that make it possible to prototype, test, and benchmark ideas quickly (ideally while developing them). We need an approach that works on large code bases, can help assess a feature before deployment to staging or production, and provides data to support data-driven decision making.
To address this need, we developed the SourceWarp tool: a record-and-replay framework for source code management systems. In this blog post, we explain our motivation for creating SourceWarp and show how we use it to inform data-driven decision making within the GitLab platform.
Motivation: Data-driven decision making in the DevSecOps context
DevSecOps streamlines software development by allowing teams to ship features quickly and providing short feedback cycles for customers. These short feedback cycles can be used to monitor the impact of a feature from the time it is shipped and inform developers and product managers about the success or failure of a given deployment.
GitLab, as a heterogeneous DevSecOps platform, acts as an integration point for different CI/CD tools that often contribute to user-facing functionality. For example, the vulnerability report, which displays all detected vulnerabilities, appears as a single feature, but the data in the report may come from a number of different tools in various pipelines. The DevSecOps platform collects and stores results in the backend database and keeps track of user actions on the findings (through the UI or the API). A large portion of the automation in the platform is built around or initiated by code changes, so the source code management system or Git repository holds the input data. To test and benchmark new features for these systems effectively, the testing and benchmarking approach needs to be aware of source code.
We can use SourceWarp to achieve this. Let's dive into a real-world example of how we used SourceWarp to help make an informed decision about a product integration.
Case study: Advanced vulnerability tracking
As a DevSecOps platform, GitLab provides automation centered around code changes, where the source code is stored in a source code management system. SourceWarp uses a Git repository as input, which we use to source test-input data to test and benchmark our newly developed feature.
In the record phase, SourceWarp extracts the commits from the source history that are relevant to a given test criterion and generates a patch replay sequence. In the replay phase, SourceWarp replays the generated sequence on a target system. Both phases run while SourceWarp continuously monitors the DevSecOps platform to collect metrics and generate a report with the testing and benchmarking results.
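The record-then-replay flow described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not SourceWarp's actual implementation (which is written in Ruby and works on real Git history): the `Commit` and `TargetSystem` classes, the `touches_python` criterion, and the file-count metric are all hypothetical stand-ins.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple


@dataclass
class Commit:
    """Simplified commit: an id plus the new content of each changed file."""
    sha: str
    changes: Dict[str, str]


@dataclass
class TargetSystem:
    """Hypothetical stand-in for the platform under test; it only stores files."""
    files: Dict[str, str] = field(default_factory=dict)

    def apply(self, patch: Commit) -> None:
        self.files.update(patch.changes)


def record(history: List[Commit],
           is_relevant: Callable[[Commit], bool]) -> List[Commit]:
    """Record phase: keep the commits that satisfy the test criterion, in order."""
    return [commit for commit in history if is_relevant(commit)]


def replay(sequence: List[Commit],
           target: TargetSystem,
           metric: Callable[[TargetSystem], int]) -> List[Tuple[str, int]]:
    """Replay phase: apply each patch and sample a metric after every step."""
    report = []
    for patch in sequence:
        target.apply(patch)
        report.append((patch.sha, metric(target)))
    return report


history = [
    Commit("a1", {"app.py": "print('hi')\n"}),
    Commit("b2", {"README.md": "docs\n"}),
    Commit("c3", {"app.py": "print('hi')\nprint('bye')\n", "lib.py": "x = 1\n"}),
]


def touches_python(commit: Commit) -> bool:
    """Assumed test criterion: only commits that change Python files matter."""
    return any(path.endswith(".py") for path in commit.changes)


sequence = record(history, touches_python)
# Assumed metric: the number of files the target system currently knows about.
report = replay(sequence, TargetSystem(), metric=lambda t: len(t.files))
print(report)  # [('a1', 1), ('c3', 2)]
```

Keeping the recorded sequence separate from the replay step means the same sequence can be replayed on several target systems, which is what makes a side-by-side comparison possible.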
We used SourceWarp to test and benchmark advanced vulnerability tracking, which identifies and deduplicates vulnerabilities in a changing code base. In our experiment, we let SourceWarp automatically sample patch sequences from a slice of GitLab's source code repository history (between 2020-10-31 and 2020-12-31) and replay them on two target systems: one system had advanced vulnerability tracking enabled, and the other used our old vulnerability tracking approach.
After applying each patch from the patch sequence, SourceWarp collected metrics from the target system that recorded the observed vulnerabilities. We observed that our vulnerability tracking approach was 30% more effective than traditional vulnerability tracking, where <file, line number> pairs are used to identify the location of a vulnerability. This means that advanced vulnerability tracking reduces the manual effort of auditing vulnerabilities by 30%.
In addition, we observed that the deduplication effectiveness of vulnerability tracking increases with the number of source code changes. Even over the relatively short timeframe from 2020-10-31 to 2020-12-31, the deduplication effectiveness increased from 11% to 30%, which suggests that the effectiveness grows over time as the source code evolves.
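To see why identifying a vulnerability by <file, line number> produces duplicates as code changes, consider the toy comparison below. It is a sketch under assumptions, not the tracking algorithm or the measurement used in the experiment: the `scope` field (the enclosing function name), the sample findings, and the reduction formula are all hypothetical and chosen only to illustrate the effect.

```python
from typing import Callable, Hashable, List, NamedTuple


class Finding(NamedTuple):
    """One scanner finding as reported after a patch is applied."""
    file: str
    line: int
    scope: str  # hypothetical: name of the enclosing function


def count_tracked(snapshots: List[List[Finding]],
                  identity: Callable[[Finding], Hashable]) -> int:
    """Count distinct vulnerabilities across all snapshots under one identity."""
    seen = set()
    for findings in snapshots:
        for finding in findings:
            seen.add(identity(finding))
    return len(seen)


def by_location(finding: Finding) -> Hashable:
    """Traditional identity: the <file, line number> pair."""
    return (finding.file, finding.line)


def by_scope(finding: Finding) -> Hashable:
    """Assumed alternative identity that survives line shifts."""
    return (finding.file, finding.scope)


# Findings observed after each of three replayed patches.
snapshots = [
    [Finding("db.py", 10, "run_query")],
    # Two lines were added above the function, so the same issue moved.
    [Finding("db.py", 12, "run_query")],
    [Finding("db.py", 12, "run_query"), Finding("web.py", 5, "render")],
]

line_based = count_tracked(snapshots, by_location)
scope_based = count_tracked(snapshots, by_scope)
# Assumed definition: share of entries an auditor no longer has to review.
reduction = (line_based - scope_based) / line_based
print(f"{line_based} vs {scope_based}: {reduction:.0%} fewer entries to audit")
# 3 vs 2: 33% fewer entries to audit
```

In this toy setup, every patch that shifts lines creates another duplicate under the location-based identity, which matches the intuition that the benefit of deduplication grows as more changes accumulate.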
SourceWarp performed this experiment in an automated and reproducible way, and provided data that was helpful in making an informed decision about the product integration of vulnerability tracking.
Where to find more SourceWarp information
The SourceWarp approach is detailed in our research paper, "SourceWarp: A scalable, SCM-driven testing and benchmarking approach to support data-driven and agile decision making for CI/CD tools and DevSecOps platforms," which will be presented at the 4th ACM/IEEE International Conference on Automation of Software Test (AST 2023).
The SourceWarp testing and benchmarking tool is implemented in Ruby and is open source (MIT license).
The README.md provides information about the tool setup and implementation.
You can also see it in action in the demo below.
Cover image by Jason Corey on Unsplash