Published on: December 10, 2024
5 min read
Find out how the GitLab team addressed more complex testing situations using GitLab Duo's AI capabilities, including ensuring that code testing followed standards.
The first part of our three-part series on test generation with GitLab Duo focused on how to automate code testing. Now, we will share the lessons we learned while using AI for test generation.
Overall, we were pleased with the results of using GitLab Duo to generate tests for our code. As with any language generation, some cases required minor adjustments, such as fixing import paths or editing dataset contents. For more complex cases, we had to remember that AI solutions often lack context. Here's how we handled those situations with GitLab Duo.
As is often the case when developing a software product, we encountered instances that required updates to existing tests. Rather than manually adjusting a full test suite for a common issue, we took full advantage of the GitLab Duo Chat window in VS Code. For example, to refactor tests, we used the Chat prompt “Please update the provided tests to use unittest rather than pytest” and then pasted in the tests we wanted GitLab Duo to update.
Note: We copied and pasted GitLab Duo's recommendations into our code.
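To make that conversion concrete, here is a minimal sketch of the before/after this kind of prompt produced. The `divide` function and the tests are hypothetical stand-ins, not code from our actual suite:

```python
import unittest
import pytest

# Hypothetical function under test (illustrative only):
def divide(a, b):
    return a / b

# Before: a pytest-style test we would paste into Duo Chat.
def test_divide_by_zero():
    with pytest.raises(ZeroDivisionError):
        divide(10, 0)

# After: a unittest-style equivalent of the kind Duo returned,
# which we then copied back into our suite.
class TestDivide(unittest.TestCase):
    def test_divide_by_zero(self):
        with self.assertRaises(ZeroDivisionError):
            divide(10, 0)

if __name__ == "__main__":
    unittest.main()
```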
Creating tests for legacy code we knew worked was another challenging situation we encountered. In such circumstances, it was valuable to provide error snippets alongside failing tests and ask GitLab Duo to supply new tests. Copying the noted failures and errors from the terminal window into Chat, along with a request such as “Please explain and fix this failing test,” yielded a summary of the issues the test was encountering as well as a new test addressing the problem. We did find this sometimes required multiple rounds of refactoring as new test failures were identified. However, GitLab Duo returned refactored solutions quickly, a net positive for team and developer efficiency.
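As an illustration of that workflow, the sketch below pairs a failing generated test with a corrected one. The `normalize` function and the fix are hypothetical, assumed here only to show the shape of the exchange:

```python
import pytest

# Hypothetical legacy function (illustrative, not our actual code):
def normalize(values):
    total = sum(values)
    return [v / total for v in values]

# A generated test that failed; we pasted the resulting
# ZeroDivisionError traceback into Duo Chat with
# "Please explain and fix this failing test."
# def test_normalize_zeros():
#     assert normalize([0, 0]) == [0, 0]   # raises ZeroDivisionError

# A corrected test of the kind Duo returned, asserting the
# code's actual behavior instead of an assumed one:
def test_normalize_zeros_raises():
    with pytest.raises(ZeroDivisionError):
        normalize([0, 0])
```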
In other instances, the modularization or complexity of our code led to variance in GitLab Duo’s results. For example, when generating tests, GitLab Duo sometimes produced a mix of passing and failing tests caused by differences in testing approach (e.g., the use of Mock and which objects were mocked). To maintain consistency, we provided GitLab Duo its own example of a passing test and asked it to modify individual tests, one at a time, to match that style. We would also provide GitLab Duo a file of functioning tests for a similar object or task so it could mirror the structure.
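The sketch below shows the kind of style divergence we mean, using a hypothetical collaborator: one generated test injects a `Mock` directly, another patches the collaborator instead. Pasting a passing example back into Chat let Duo converge the suite on one approach:

```python
from unittest.mock import Mock, patch

class ReportClient:
    """Hypothetical collaborator that would hit the network in real code."""
    def post(self, path, body):
        raise NotImplementedError

# Hypothetical code under test (illustrative only):
def send_report(client, name):
    return client.post("/reports", body=name)

# Style A: one generated test injects a Mock directly.
def test_send_report_with_injected_mock():
    client = Mock()
    client.post.return_value = 200
    assert send_report(client, "weekly") == 200
    client.post.assert_called_once_with("/reports", body="weekly")

# Style B: another generated test patches the collaborator instead.
# Mixing both styles in one suite is the inconsistency we asked Duo
# to resolve by showing it a passing Style A example.
@patch.object(ReportClient, "post", return_value=200)
def test_send_report_with_patch(mock_post):
    assert send_report(ReportClient(), "weekly") == 200
    mock_post.assert_called_once_with("/reports", body="weekly")
```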
While developing a Python module, GitLab Duo generated many tests using Mock, and they often required refactoring, particularly around naming standardization. In such cases, we could leverage GitLab Duo Chat to refactor tests with instructions about which specific test components to update. Prompting GitLab Duo for these changes was far faster than refactoring tests individually, as we had previously done.
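Here is a before/after sketch of that naming cleanup; the function, mock names, and prompt wording are illustrative assumptions rather than our exact code:

```python
from unittest.mock import Mock

# Hypothetical code under test (illustrative only):
def cache_lookup(store, key):
    return store.get(key)

# Before: generated tests used terse, inconsistent names.
def test_lookup_1():
    m = Mock()
    m.get.return_value = "cached"
    assert cache_lookup(m, "user:42") == "cached"

# After prompting Duo with instructions such as "rename every mock to
# mock_<collaborator> and every test to test_<function>_<scenario>"
# (wording illustrative), the suite converged on one scheme:
def test_cache_lookup_returns_cached_value():
    mock_store = Mock()
    mock_store.get.return_value = "cached"
    assert cache_lookup(mock_store, "user:42") == "cached"
    mock_store.get.assert_called_once_with("user:42")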
GitLab Duo also generated tests for cases the team had not previously considered, increasing coverage. We could use GitLab Duo to quickly and efficiently address these edge cases and expand test coverage, a key value-add that lets our team build quickly while ensuring a robust product.
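As a hypothetical example of those coverage gains, tests like the following, for empty, single-element, and negative inputs, are the kind of edge cases Duo surfaced that we had not written ourselves:

```python
import pytest

# Hypothetical function under test (illustrative only):
def average(values):
    if not values:
        raise ValueError("average() requires at least one value")
    return sum(values) / len(values)

def test_average_empty_input_raises():
    with pytest.raises(ValueError):
        average([])

def test_average_single_value():
    assert average([42]) == 42

def test_average_negative_values():
    assert average([-1, -3]) == -2
```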
Here are a few key lessons that have been important to our success with GitLab Duo:
In our next article in this series, we’ll cover a test we ran to validate the impact of GitLab Duo on our team’s automated testing and discuss the impressive results we have achieved thus far.