A flaky test is an unreliable test that occasionally fails but passes eventually if you retry it enough times.
In a test suite, flaky tests are inevitable, so our goal should be to limit their negative impact as soon as possible.
||We don't know exactly what the success rate would be if we stopped retrying flaky tests, but based on this exploratory chart, it could drop by approximately 7%.|
|175 programmatically identified flaky tests and 211 `~"failure::flaky-test"` issues out of a total of 159,590 tests||This means we identified about 0.1% of tests as flaky, which is in line with the "RSpec Job Flaky Failure Probability". GitHub identified that 25% of their tests were flaky at some point; our reality is probably somewhere in between.|
|Coverage is currently at 97.86%||Even if we removed the 175 flaky tests, we wouldn't expect coverage to drop meaningfully.|
|"Average Retry Count" per pipeline is currently 0.08. Given RSpec jobs' current average duration of 23 minutes, this results in an additional
||With approximately 11k MR pipelines per month, flaky tests waste 20,240 minutes per month = 337 engineer hours = 14 days. At a private runner cost of $0.0845 per minute, that amounts to about $1,710 wasted per month.|
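The arithmetic behind these figures can be reproduced with a short sketch. It uses only the numbers quoted in the table. The function name `monthly_flaky_waste` and the constant names are hypothetical, chosen for illustration, and "11k" is assumed to mean exactly 11,000 pipelines.

```python
# Figures quoted in the table above; the names are illustrative only.
AVERAGE_RETRY_COUNT = 0.08        # average retries per pipeline
RSPEC_JOB_MINUTES = 23            # average RSpec job duration, in minutes
MR_PIPELINES_PER_MONTH = 11_000   # assumption: "approximately 11k" taken as 11,000
RUNNER_COST_PER_MINUTE = 0.0845   # USD per minute on private runners


def monthly_flaky_waste(retry_count, job_minutes, pipelines, cost_per_minute):
    """Estimate the monthly CI time and compute cost spent on retries.

    Returns a tuple:
    (extra minutes per pipeline, minutes per month, hours per month,
     days per month, USD per month).
    """
    extra_per_pipeline = retry_count * job_minutes  # retried minutes per pipeline
    minutes = extra_per_pipeline * pipelines        # retried minutes per month
    hours = minutes / 60
    days = hours / 24
    cost = minutes * cost_per_minute
    return extra_per_pipeline, minutes, hours, days, cost


extra, minutes, hours, days, cost = monthly_flaky_waste(
    AVERAGE_RETRY_COUNT,
    RSPEC_JOB_MINUTES,
    MR_PIPELINES_PER_MONTH,
    RUNNER_COST_PER_MINUTE,
)
print(f"{minutes:,.0f} minutes, {hours:.0f} hours, {days:.0f} days, ${cost:,.0f}")
```

Rounded to whole units, the results match the 20,240 minutes, 337 hours, 14 days, and $1,710 quoted above. Keeping the inputs as named constants makes it easy to recompute the estimate when the retry count or job duration changes.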
When a flaky test fails in an MR, the author might follow this workflow:
Flaky tests negatively impact several teams and areas:
|Impacted department/team||Impacted area||Impact description||Impact quantification|
|Development department||MR & deployment cycle time||Wasted time (people are forced to look at the failures and retry them manually)||~$26,000 of wasted time per month, based on 337 engineer hours at an hourly rate of $77 per engineer|
|Infrastructure department||CI compute resources||Wasted money||At least $1,710 worth of wasted CI compute time per month|
|Delivery team & Quality department||Deployment cycle time||Distraction from actual CI failures & regressions, leading to slower detection of those||TBD|
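The wasted-time figure for the Development department follows directly from the two numbers given in the table. The sketch below is illustrative only, and the function name `wasted_time_cost` is hypothetical.

```python
def wasted_time_cost(engineer_hours, hourly_rate_usd):
    """Monthly cost of wasted engineer time, in USD."""
    return engineer_hours * hourly_rate_usd


# Figures quoted in the table above.
cost = wasted_time_cost(engineer_hours=337, hourly_rate_usd=77)

# Rounding to the nearest thousand gives the approximate figure in the table.
approx_cost = round(cost, -3)
print(cost, approx_cost)
```

The exact product is $25,949, which the table rounds to roughly $26,000.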
`master` stability to a solid 95% success rate without manual action
`master` is broken or not and default action of retry