To Seed or Not to Seed? An Empirical Analysis of Usage of Seeds for Testing in Machine Learning Projects
Machine Learning (ML) algorithms are inherently random: executing them on the same inputs may yield slightly different results across runs. Such randomness makes it challenging for developers to write tests for their implementations of ML algorithms. A natural consequence of this randomness is test flakiness: tests pass and fail non-deterministically for the same version of the code.
Developers often alleviate test flakiness in ML projects by setting seeds in the random number generators used by the code under test. However, this approach serves at best as a "workaround": rather than fixing the test, it makes the test more brittle and can cause developers to miss bugs that are only triggered by a different sequence of computations.
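The seeding workaround described above can be sketched as follows. This is an illustrative example, not code from any of the studied projects: `noisy_estimate` is a hypothetical stand-in for an ML computation, written against Python's standard-library `random` module.

```python
import random

# Hypothetical stand-in for an ML computation: its result depends on
# the random samples it draws, so repeated runs differ slightly.
def noisy_estimate(n_samples, rng):
    samples = [rng.gauss(0.0, 1.0) for _ in range(n_samples)]
    return sum(samples) / n_samples

# A flaky test would assert a tight bound on an unseeded run; it would
# pass on most runs but fail on an unlucky draw, e.g.:
#   assert abs(noisy_estimate(100, random.Random())) < 0.1

# The "workaround": pin the seed so the test always replays the same
# sample sequence and becomes deterministic -- at the cost of never
# exercising any other sequence of computations.
def test_estimate_seeded():
    first = noisy_estimate(100, random.Random(42))
    # Re-running with the same seed reproduces the result exactly.
    assert noisy_estimate(100, random.Random(42)) == first

test_estimate_seeded()
```

The brittleness the abstract points to follows directly: the seeded test certifies exactly one sample sequence, so a bug manifesting under any other sequence goes undetected.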
We conduct the first large-scale empirical study of seed usage and its implications for testing, on a corpus of 114 Machine Learning projects. We identify 461 tests in these projects that fail without seeds and study their nature and root causes. We also try to minimize the flakiness of a subset of 42 identified tests using alternative strategies, such as tuning algorithm hyper-parameters and adjusting assertion bounds, and submit the resulting fixes to developers. So far, developers have accepted our fixes for 19 tests.
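One of the alternative strategies mentioned, adjusting assertion bounds, can be illustrated with a hedged sketch. The routine, sample size, and tolerance below are illustrative assumptions (standard-library `random`, a hypothetical `noisy_estimate`), not the paper's actual fixes.

```python
import math
import random

# Hypothetical randomized routine under test (not from the paper).
def noisy_estimate(n_samples, rng):
    samples = [rng.gauss(0.0, 1.0) for _ in range(n_samples)]
    return sum(samples) / n_samples

# Instead of pinning a seed, derive the bound from what the algorithm
# statistically guarantees: the mean of n standard-normal samples has
# standard deviation 1/sqrt(n), so a 6-sigma tolerance makes a spurious
# failure astronomically unlikely while the test still exercises a
# fresh random sequence on every run.
def test_estimate_unseeded():
    n = 10_000
    estimate = noisy_estimate(n, random.Random())  # deliberately unseeded
    assert abs(estimate) < 6 / math.sqrt(n)

test_estimate_unseeded()
```

Raising `n_samples` here plays the role of tuning a hyper-parameter: it shrinks the variance of the result, which in turn allows a tighter assertion bound without reintroducing flakiness.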
We also manually analyze a subset of 56 tests and study various characteristics such as the nature of test oracles and how the seed settings evolve over time. Finally, we provide several important insights and their implications for both researchers and developers in the context of setting seeds in tests.
Wed 6 Apr (times shown in the Amsterdam/Berlin/Bern/Rome/Stockholm/Vienna time zone)
19:30 - 20:45

- Providing Real-time Assistance for Repairing Runtime Exceptions using Stack Overflow Posts
- To Seed or Not to Seed? An Empirical Analysis of Usage of Seeds for Testing in Machine Learning Projects
- Integration testing for robotic systems
  Maria Brito (Federal University of Lavras), Simone do Rocio Senger de Souza (ICMC/USP), Paulo Sergio Lopes de Souza (ICMC/USP)
- Patterns of Code-to-Test Co-evolution for Automated Test Suite Maintenance
- Discussion and Q&A