To Seed or Not to Seed? An Empirical Analysis of Usage of Seeds for Testing in Machine Learning Projects
Machine Learning (ML) algorithms are inherently random: executing them on the same inputs may yield slightly different results across runs. Such randomness makes it challenging for developers to write tests for their implementations of ML algorithms. A natural consequence of this randomness is test flakiness: tests pass and fail non-deterministically for the same version of the code.
Developers often alleviate test flakiness in ML projects by setting seeds in the random number generators used by the code under test. However, this approach serves at best as a "workaround": rather than fixing the test, it makes the test more brittle and can cause developers to miss bugs that are only triggered by a different sequence of computations.
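The seeding workaround described above can be sketched as follows. This is an illustrative example, not code from any of the studied projects: `noisy_estimate` is a hypothetical stand-in for an ML computation, written against Python's standard-library `random` module.

```python
import random

# Hypothetical stand-in for an ML computation: its result depends on
# the random samples it draws, so repeated runs differ slightly.
def noisy_estimate(n_samples, rng):
    samples = [rng.gauss(0.0, 1.0) for _ in range(n_samples)]
    return sum(samples) / n_samples

# A flaky test would assert a tight bound on an unseeded run; it would
# pass on most runs but fail on an unlucky draw, e.g.:
#   assert abs(noisy_estimate(100, random.Random())) < 0.1

# The "workaround": pin the seed so the test always replays the same
# sample sequence and becomes deterministic -- at the cost of never
# exercising any other sequence of computations.
def test_estimate_seeded():
    first = noisy_estimate(100, random.Random(42))
    # Re-running with the same seed reproduces the result exactly.
    assert noisy_estimate(100, random.Random(42)) == first

test_estimate_seeded()
```

The brittleness the abstract points to follows directly: the seeded test certifies exactly one sample sequence, so a bug manifesting under any other sequence goes undetected.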
We conduct the first large-scale empirical study of seed usage and its implications for testing, on a corpus of 114 Machine Learning projects. We identify 461 tests in these projects that fail without seeds and study their nature and root causes. We also try to minimize the flakiness of a subset of 42 identified tests using alternative strategies, such as tuning algorithm hyper-parameters and adjusting assertion bounds, and submit the resulting fixes to developers. So far, developers have accepted our fixes for 19 tests.
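One of the alternative strategies mentioned, adjusting assertion bounds, can be illustrated with a hedged sketch. The routine, sample size, and tolerance below are illustrative assumptions (standard-library `random`, a hypothetical `noisy_estimate`), not the paper's actual fixes.

```python
import math
import random

# Hypothetical randomized routine under test (not from the paper).
def noisy_estimate(n_samples, rng):
    samples = [rng.gauss(0.0, 1.0) for _ in range(n_samples)]
    return sum(samples) / n_samples

# Instead of pinning a seed, derive the bound from what the algorithm
# statistically guarantees: the mean of n standard-normal samples has
# standard deviation 1/sqrt(n), so a 6-sigma tolerance makes a spurious
# failure astronomically unlikely while the test still exercises a
# fresh random sequence on every run.
def test_estimate_unseeded():
    n = 10_000
    estimate = noisy_estimate(n, random.Random())  # deliberately unseeded
    assert abs(estimate) < 6 / math.sqrt(n)

test_estimate_unseeded()
```

Raising `n_samples` here plays the role of tuning a hyper-parameter: it shrinks the variance of the result, which in turn allows a tighter assertion bound without reintroducing flakiness.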
We also manually analyze a subset of 56 tests and study various characteristics such as the nature of test oracles and how the seed settings evolve over time. Finally, we provide several important insights and their implications for both researchers and developers in the context of setting seeds in tests.
Wed 6 Apr (times shown in the Amsterdam/Berlin/Bern/Rome/Stockholm/Vienna time zone)
19:30 - 20:45

- Providing Real-time Assistance for Repairing Runtime Exceptions using Stack Overflow Posts
- To Seed or Not to Seed? An Empirical Analysis of Usage of Seeds for Testing in Machine Learning Projects
- Integration testing for robotic systems
  Maria Brito (Federal University of Lavras), Simone do Rocio Senger de Souza (ICMC/USP), Paulo Sergio Lopes de Souza (ICMC/USP)
- Patterns of Code-to-Test Co-evolution for Automated Test Suite Maintenance
- Discussion and Q&A