Can Search-Based Testing with Pareto Optimization Effectively Cover Failure-Revealing Test Inputs?
Sorokin, Lev; Safin, Damir; Nejati, Shiva
arXiv.org Artificial Intelligence
Search-based software testing (SBST) is a widely adopted technique for testing complex systems with large input spaces, such as Deep Learning-enabled (DL-enabled) systems. Many SBST techniques focus on Pareto-based optimization, where multiple objectives are optimized in parallel to reveal failures. However, it is important to ensure that identified failures are spread throughout the entire failure-inducing area of a search domain and not clustered in a sub-region. This ensures that identified failures are semantically diverse and reveal a wide range of underlying causes. In this paper, we present a theoretical argument explaining why testing based on Pareto optimization is inadequate for covering failure-inducing areas within a search domain. We support our argument with empirical results obtained by applying two widely used types of Pareto-based optimization techniques, namely NSGA-II (an evolutionary algorithm) and OMOPSO (a swarm-based Pareto-optimization algorithm), to two DL-enabled systems: an industrial Automated Valet Parking (AVP) system and a system for classifying handwritten digits. We measure the coverage of failure-revealing test inputs in the input space using a metric that we refer to as the Coverage Inverted Distance quality indicator. Our results show that NSGA-II-based search and OMOPSO are not more effective than a naïve random search baseline in covering test inputs that reveal failures. The replication package for this study is available in a GitHub repository.
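The abstract names the Coverage Inverted Distance indicator but does not define it. As a rough illustration only, the sketch below assumes it follows the familiar inverted-generational-distance pattern applied in the input space: the mean distance from each point of a reference set of failure-revealing inputs to the nearest test input found by the search, with lower values meaning better coverage. The function name, the Euclidean distance, and the toy point sets are all hypothetical and are not taken from the paper or its replication package.

```python
import math


def coverage_inverted_distance(reference_failures, found_tests):
    """Mean distance from each reference failure to its nearest found test.

    Assumed definition (IGD-style, computed in the input space);
    lower values indicate better coverage of the failure region.
    """
    if not reference_failures or not found_tests:
        raise ValueError("both point sets must be non-empty")
    total = 0.0
    for ref in reference_failures:
        # Distance to the closest test input the search produced.
        total += min(math.dist(ref, test) for test in found_tests)
    return total / len(reference_failures)


# Hypothetical failure region: a 5x5 grid over the unit square.
reference = [(x / 4, y / 4) for x in range(5) for y in range(5)]

# Four failing tests clustered in one corner of the region.
clustered = [(0.0, 0.0), (0.05, 0.0), (0.0, 0.05), (0.05, 0.05)]

# Four failing tests spread across the region.
spread = [(0.25, 0.25), (0.25, 0.75), (0.75, 0.25), (0.75, 0.75)]

cid_clustered = coverage_inverted_distance(reference, clustered)
cid_spread = coverage_inverted_distance(reference, spread)
```

Under this assumed definition, the clustered set scores worse (a larger value) than the spread set even though both contain the same number of failing tests, which is the kind of distinction the abstract argues a coverage measure should capture.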
Oct-16-2024
- Country:
  - Asia > Russia (0.04)
  - North America
    - United States > New York
      - New York County > New York City (0.04)
    - Canada > Ontario
      - National Capital Region > Ottawa (0.04)
  - Europe
    - Russia > Northwestern Federal District
      - Leningrad Oblast > Saint Petersburg (0.04)
    - Germany > Bavaria
      - Upper Bavaria > Munich (0.04)
- Genre:
  - Research Report
    - New Finding (1.00)
    - Experimental Study (1.00)
- Industry:
  - Automobiles & Trucks (0.92)
  - Transportation > Ground
    - Road (0.92)
- Technology: