test design
A Fast Binary Splitting Approach for Non-Adaptive Learning of Erdős--Rényi Graphs
We study the problem of learning an unknown graph via group queries on node subsets, where each query reports whether at least one edge is present among the queried nodes. In general, learning arbitrary graphs with $n$ nodes and $k$ edges is hard in the non-adaptive setting, requiring $Ω\big(\min\{k^2\log n,\,n^2\}\big)$ tests even when a small error probability is allowed. We focus on learning Erdős--Rényi (ER) graphs $G\sim\mathrm{ER}(n,q)$ in the non-adaptive setting, where the expected number of edges is $\bar{k}=q\binom{n}{2}$, and we aim to design an efficient testing--decoding scheme achieving asymptotically vanishing error probability. Prior work (Li--Fresacher--Scarlett, NeurIPS 2019) presents a testing--decoding scheme that attains an order-optimal number of tests $O(\bar{k}\log n)$ but incurs $Ω(n^2)$ decoding time, whereas their proposed sublinear-time algorithm incurs an extra $(\log \bar{k})(\log n)$ factor in the number of tests. We extend the binary splitting approach, recently developed for non-adaptive group testing, to the ER graph learning setting, and prove that the edge set can be recovered with high probability using $O(\bar{k}\log n)$ tests while attaining decoding time $O(\bar{k}^{1+δ}\log n)$ for any fixed $δ>0$.
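To give a feel for the scales involved, the following worked arithmetic plugs illustrative values into the quantities named in the abstract. The values $n=1000$ and $q=10^{-3}$ are assumptions chosen only for illustration, the natural logarithm is used, and all constant factors hidden by the $O(\cdot)$ notation are ignored, so the numbers indicate orders of magnitude rather than actual test counts.

```latex
\[
\bar{k} = q\binom{n}{2} = 10^{-3}\cdot\frac{1000\cdot 999}{2} = 499.5,
\qquad
\bar{k}\log n \approx 499.5 \times 6.908 \approx 3.45\times 10^{3},
\qquad
n^{2} = 10^{6}.
\]
```

Under these assumed values, the $\bar{k}\log n$ scaling is roughly 300 times smaller than $n^2$.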
Export Reviews, Discussions, Author Feedback and Meta-Reviews
NIPS (Neural Information Processing Systems), 8-11 December 2014, Montreal, Canada. Paper ID: 1871. Title: Parallel Feature Selection Inspired by Group Testing. Current Reviews. First provide a summary of the paper, and then address the following criteria: quality, clarity, originality and significance. This paper proposes a novel and interesting parallel feature selection framework, based on group testing, for large-scale data. As the authors claim, the presented method can speed up feature selection and outperform existing methods, especially on very high-dimensional datasets. The proposed framework for parallel feature selection is well defined and supported by sufficient theoretical analysis. The authors prove that KL divergence and MI are C-separable under certain conditions.
Towards Agent-based Test Support Systems: An Unsupervised Environment Design Approach
Ogbodo, Collins O., Rogers, Timothy J., Borgo, Mattia Dal, Wagg, David J.
Modal testing plays a critical role in structural analysis by providing essential insights into dynamic behaviour across a wide range of engineering industries. In practice, designing an effective modal test campaign involves complex experimental planning, comprising a series of interdependent decisions that significantly influence the final test outcome. Traditional approaches to test design are typically static, focusing only on global tests without accounting for evolving test campaign parameters or the impact of such changes on previously established decisions, such as sensor configurations, which have been found to significantly influence test outcomes. These rigid methodologies often compromise test accuracy and adaptability. To address these limitations, this study introduces an agent-based decision support framework for adaptive sensor placement across dynamically changing modal test environments. The framework formulates the problem using an underspecified partially observable Markov decision process, enabling the training of a generalist reinforcement learning agent through a dual-curriculum learning strategy. A detailed case study on a steel cantilever structure demonstrates the efficacy of the proposed method in optimising sensor locations across frequency segments, validating its robustness and real-world applicability in experimental settings.
Addressing Data Leakage in HumanEval Using Combinatorial Test Design
Bradbury, Jeremy S., More, Riddhi
The use of large language models (LLMs) is widespread across many domains, including Software Engineering, where they have been used to automate tasks such as program generation and test classification. As LLM-based methods continue to evolve, it is important that we define clear and robust methods that fairly evaluate performance. Benchmarks are a common approach to assess LLMs with respect to their ability to solve problem-specific tasks as well as assess different versions of an LLM to solve tasks over time. For example, the HumanEval benchmark is composed of 164 hand-crafted tasks and has become an important tool in assessing LLM-based program generation. However, a major barrier to a fair evaluation of LLMs using benchmarks like HumanEval is data contamination resulting from data leakage of benchmark tasks and solutions into the training data set. This barrier is compounded by the black-box nature of LLM training data which makes it difficult to even know if data leakage has occurred. To address the data leakage problem, we propose a new benchmark construction method where a benchmark is composed of template tasks that can be instantiated into new concrete tasks using combinatorial test design. Concrete tasks for the same template task must be different enough that data leakage has minimal impact and similar enough that the tasks are interchangeable with respect to performance evaluation. To assess our benchmark construction method, we propose HumanEval_T, an alternative benchmark to HumanEval that was constructed using template tasks and combinatorial test design.
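The idea of instantiating one template task into several concrete tasks can be sketched as follows. This is a minimal illustration, not the construction used for HumanEval_T: the template text, the parameter names `agg` and `parity`, and the helper `instantiate` are all hypothetical, and the sketch enumerates every parameter combination (a full Cartesian product), which is the simplest form of combinatorial design rather than a reduced covering design.

```python
from itertools import product

# Hypothetical template task and parameter values, chosen for illustration only.
TEMPLATE = "Write a function that returns the {agg} of all {parity} numbers in a list."
PARAMETERS = {
    "agg": ["sum", "maximum", "count"],
    "parity": ["even", "odd"],
}


def instantiate(template, parameters):
    """Return one concrete task per combination of parameter values."""
    names = list(parameters)
    value_lists = [parameters[name] for name in names]
    return [
        template.format(**dict(zip(names, values)))
        for values in product(*value_lists)
    ]


tasks = instantiate(TEMPLATE, PARAMETERS)
for task in tasks:
    print(task)
```

With three values for one parameter and two for the other, the sketch yields six concrete tasks from a single template.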
Efficient and accurate group testing via Belief Propagation: an empirical study
Coja-Oghlan, Amin, Hahn-Klimroth, Max, Loick, Philipp, Penschuck, Manuel
The group testing problem asks for efficient pooling schemes and algorithms that make it possible to screen moderately large numbers of samples for rare infections. The goal is to accurately identify the infected samples while conducting the fewest possible tests. Exploring the use of techniques centred around the Belief Propagation message passing algorithm, we suggest a new test design that significantly increases the accuracy of the results. The new design comes with Belief Propagation as an efficient inference algorithm. Aiming for results on practical rather than asymptotic problem sizes, we conduct an experimental study.
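The pooling setup described above can be made concrete with a small simulation. The sketch below is not the Belief Propagation design from the abstract: it assumes a random Bernoulli pooling design, noiseless tests, and the much simpler COMP elimination decoder, which serves as a baseline. The function names and all parameter values are illustrative assumptions.

```python
import random


def make_pools(n_items, n_tests, p, rng):
    """Bernoulli design: each item joins each pool independently with probability p."""
    return [[i for i in range(n_items) if rng.random() < p] for _ in range(n_tests)]


def run_tests(pools, infected):
    """Noiseless model: a pool is positive iff it contains at least one infected item."""
    return [any(i in infected for i in pool) for pool in pools]


def comp_decode(n_items, pools, outcomes):
    """COMP: any item seen in a negative pool is healthy; all others are declared infected."""
    cleared = set()
    for pool, positive in zip(pools, outcomes):
        if not positive:
            cleared.update(pool)
    return set(range(n_items)) - cleared


rng = random.Random(0)
infected = {3, 17}  # assumed ground truth for the simulation
pools = make_pools(n_items=50, n_tests=40, p=0.2, rng=rng)
outcomes = run_tests(pools, infected)
estimate = comp_decode(50, pools, outcomes)
print(sorted(estimate))
```

In the noiseless model COMP never misses an infected item, because an infected item can only appear in positive pools; it may, however, report healthy items as infected when too few tests are used.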
As coronavirus spread in Wuhan, China's secret deals with businesses caused major testing blunders
WUHAN, China – In the early days in Wuhan, the first city struck by the virus, getting a COVID-19 test was so difficult that residents compared it to winning the lottery. Throughout the Chinese city in January, thousands of people waited in hourslong lines for hospitals, sometimes next to corpses lying in hallways. But most couldn't get the test they needed to be admitted as patients. And for the few who did, the tests were often faulty, resulting in false negatives. The widespread test shortages and problems at a time when the virus could have been slowed were caused largely by secrecy and cronyism at China's top disease control agency, an Associated Press investigation has found. The flawed testing system prevented scientists and officials from seeing how fast the virus was spreading -- another way China fumbled its early response to the virus. Earlier reporting showed how top Chinese leaders delayed warning the public and withheld information from the World Health Organization, supplying the most comprehensive picture yet of China's initial missteps. Taken together, these mistakes in January facilitated the virus's spread through Wuhan and across the world undetected, in a pandemic that has now sickened more than 64 million people and killed almost 1.5 million.
Is AI all that it's cracked up to be for today's testing? - Software Testing News
Testing in the Third Industrial Revolution: Is AI all that it's cracked up to be for today's testing? This article seeks to make sense of the promise of AI in testing. It first examines the pressure currently placed on testing by iterative software delivery, identifying some core requirements for testing in the "Third Industrial Revolution". It then considers why current testing methodologies fail to fulfil these objectives. Only then are emerging technologies from the world of AI considered. The goal is to identify how current AI technologies might remedy the challenges created by the rise in automated test execution, building on the tools and techniques in place today.
Learning Erdős-Rényi Random Graphs via Edge Detecting Queries
Li, Zihan, Fresacher, Matthias, Scarlett, Jonathan
In this paper, we consider the problem of learning an unknown graph via queries on groups of nodes, with the result indicating whether or not at least one edge is present among those nodes. We establish bounds on the number of tests for a variety of algorithms inspired by the group testing problem, with explicit constant factors indicating a near-optimal number of tests, and in some cases asymptotic optimality including constant factors.
I. INTRODUCTION
Graphs are a ubiquitous tool in modern statistics and machine learning for depicting interactions, relations, and physical connections in networks, such as social networks, biological networks, sensor networks, and so on. Often, the graph is not known a priori, and must be learned via queries to the network. In this paper, we consider the problem of graph learning via edge detecting queries, where each query contains a subset of the nodes, and the binary outcome indicates whether or not there is at least one edge among these nodes. See Section I-A for previous work on this problem. An application of this problem highlighted in previous works such as [15] is that of learning which chemicals react with each other, using tests that are able to detect whether any reaction occurs. Another potential application is learning connectivity in large wireless networks: each node is given a unique identifier, and in response to a query, each node sends feedback to a central unit if both itself and one or more of its neighbors are included in that query.
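The edge detecting query model described above can be sketched in a few lines. The decoder shown is a COMP-style elimination rule borrowed from group testing: every node pair fully contained in a negative query is ruled out as an edge. Whether this matches any specific algorithm analysed in the paper is not stated in the text here, and the function names and the tiny example graph are illustrative assumptions.

```python
from itertools import combinations


def edge_query(edges, nodes):
    """Return True iff at least one edge has both endpoints in the queried node set."""
    queried = set(nodes)
    return any(u in queried and v in queried for u, v in edges)


def eliminate_non_edges(n, queries, outcomes):
    """Drop every node pair covered by a negative query; the rest stay as candidate edges."""
    candidates = set(combinations(range(n), 2))
    for nodes, positive in zip(queries, outcomes):
        if not positive:
            candidates -= set(combinations(sorted(nodes), 2))
    return candidates


# Assumed example: 4 nodes, a single true edge (0, 1), three hand-picked queries.
true_edges = {(0, 1)}
queries = [[0, 2, 3], [1, 2, 3], [0, 1]]
outcomes = [edge_query(true_edges, q) for q in queries]
candidates = eliminate_non_edges(4, queries, outcomes)
print(outcomes, sorted(candidates))
```

A true edge can never be eliminated under noiseless outcomes, since any query containing both of its endpoints is positive. The candidate set is therefore always a superset of the edge set, and it shrinks as more negative queries are observed.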