From Parameter to Representation: A Closed-Form Approach for Controllable Model Merging

Wu, Jialin, Yang, Jian, Wang, Handing, Wen, Jiajun, Yu, Zhiyong

arXiv.org Artificial Intelligence

Model merging combines expert models for multitask performance but faces challenges from parameter interference. This has sparked recent interest in controllable model merging, giving users the ability to explicitly balance performance trade-offs. Existing approaches employ a compile-then-query paradigm, performing a costly offline multi-objective optimization to enable fast, preference-aware model generation. This offline stage typically involves iterative search or dedicated training, with complexity that grows exponentially with the number of tasks. To overcome these limitations, we shift the perspective from parameter-space optimization to a direct correction of the model's final representation. Our approach models this correction as an optimal linear transformation, yielding a closed-form solution that replaces the entire offline optimization process with a single-step, architecture-agnostic computation. This solution directly incorporates user preferences, allowing a Pareto-optimal model to be generated on-the-fly with complexity that scales linearly with the number of tasks. Experimental results show our method generates a superior Pareto front with more precise preference alignment and drastically reduced computational cost.
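The closed-form idea in this abstract can be illustrated with a toy preference-weighted least-squares problem. The sketch below is an assumption-laden illustration, not the paper's actual method: it models the correction as a single linear map W chosen to minimize the preference-weighted distance between corrected representations W·H and each expert's representations Z_t, which admits a one-step normal-equations solution (all names and shapes here are hypothetical).

```python
import numpy as np

def preference_correction(H, targets, weights):
    """Closed-form linear correction (illustrative toy, not the paper's exact method).

    H       : (d, n) merged-model representations on n probe inputs
    targets : list of (d, n) expert representations Z_t, one per task
    weights : per-task user preference weights
    Minimizes sum_t w_t * ||W @ H - Z_t||_F^2 over linear maps W, giving the
    single-step solution W = (sum_t w_t Z_t H^T) (H H^T)^+.
    """
    A = sum(w * Z @ H.T for w, Z in zip(weights, targets))
    return A @ np.linalg.pinv(H @ H.T)
```

Note the cost scales linearly with the number of tasks (one weighted sum), consistent with the complexity claim in the abstract; with a single target task, W simply recovers that expert's map.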


TestRank: Bringing Order into Unlabeled Test Instances for Deep Learning Tasks

Neural Information Processing Systems

Deep learning (DL) systems are notoriously difficult to test and debug due to the lack of a correctness proof and the huge test input space to cover. Given the ubiquity of unlabeled test data and the high cost of labeling, we propose in this paper a novel test prioritization technique, TestRank, which aims to reveal more model failures with less labeling effort. TestRank orders the unlabeled test data according to their likelihood of being failures, i.e., their failure-revealing capability. Specifically, we first build a similarity graph over both unlabeled test samples and labeled samples (e.g., training or previously labeled test samples). Then, we conduct graph-based semi-supervised learning to extract contextual features from the correctness of similar labeled samples.
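The graph-based intuition can be sketched with a small label-propagation toy: seed a similarity graph with the known correctness of labeled samples and propagate failure evidence to the unlabeled nodes. This is only an illustrative simplification (TestRank itself learns with a GNN and combines further features); the function and parameter names below are hypothetical.

```python
import numpy as np

def failure_scores(x_labeled, is_failure, x_unlabeled, k=2, alpha=0.8, n_iter=30):
    """Rank unlabeled inputs by likely failure via label propagation (toy sketch)."""
    X = np.vstack([x_labeled, x_unlabeled])
    n, n_lab = len(X), len(x_labeled)
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)   # pairwise sq. distances
    np.fill_diagonal(d2, np.inf)
    W = np.zeros((n, n))
    for i in range(n):                        # symmetric k-NN similarity graph
        for j in np.argsort(d2[i])[:k]:
            W[i, j] = W[j, i] = np.exp(-d2[i, j])
    W /= W.sum(axis=1, keepdims=True).clip(1e-12)
    seed = np.zeros(n)
    seed[:n_lab] = np.asarray(is_failure, dtype=float)    # 1 = known failure
    f = seed.copy()
    for _ in range(n_iter):                   # propagate failure evidence
        f = alpha * (W @ f) + (1 - alpha) * seed
    return f[n_lab:]                          # higher = more failure-revealing
```

An unlabeled input close to known failures then receives a higher score than one close to correctly handled samples, which is the ordering the labeling budget would follow.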


A Collective Learning Framework to Boost GNN Expressiveness

Hang, Mengyue, Neville, Jennifer, Ribeiro, Bruno

arXiv.org Machine Learning

Graph Neural Networks (GNNs) have recently been used for node and graph classification tasks with great success, but GNNs model dependencies among the attributes of neighboring nodes rather than dependencies among observed node labels. In this work, we consider the task of inductive node classification using GNNs in supervised and semi-supervised settings, with the goal of incorporating label dependencies. Because current GNNs are not universal (i.e., most-expressive) graph representations, we propose a general collective learning approach to increase the representation power of any existing GNN. Our framework combines ideas from collective classification with self-supervised learning, and uses a Monte Carlo approach to sampling embeddings for inductive learning across graphs. We evaluate performance on five real-world network datasets and demonstrate consistent, significant improvements in node classification accuracy for a variety of state-of-the-art GNNs.
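One ingredient described above, injecting label dependencies via Monte Carlo sampling, can be roughly sketched as follows. This is a hedged simplification of the collective-classification idea, not the paper's implementation; the function name and feature scheme are hypothetical.

```python
import numpy as np

def neighbor_label_features(probs, adj, n_samples=50, seed=0):
    """Monte Carlo neighbor-label features (illustrative sketch only).

    probs : (n, c) class probabilities from a base node classifier
    adj   : (n, n) adjacency matrix
    Samples hard labels from `probs`, counts each node's neighbor labels,
    and averages the counts over samples, producing extra input features
    that encode dependencies among observed node labels.
    """
    rng = np.random.default_rng(seed)
    n, c = probs.shape
    extra = np.zeros((n, c))
    for _ in range(n_samples):
        y = np.array([rng.choice(c, p=p) for p in probs])  # one label sample
        extra += adj @ np.eye(c)[y]                        # neighbor label counts
    return extra / n_samples
```

These averaged neighbor-label counts could then be concatenated to the node attributes before a second classification stage, so the downstream model sees label context that a standard attribute-only GNN does not.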


Model Specification Test with Unlabeled Data: Approach from Covariate Shift

Kato, Masahiro, Kawarazaki, Hikaru

arXiv.org Machine Learning

We propose a novel framework for model specification testing in regression using unlabeled test data. In many cases, statistical inference is conducted under the assumption that the model is correctly specified; however, it is difficult to confirm whether this is so, and existing works have devised statistical tests for model specification to address the problem. These works define a correctly specified regression model as one whose error term has zero conditional mean over the training data only. Extending the definition used in conventional statistical tests, we define a correctly specified model as one whose error term has zero conditional mean over any distribution of the explanatory variable. This definition is a natural consequence of the orthogonality of the explanatory variable and the error term. If a model does not satisfy this condition, it may lack robustness to distribution shift. The proposed method enables us to reject a model that is misspecified under our definition. By applying it, we can obtain a model that predicts labels for the unlabeled test data well without losing the interpretability of the model. In experiments, we show how the proposed method works on synthetic and real-world datasets.
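The gap between the two definitions can be made concrete with a deterministic toy example: a linear model fit to a quadratic truth satisfies the orthogonality moment E[e·x] = 0 exactly on the training covariates (by OLS orthogonality), yet violates it badly under a shifted covariate distribution. Purely for illustration, the sketch below evaluates the error on the shifted covariates using the true function; the proposed test itself requires only unlabeled test covariates, and all names here are hypothetical.

```python
import numpy as np

def fit_line(x, y):
    # Ordinary least squares for y = b0 + b1 * x
    X = np.column_stack([np.ones_like(x), x])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta

def moment(x, y, beta):
    # Sample analogue of the orthogonality moment E[e * x]
    e = y - (beta[0] + beta[1] * x)
    return float(np.mean(e * x))

f = lambda x: x ** 2                     # true regression function (quadratic)
x_tr = np.linspace(-1.0, 1.0, 201)       # training covariates
x_te = np.linspace(1.0, 3.0, 201)        # shifted test covariates
beta = fit_line(x_tr, f(x_tr))           # misspecified linear model

m_tr = moment(x_tr, f(x_tr), beta)       # ~0 by in-sample OLS orthogonality
m_te = moment(x_te, f(x_te), beta)       # far from 0 under the covariate shift
```

A train-only test sees m_tr ≈ 0 and cannot reject the linear model, whereas the shift-robust definition above flags the large moment m_te, which is the kind of misspecification the proposed framework is designed to detect.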