AITopics | Statistical Learning

Two-Stage Learning to Defer with Multiple Experts

Neural Information Processing Systems | Apr-24-2026, 15:50:02 GMT

We study a two-stage scenario for learning to defer with multiple experts, which is crucial in practice for many applications. In this scenario, a predictor is derived in a first stage by training with a common loss function such as cross-entropy. In the second stage, a deferral function is learned to assign the most suitable expert to each input. We design a new family of surrogate loss functions for this scenario in both the score-based and the predictor-rejector settings and prove that they are supported by H-consistency bounds, which implies their Bayes-consistency. Moreover, we show that, for a constant cost function, our two-stage surrogate losses are realizable H-consistent. While the main focus of this work is a theoretical analysis, we also report the results of several experiments on the CIFAR-10 and SVHN datasets.

artificial intelligence, machine learning, surrogate loss, (15 more...)

Country: North America > United States (0.28)

Genre: Research Report (0.45)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (0.67)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.46)

Differentiable Unsupervised Feature Selection based on a Gated Laplacian - Supplementary Materials

Neural Information Processing Systems | Apr-24-2026, 15:48:22 GMT

It is important to properly tune the kernel scale (bandwidth) σ_b, which determines the scale of connectivity. Several studies have proposed schemes for tuning σ_b; see, for example, [10, 3, 12, 5]. Here, we focus on two schemes: a global bandwidth and a local bandwidth. The local bandwidth proposed in [12] involves setting a local scale σ_i for each data point x_i, i = 1, ..., n. The scale is chosen using the L1 distance from the k-th nearest neighbor of the point x_i.
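
The local-bandwidth rule described above can be illustrated with a minimal sketch. It assumes that each point's scale is simply the L1 distance to its k-th nearest neighbor, excluding the point itself. The function names `l1_distance` and `local_scales` are hypothetical and do not come from the paper, and how the scales enter the kernel is not shown because the excerpt does not specify it.

```python
def l1_distance(a, b):
    """L1 (Manhattan) distance between two equal-length points."""
    return sum(abs(x - y) for x, y in zip(a, b))


def local_scales(points, k):
    """Return one scale per point: the L1 distance to its k-th nearest neighbor.

    Assumption: the point itself is excluded from its own neighbor list.
    """
    n = len(points)
    if not 1 <= k < n:
        raise ValueError("k must satisfy 1 <= k < number of points")
    scales = []
    for i, p in enumerate(points):
        # Distances from point i to every other point, smallest first.
        dists = sorted(l1_distance(p, q) for j, q in enumerate(points) if j != i)
        scales.append(dists[k - 1])
    return scales


if __name__ == "__main__":
    pts = [(0.0, 0.0), (1.0, 0.0), (0.0, 2.0), (5.0, 5.0)]
    print(local_scales(pts, k=2))
```

An isolated point such as `(5.0, 5.0)` receives a larger scale than points in the dense region, which is the intended effect of a local rather than a global bandwidth.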

artificial intelligence, dataset, machine learning, (15 more...)

Industry: Health & Medicine (0.69)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.48)

Differentiable Unsupervised Feature Selection based on a Gated Laplacian

Neural Information Processing Systems | Apr-24-2026, 15:48:18 GMT

Scientific observations may consist of a large number of variables (features). Selecting a subset of meaningful features is often crucial for identifying patterns hidden in the ambient space. In this paper, we present a method for unsupervised feature selection, and we demonstrate its advantage in clustering, a common unsupervised task. We propose a differentiable loss that combines a graph Laplacian-based score that favors low-frequency features with a gating mechanism for removing nuisance features. Our method improves upon the naive graph Laplacian score by replacing it with a gated variant computed on a subset of low-frequency features. We identify this subset by learning the parameters of continuously relaxed Bernoulli variables, which gate the entire feature space. We mathematically motivate the proposed approach and demonstrate that it is crucial to compute the graph Laplacian on the gated inputs rather than on the full feature space in the high noise regime. Using several real-world examples, we demonstrate the efficacy and advantage of the proposed approach over leading baselines.

artificial intelligence, laplacian, machine learning, (16 more...)

Country: North America > United States (0.68)

Genre: Research Report > New Finding (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)

A Reduction to Binary Approach for Debiasing Multiclass Datasets

Neural Information Processing Systems | Apr-24-2026, 15:33:13 GMT

We propose a novel reduction-to-binary (R2B) approach that enforces demographic parity for multiclass classification with non-binary sensitive attributes via a reduction to a sequence of binary debiasing tasks. We prove that R2B satisfies optimality and bias guarantees and demonstrate empirically that it can lead to an improvement over two baselines: (1) treating multiclass problems as multi-label by debiasing labels independently and (2) transforming the features instead of the labels. Surprisingly, we also demonstrate that independent label debiasing yields competitive results in most (but not all) settings.
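
To make the demographic parity criterion concrete for the multiclass, non-binary-attribute case, here is a minimal sketch of one common way to quantify a violation: the largest gap, over classes, between the per-group rates at which that class is predicted. This is only a measurement, not the R2B debiasing procedure, and the function name `demographic_parity_gap` is hypothetical.

```python
from collections import Counter, defaultdict


def demographic_parity_gap(predictions, groups):
    """Largest difference, over classes, between the highest and lowest
    per-group rate of predicting that class. 0.0 means exact parity."""
    if len(predictions) != len(groups):
        raise ValueError("predictions and groups must have the same length")
    if not predictions:
        return 0.0
    # counts[g][c] = number of examples in group g predicted as class c.
    counts = defaultdict(Counter)
    for y_hat, g in zip(predictions, groups):
        counts[g][y_hat] += 1
    gap = 0.0
    for c in set(predictions):
        rates = [counts[g][c] / sum(counts[g].values()) for g in counts]
        gap = max(gap, max(rates) - min(rates))
    return gap


if __name__ == "__main__":
    preds = ["a", "a", "b", "c", "a", "b", "b", "c", "c", "c", "a", "b"]
    attrs = ["x"] * 4 + ["y"] * 4 + ["z"] * 4
    print(demographic_parity_gap(preds, attrs))
```

A gap of zero means every group receives each label at the same rate, so the prediction carries no information about the sensitive attribute at the level of label frequencies.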

artificial intelligence, deep learning, machine learning, (16 more...)

Country:

North America > United States (0.93)
Europe (0.67)

Genre: Research Report > New Finding (0.67)

Industry:

Health & Medicine (1.00)
Government > Regional Government > North America Government > United States Government (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.67)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)

10eaa0aae94b34308e9b3fa7b677cbe1-Paper-Conference.pdf

Neural Information Processing Systems | Apr-24-2026, 15:33:10 GMT

artificial intelligence, deep learning, machine learning, (16 more...)

Country:

North America > United States (0.93)
Europe (0.67)

Genre: Research Report > New Finding (0.68)

Industry:

Health & Medicine (1.00)
Government > Regional Government > North America Government > United States Government (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)

Uniform Sampling over Episode Difficulty

Neural Information Processing Systems | Apr-24-2026, 15:32:59 GMT

Episodic training is a core ingredient of few-shot learning to train models on tasks with limited labelled data. Despite its success, episodic training remains largely understudied, prompting us to ask the question: what is the best way to sample episodes? In this paper, we first propose a method to approximate episode sampling distributions based on their difficulty. Building on this method, we perform an extensive analysis and find that sampling uniformly over episode difficulty outperforms other sampling schemes, including curriculum and easy-/hard-mining. As the proposed sampling method is algorithm agnostic, we can leverage these insights to improve few-shot learning accuracies across many episodic training algorithms. We demonstrate the efficacy of our method across popular few-shot learning datasets, algorithms, network architectures, and protocols.
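
As a rough illustration of sampling uniformly over episode difficulty, the sketch below reweights a pool of episodes so that each occupied difficulty bin is drawn with equal probability. The abstract does not say how difficulty is measured or how the sampling distribution is approximated, so the equal-width histogram used here, and the name `uniform_difficulty_weights`, are assumptions made purely for illustration.

```python
import random
from collections import Counter


def uniform_difficulty_weights(difficulties, num_bins=5):
    """Per-episode sampling weights that give every non-empty difficulty
    bin the same total probability (assumed equal-width binning)."""
    lo, hi = min(difficulties), max(difficulties)
    width = (hi - lo) / num_bins
    if width == 0:
        width = 1.0  # all episodes equally difficult: a single bin
    # Clamp so the maximum difficulty falls in the last bin.
    bins = [min(int((d - lo) / width), num_bins - 1) for d in difficulties]
    counts = Counter(bins)
    raw = [1.0 / counts[b] for b in bins]
    total = sum(raw)
    return [w / total for w in raw]


if __name__ == "__main__":
    # Hypothetical difficulty scores: four easy episodes and one hard one.
    scores = [0.10, 0.12, 0.15, 0.11, 0.90]
    weights = uniform_difficulty_weights(scores, num_bins=2)
    rng = random.Random(0)
    batch = rng.choices(range(len(scores)), weights=weights, k=8)
    print(weights, batch)
```

With these weights the single hard episode is drawn about as often as the four easy ones combined, instead of one time in five under plain uniform sampling over episodes.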

algorithm, artificial intelligence, machine learning, (17 more...)

Country:

North America > United States > California (0.46)
North America > United States > New York (0.28)

Genre: Research Report > Experimental Study (0.46)

Industry:

Media > Television (0.46)
Leisure & Entertainment (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.93)

10a6bdcabbd5a3d36b760daa295f63c1-Paper-Conference.pdf

Neural Information Processing Systems | Apr-24-2026, 15:32:28 GMT

machine learning, natural language, reinforcement learning, (15 more...)

Industry:

Leisure & Entertainment > Games (0.68)
Education (0.68)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Data Science (0.93)
(3 more...)

109cf25cbc36037deecdbeabfa199956-Supplemental-Conference.pdf

Neural Information Processing Systems | Apr-24-2026, 15:32:13 GMT

artificial intelligence, deep learning, machine learning, (18 more...)

Country: Asia (0.28)

Genre: Research Report (0.92)

Industry:

Leisure & Entertainment (0.92)
Information Technology (0.67)
Media > Film (0.45)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Personal Assistant Systems (1.00)
Information Technology > Communications (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.93)
(2 more...)

0b0d29e5d5c8a7a25dced6405bd022a9-Paper.pdf

Neural Information Processing Systems | Apr-24-2026, 15:31:59 GMT

We introduce regularized Frank-Wolfe, a general and effective algorithm for inference and learning of dense conditional random fields (CRFs). The algorithm optimizes a nonconvex continuous relaxation of the CRF inference problem using vanilla Frank-Wolfe with approximate updates, which are equivalent to minimizing a regularized energy function. Our proposed method is a generalization of existing algorithms such as mean field or concave-convex procedure. This perspective not only offers a unified analysis of these algorithms, but also allows an easy way of exploring different variants that potentially yield better performance. We illustrate this in our empirical results on standard semantic segmentation datasets, where several instantiations of our regularized Frank-Wolfe outperform mean field inference, both as a standalone component and as an end-to-end trainable layer in a neural network. We also show that dense CRFs, coupled with our new algorithms, produce significant improvements over strong CNN baselines.
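
For readers unfamiliar with the base algorithm, the following is a minimal sketch of vanilla Frank-Wolfe over the probability simplex, applied to a toy convex quadratic. It is not the regularized variant or the dense CRF relaxation described in the abstract; the objective, the standard step size 2/(t+2), and the name `frank_wolfe_simplex` are illustrative assumptions.

```python
def frank_wolfe_simplex(grad, x0, num_steps=200):
    """Vanilla Frank-Wolfe on the probability simplex.

    grad: function mapping a point (list of floats) to its gradient.
    x0:   starting point on the simplex.
    """
    x = list(x0)
    for t in range(num_steps):
        g = grad(x)
        # Linear minimization oracle: over the simplex, the minimizer of
        # <g, s> is the vertex at the coordinate with the smallest gradient.
        i = min(range(len(x)), key=g.__getitem__)
        gamma = 2.0 / (t + 2.0)
        # Convex combination of the current iterate and that vertex,
        # so the iterate never leaves the simplex (no projection needed).
        x = [(1.0 - gamma) * xj for xj in x]
        x[i] += gamma
    return x


if __name__ == "__main__":
    # Toy objective (assumption): f(x) = 0.5 * ||x - target||^2.
    target = [0.2, 0.5, 0.3]

    def grad(x):
        return [xj - tj for xj, tj in zip(x, target)]

    print(frank_wolfe_simplex(grad, [1.0, 0.0, 0.0], num_steps=2000))
```

Each iteration only needs a gradient and an argmin over coordinates, which is what makes this family of methods attractive for simplex-constrained relaxations.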

algorithm, artificial intelligence, machine learning, (19 more...)

Country: Europe (0.68)

Genre: Research Report > New Finding (0.67)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.67)

GlucoSynth: Generating Differentially-Private Synthetic Glucose Traces

Neural Information Processing Systems | Apr-24-2026, 15:31:54 GMT

We focus on the problem of generating high-quality, private synthetic glucose traces, a task generalizable to many other time series sources. Existing methods for time series data synthesis, such as those using Generative Adversarial Networks (GANs), are not able to capture the innate characteristics of glucose data and cannot provide any formal privacy guarantees without severely degrading the utility of the synthetic data. In this paper we present GlucoSynth, a novel privacy-preserving GAN framework to generate synthetic glucose traces. The core intuition behind our approach is to conserve relationships amongst motifs (glucose events) within the traces, in addition to temporal dynamics. Our framework incorporates differential privacy mechanisms to provide strong formal privacy guarantees.

artificial intelligence, machine learning, motif, (18 more...)

Genre: Research Report (0.68)

Industry: