AITopics | Statistical Learning

When solving two-player zero-sum games, multi-agent reinforcement learning (MARL) algorithms often create populations of agents where, at each iteration, a new agent is discovered as the best response to a mixture over the opponent population. Within such a process, the update rules of "who to compete with" (i.e., the opponent mixture) and "how to beat them" (i.e., finding best responses) are underpinned by manually developed game theoretical principles such as fictitious play and Double Oracle. In this paper1, we introduce a novel framework--Neural Auto-Curricula (NAC)--that leverages meta-gradient descent to automate the discovery of the learning update rule without explicit human design. Specifically, we parameterise the opponent selection module by neural networks and the bestresponse module by optimisation subroutines, and update their parameters solely via interaction with the game engine, where both players aim to minimise their exploitability. Surprisingly, even without human design, the discovered MARL algorithms achieve competitive or even better performance with the state-of-the-art population-based game solvers (e.g., PSRO) on Games of Skill, differentiable Lotto, non-transitive Mixture Games, Iterated Matching Pennies, and Kuhn Poker. Additionally, we show that NAC is able to generalise from small games to large games, for example training on Kuhn Poker and outperforming PSRO on Leduc Poker. Our work inspires a promising future direction to discover general MARL algorithms solely from data.

arxiv preprint arxiv, machine learning, reinforcement learning, (15 more...)

Neural Information Processing Systems

Country: North America > United States (0.28)

Industry: Leisure & Entertainment > Games > Computer Games (0.34)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.35)

Add feedback

Appendix: Learning Compact Representations of Neural Networks using DiscriminAtive Masking (DAM) AAnalysis of the DAMGate Function Dynamics During Training

Neural Information Processing SystemsApr-24-2026, 23:49:53 GMT

In this section, we theoretically analyze the dynamics of the DAM mask gi at the i-th layer as the training process unfolds. The loss function for training the neural network for the target task can then be denoted as L= L(f(x,Θ,βi)) (e.g., cross-entropy loss for supervised structured pruning problems and reconstruction error for representation learning problems), where xdenotes the input features to the neural network. Using gradient descent methods with a learning rate of η, the expected update formula of βi in DAM is given by: βi = ηEx Dtr [ βiL(f(x,Θ,βi)) + λ βiβi/(l 1)] (2) = ηEx Dtr [ βiL(f(x,Θ,βi))] ηλ/(l 1) (3) Let hi be the layer output before applying the DAM mask, and the masked output be represented as oi = hi gi after applying the gate. For the j-th neuron, gij/ βi = 0 if and only if ξj(βi)/ βi = 0. Since tanh(z) has non-zero gradients for z >0, the gradient of ξj(βi) is 0 only when kj/ni + βi 0, i.e., the mask value of the neuron is 0 (or in other words, it is deactivated or dead). Let us denote the set of all neuron indices with non-zero mask values (also referred to as active neurons) as J. Equation 4 can then be simplified as: βiL(f(x,Θ,βi)) = αi X We can make the following two observations: (i) only those neurons that are active (i.e., have non-zero mask values) have a contribution towards updating βi and moving the gate function. We name these neurons as support neurons and their position in the ordering of neurons as the transitioning zone of the gate function.

artificial intelligence, experiment, machine learning, (14 more...)

Neural Information Processing Systems

Genre: Research Report (0.67)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.48)

Add feedback

10cb15f4559b3d578b7f24966d48a137-Paper-Conference.pdf

Neural Information Processing SystemsApr-24-2026, 23:48:57 GMT

artificial intelligence, machine learning, reinforcement learning, (12 more...)

Neural Information Processing Systems

Country: North America > United States (0.46)

Industry:

Leisure & Entertainment > Games (0.68)
Information Technology (0.46)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
(2 more...)

Add feedback

1c336b8080f82bcc2cd2499b4c57261d-Supplemental.pdf

Neural Information Processing SystemsApr-24-2026, 23:34:08 GMT

artificial intelligence, inv, machine learning, (19 more...)

Neural Information Processing Systems

Country:

North America > Canada (0.28)
North America > United States (0.27)

Genre: Research Report > New Finding (0.67)

Industry: Health & Medicine (0.67)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.93)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.67)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Add feedback

Invariance Principle Meets Information Bottleneck for Out-of-Distribution Generalization

Neural Information Processing SystemsApr-24-2026, 23:34:04 GMT

The invariance principle from causality is at the heart of notable approaches such as invariant risk minimization (IRM) that seek to address out-of-distribution (OOD) generalization failures. Despite the promising theory, invariance principle-based approaches fail in common classification tasks, where invariant (causal) features capture all the information about the label. Are these failures due to the methods failing to capture the invariance? Or is the invariance principle itself insufficient? To answer these questions, we revisit the fundamental assumptions in linear regression tasks, where invariance-based approaches were shown to provably generalize OOD. In contrast to the linear regression tasks, we show that for linear classification tasks we need much stronger restrictions on the distribution shifts, or otherwise OOD generalization is impossible. Furthermore, even with appropriate restrictions on distribution shifts in place, we show that the invariance principle alone is insufficient. We prove that a form of the information bottleneck constraint along with invariance helps address key failures when invariant features capture all the information about the label and also retains the existing success when they do not. We propose an approach that incorporates both of these principles and demonstrate its effectiveness in several experiments.

artificial intelligence, invariant feature, machine learning, (17 more...)

Neural Information Processing Systems

Country: North America > Canada (0.28)

Genre: Research Report (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.70)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Add feedback

Locally private online change point detection

Neural Information Processing SystemsApr-24-2026, 23:33:25 GMT

We study online change point detection problems under the constraint of local differential privacy (LDP) where, in particular, the statistician does not have access to the raw data. As a concrete problem, we study a multivariate nonparametric regression problem. At each time point t, the raw data are assumed to be of the form (Xt,Yt), where Xt is a d-dimensional feature vector and Yt is a response variable. Our primary aim is to detect changes in the regression function mt(x) = E(Yt|Xt = x) as soon as the change occurs. We provide algorithms which respect the LDP constraint, which control the false alarm probability, and which detect changes with a minimal (minimax rate-optimal) delay. To quantify the cost of privacy, we also present the optimal rate in the benchmark, non-private setting. These non-private results are also new to the literature and thus are interesting per se. In addition, we study the univariate mean online change point detection problem, under privacy constraints. This serves as the blueprint of studying more complicated private change point detection problems.

artificial intelligence, machine learning, privacy, (14 more...)

Neural Information Processing Systems

Country: Europe (0.46)

Industry:

Information Technology > Security & Privacy (1.00)
Health & Medicine (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.56)

Add feedback

Adversarially Robust Learning with Uncertain Perturbation Sets

Neural Information Processing SystemsApr-24-2026, 23:33:18 GMT

In many real-world settings exact perturbation sets to be used by an adversary are not plausibly available to a learner. While prior literature has studied both scenarios with completely known and completely unknown perturbation sets, we propose an in-between setting of learning with respect to a class of perturbation sets. We show that in this setting we can improve on previous results with completely unknown perturbation sets, while still addressing the concerns of not having perfect knowledge of these sets in real life. In particular, we give the first positive results for the learnability of infinite Littlestone classes when having access to a perfect-attack oracle. We also consider a setting of learning with abstention, where predictions are considered robustness violations, only when the wrong label prediction is made within the perturbation set. We show there are classes for which perturbation-set unaware learning without query access is possible, but abstention is required.

artificial intelligence, machine learning, perturbation type, (13 more...)

Neural Information Processing Systems

Country: North America > Canada > Ontario (0.28)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.46)

Add feedback