AITopics | Education

Collaborating Authors

Education

Efficient online learning with Kernels for adversarial large scale problems

Neural Information Processing SystemsJun-2-2025, 00:46:48 GMT

We are interested in a framework of online learning with kernels for lowdimensional, but large-scale and potentially adversarial datasets. We study the computational and theoretical performance of online variations of kernel Ridge regression.

algorithm, artificial intelligence, machine learning, (15 more...)

Neural Information Processing Systems

Country:

North America > Canada (0.14)
Europe > France (0.14)

Industry: Education > Educational Setting > Online (0.71)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Enterprise Applications > Human Resources > Learning Management (0.70)

Add feedback

Experience Replay for Continual Learning

David Rolnick, Arun Ahuja, Jonathan Schwarz, Timothy Lillicrap, Gregory Wayne

Neural Information Processing SystemsJun-2-2025, 00:43:54 GMT

Interacting with a complex world involves continual learning, in which tasks and data distributions change over time. A continual learning system should demonstrate both plasticity (acquisition of new knowledge) and stability (preservation of old knowledge). Catastrophic forgetting is the failure of stability, in which new experience overwrites previous experience. In the brain, replay of past experience is widely believed to reduce forgetting, yet it has been largely overlooked as a solution to forgetting in deep reinforcement learning. Here, we introduce CLEAR, a replay-based method that greatly reduces catastrophic forgetting in multi-task reinforcement learning. CLEAR leverages off-policy learning and behavioral cloning from replay to enhance stability, as well as on-policy learning to preserve plasticity. We show that CLEAR performs better than state-of-the-art deep learning techniques for mitigating forgetting, despite being significantly less complicated and not requiring any knowledge of the individual tasks being learned.

artificial intelligence, machine learning, reinforcement learning, (20 more...)

Neural Information Processing Systems

Country: North America > United States > Pennsylvania (0.14)

Industry: Education (0.68)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.87)

Add feedback

The Broad Optimality of Profile Maximum Likelihood

Yi Hao, Alon Orlitsky

Neural Information Processing SystemsJun-2-2025, 00:37:11 GMT

We study three fundamental statistical learning problems: distribution estimation, property estimation, and property testing. We establish the profile maximum likelihood (PML) estimator as the first unified sample-optimal approach to a wide range of learning tasks.

artificial intelligence, estimator, machine learning, (17 more...)

Neural Information Processing Systems

Country: North America > United States > California (0.14)

Industry: Education (0.34)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.67)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.62)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.62)

Add feedback

DropEdge not Foolproof: Effective Augmentation Method for Signed Graph Neural Networks, Lu Li

Neural Information Processing SystemsJun-2-2025, 00:28:33 GMT

Signed graphs can model friendly or antagonistic relations where edges are annotated with a positive or negative sign. Signed Graph Neural Networks (SGNNs) have been widely used for signed graph representation learning. While significant progress has been made in SGNNs research, two issues (i.e., graph sparsity and unbalanced triangles) persist in the current SGNN models. We aim to alleviate these issues through data augmentation (DA) techniques which have demonstrated effectiveness in improving the performance of graph neural networks. However, most graph augmentation methods are primarily aimed at graph-level and node-level tasks (e.g., graph classification and node classification) and cannot be directly applied to signed graphs due to the lack of side information (e.g., node features and label information) in available real-world signed graph datasets. Random DropEdge is one of the few DA methods that can be directly used for signed graph data augmentation, but its effectiveness is still unknown.

artificial intelligence, dataset, machine learning, (17 more...)

Neural Information Processing Systems

Country: Asia > China (0.14)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Industry:

Education (0.46)
Information Technology (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Add feedback

Episodic Memory in Lifelong Language Learning

Cyprien de Masson d'Autume, Sebastian Ruder, Lingpeng Kong, Dani Yogatama

Neural Information Processing SystemsJun-2-2025, 00:23:46 GMT

We introduce a lifelong language learning setup where a model needs to learn from a stream of text examples without any dataset identifier. We propose an episodic memory model that performs sparse experience replay and local adaptation to mitigate catastrophic forgetting in this setup. Experiments on text classification and question answering demonstrate the complementary benefits of sparse experience replay and local adaptation to allow the model to continuously learn from new datasets. We also show that the space complexity of the episodic memory module can be reduced significantly ( 50-90%) by randomly choosing which examples to store in memory with a minimal decrease in performance. We consider an episodic memory component as a crucial building block of general linguistic intelligence and see our model as a first step in that direction.

latexit sha1, machine learning, natural language, (18 more...)

Neural Information Processing Systems

Country: North America > United States (0.28)

Genre: Research Report > New Finding (0.94)

Industry:

Health & Medicine > Consumer Health (1.00)
Education > Curriculum > Subject-Specific Education (0.61)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Scripts & Frames (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.69)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)

Add feedback

Gated Slot Attention for Efficient Linear-Time Sequence Modeling

Neural Information Processing SystemsJun-2-2025, 00:18:02 GMT

Linear attention Transformers and their gated variants, celebrated for enabling parallel training and efficient recurrent inference, still fall short in recall-intensive tasks compared to traditional Transformers and demand significant resources for training from scratch. This paper introduces Gated Slot Attention (GSA), which enhances Attention with Bounded-memory-Control (ABC [63]) by incorporating a gating mechanism inspired by Gated Linear Attention (GLA [96]). Essentially, GSA comprises a two-layer GLA linked via softmax, utilizing context-aware memory reading and adaptive forgetting to improve memory capacity while maintaining compact recurrent state size. This design greatly enhances both training and inference efficiency through GLA's hardware-efficient training algorithm and reduced state size. Additionally, retaining the softmax operation is particularly beneficial in "finetuning pretrained Transformers to RNNs" (T2R [41]) settings, reducing the need for extensive training from scratch. Extensive experiments confirm GSA's superior performance in scenarios requiring in-context recall and in T2R settings.

large language model, machine learning, natural language, (20 more...)

Neural Information Processing Systems

Country:

Europe (0.92)
North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.14)
North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)

Genre: Research Report > Experimental Study (0.93)

Industry:

Education (0.92)
Health & Medicine > Consumer Health (0.54)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Strategic Linear Contextual Bandits

Neural Information Processing SystemsJun-2-2025, 00:03:45 GMT

Motivated by the phenomenon of strategic agents gaming a recommender system to maximize the number of times they are recommended to users, we study a strategic variant of the linear contextual bandit problem, where the arms can strategically misreport privately observed contexts to the learner. We treat the algorithm design problem as one of mechanism design under uncertainty and propose the Optimistic Grim Trigger Mechanism (OptGTM) that incentivizes the agents (i.e., arms) to report their contexts truthfully while simultaneously minimizing regret. We also show that failing to account for the strategic nature of the agents results in linear regret. However, a trade-off between mechanism design and regret minimization appears to be unavoidable. More broadly, this work aims to provide insight into the intersection of online learning and mechanism design.

artificial intelligence, big data, data mining, (16 more...)

Neural Information Processing Systems

Country:

North America > United States (0.14)
Europe (0.14)

Genre: Research Report > Experimental Study (0.92)

Industry:

Information Technology (0.46)
Government (0.45)
Education > Educational Setting > Online (0.34)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Data Science > Data Mining > Big Data (0.66)

Add feedback

Iterative Reasoning Preference Optimization Richard Yuanzhe Pang 1,2 Weizhe Yuan 1,2 He He

Neural Information Processing SystemsJun-2-2025, 00:03:26 GMT

Iterative preference optimization methods have recently been shown to perform well for general instruction tuning tasks, but typically make little improvement on reasoning tasks [Yuan et al., 2024, Chen et al., 2024]. In this work we develop an iterative approach that optimizes the preference between competing generated Chain-of-Thought (CoT) candidates by optimizing for winning vs. losing reasoning steps. We train using a modified DPO loss [Rafailov et al., 2023] with an additional negative log-likelihood term, which we find to be crucial. We show reasoning improves across repeated iterations of this scheme. While only relying on examples in the training set, our approach results in increasing accuracy on GSM8K, MATH, and ARC-Challenge for Llama-2-70B-Chat, outperforming other Llama-2-based models not relying on additionally sourced datasets. For example, we see a large improvement from 55.6% to 81.6% on GSM8K and an accuracy of 88.7% with majority voting out of 32 samples.

large language model, machine learning, natural language, (19 more...)

Neural Information Processing Systems

Country:

North America > Canada (0.14)
Asia > Thailand (0.14)

Genre: Research Report > Experimental Study (0.93)

Industry: Education (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.87)

Add feedback

Nuclear Norm Regularization for Deep Learning

Neural Information Processing SystemsJun-1-2025, 23:42:25 GMT

Penalizing the nuclear norm of a function's Jacobian encourages it to locally behave like a low-rank linear map. Such functions vary locally along only a handful of directions, making the Jacobian nuclear norm a natural regularizer for machine learning problems. However, this regularizer is intractable for high-dimensional problems, as it requires computing a large Jacobian matrix and taking its SVD. We show how to efficiently penalize the Jacobian nuclear norm using techniques tailormade for deep learning. We prove that for functions parametrized as compositions f = g h, one may equivalently penalize the average squared Frobenius norms of Jg and Jh. We then propose a denoising-style approximation that avoids Jacobian computations altogether. Our method is simple, efficient, and accurate, enabling Jacobian nuclear norm regularization to scale to high-dimensional deep learning problems. We complement our theory with an empirical study of our regularizer's performance and investigate applications to denoising and representation learning.

artificial intelligence, machine learning, singular value, (16 more...)

Neural Information Processing Systems

Country: North America > United States (0.46)

Genre: Research Report > Experimental Study (0.93)

Industry:

Information Technology (0.67)
Health & Medicine (0.45)
Education > Focused Education > Special Education (0.45)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Minimizing Regret on Reflexive Banach Spaces and Nash Equilibria in Continuous Zero-Sum Games

Maximilian Balandat, Walid Krichene, Claire Tomlin, Alexandre Bayen

Neural Information Processing SystemsJun-1-2025, 23:41:49 GMT

We study a general adversarial online learning problem, in which we are given a decision set X in a reflexive Banach space X and a sequence of reward vectors in the dual space of X. At each iteration, we choose an action from X, based on the observed sequence of previous rewards. Our goal is to minimize regret. Using results from infinite dimensional convex analysis, we generalize the method of Dual Averaging to our setting and obtain upper bounds on the worst-case regret that generalize many previous results. Under the assumption of uniformly continuous rewards, we obtain explicit regret bounds in a setting where the decision set is the set of probability distributions on a compact metric space S. Importantly, we make no convexity assumptions on either S or the reward functions. We also prove a general lower bound on the worst-case regret for any online algorithm. We then apply these results to the problem of learning in repeated two-player zero-sum games on compact metric spaces. In doing so, we first prove that if both players play a Hannan-consistent strategy, then with probability 1 the empirical distributions of play weakly converge to the set of Nash equilibria of the game. We then show that, under mild assumptions, Dual Averaging on the (infinite-dimensional) space of probability distributions indeed achieves Hannan-consistency.

artificial intelligence, machine learning, sequence, (16 more...)

Neural Information Processing Systems

Country: