AITopics | Reinforcement Learning


Reinforcement Learning

"Reinforcement learning is learning what to do – how to map situations to actions – so as to maximize a numerical reward signal. The learner is not told which actions to take, as in most forms of machine learning, but instead must discover which actions yield the most reward by trying them."
– Sutton, Richard S. and Andrew G. Barto. Reinforcement Learning: An Introduction. (1.1). MIT Press, Cambridge, MA, 1998.


Multimodal Model-Agnostic Meta-Learning via Task-Aware Modulation

Risto Vuorio, Shao-Hua Sun, Hexiang Hu, Joseph J. Lim

Neural Information Processing Systems | Aug-20-2025, 07:06:23 GMT

Humans make effective use of prior knowledge to acquire new skills rapidly.

international conference, multimodal task distribution, task distribution, (12 more...)

Neural Information Processing Systems

Country:

North America > United States > California (0.14)
North America > United States > Michigan (0.04)
North America > Canada (0.04)

Industry: Education (0.88)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.70)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.49)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.48)


A Model-Based Reinforcement Learning with Adversarial Training for Online Recommendation

Xueying Bai, Jian Guan, Hongning Wang

Neural Information Processing Systems | Aug-20-2025, 07:05:45 GMT

Neural Information Processing Systems http://nips.cc/

learning, offline data, recommendation, (15 more...)

Neural Information Processing Systems

Country:

North America > United States > Virginia (0.04)
North America > United States > New York > Suffolk County > Stony Brook (0.04)
North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.04)
Asia > Myanmar > Tanintharyi Region > Dawei (0.04)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.32)


the value of generative adversarial training for model-based reinforcement learning (RL) with offline data, especially

Neural Information Processing Systems | Aug-20-2025, 07:05:30 GMT

First, we sincerely thank all reviewers for their thoughtful comments and suggestions. We will report the variance and statistical significance of our empirical results in our revision. These shed light on the approach's effectiveness as an online recommender. These two factors help control bias in value estimation for model-based RL. Please refer to Lines 9-15 for our responses to possible new empirical evaluations.

adversarial training, estimation, offline data, (15 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.52)


Finite-Time Performance Bounds and Adaptive Learning Rate Selection for Two Time-Scale Reinforcement Learning

Harsh Gupta, R. Srikant, Lei Ying

Neural Information Processing Systems | Aug-20-2025, 06:52:28 GMT

Neural Information Processing Systems http://nips.cc/

algorithm, approximation, stochastic approximation algorithm, (12 more...)

Neural Information Processing Systems

Country:

North America > United States > Illinois (0.05)
North America > United States > Michigan > Washtenaw County > Ann Arbor (0.04)
North America > United States > Massachusetts > Middlesex County > Belmont (0.04)
North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.04)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)


e354fd90b2d5c777bfec87a352a18976-AuthorFeedback.pdf

Neural Information Processing Systems | Aug-20-2025, 06:52:12 GMT

algorithm, theoretical result, time-scale algorithm, (15 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.30)


Generalization in Reinforcement Learning with Selective Noise Injection and Information Bottleneck

Maximilian Igl, Kamil Ciosek, Yingzhen Li, Sebastian Tschiatschek, Cheng Zhang, Sam Devlin, Katja Hofmann

Neural Information Processing Systems | Aug-20-2025, 06:31:31 GMT

Neural Information Processing Systems http://nips.cc/

international conference, learning, regularization technique, (12 more...)

Neural Information Processing Systems

Country:

Europe > France (0.04)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
North America > United States > California > San Diego County > San Diego (0.04)
(7 more...)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.87)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.47)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.46)


Characterizing the Exact Behaviors of Temporal Difference Learning Algorithms Using Markov Jump Linear System Theory

Bin Hu, Usman Syed

Neural Information Processing Systems | Aug-20-2025, 06:30:14 GMT

Neural Information Processing Systems http://nips.cc/

artificial intelligence, machine learning, reinforcement learning, (17 more...)

Neural Information Processing Systems

Country: North America (0.28)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)


sponse addressing one common point raised by Reviewer 1 and Reviewer 3 regarding how to handle the case where 2 null

Neural Information Processing Systems | Aug-20-2025, 06:29:57 GMT

We thank all the reviewers for their careful feedback and will revise our paper accordingly. Such a fact is presented in the classic paper "An analysis of temporal-difference learning with function ... Similar facts can be found for other TD algorithms (e.g. ... Reviewer 1 is correct in that a discount factor is needed. Now we address specific reviewer comments below. A reference for this is the classic paper "An ... Finally, the "-" sign in Line 213 is due to the Hurwitz assumption.

assumption, reviewer 1, reviewer 3, (12 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)


Disentangled behavioural representations

Amir Dezfouli, Hassan Ashtiani, Omar Ghattas, Richard Nock, Peter Dayan, Cheng Soon Ong

Neural Information Processing Systems | Aug-20-2025, 06:08:20 GMT

Unfortunately, the parameter and latent activity spaces of RNNs are generally high-dimensional and uninterpretable, making it hard to use them to study individual differences.

probability, representation, sequence, (16 more...)

Neural Information Processing Systems

Country:

North America > United States > Illinois > Cook County > Chicago (0.04)
North America > United States > California > San Diego County > San Diego (0.04)
North America > Canada > Ontario > Hamilton (0.04)
(3 more...)

Genre: Research Report > Experimental Study (0.46)

Industry:

Health & Medicine > Therapeutic Area > Neurology (0.68)
Health & Medicine > Therapeutic Area > Psychiatry/Psychology (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Cognitive Science (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.70)


A neurally plausible model learns successor representations in partially observable environments

Eszter Vértes, Maneesh Sahani

Neural Information Processing Systems | Aug-20-2025, 05:53:52 GMT

However, it is not clear how such representations might be learned and computed in partially observed, noisy environments. Here, we introduce a neurally plausible model using distributional successor features, which builds on the distributed distributional code for the representation and computation of uncertainty, and which allows for efficient value function computation in partially observed environments via the successor representation.

representation, successor representation, value function, (17 more...)

Neural Information Processing Systems

Country:

North America > United States > California > San Francisco County > San Francisco (0.14)
Asia > Middle East > Jordan (0.04)
North America > Canada (0.04)
Europe > United Kingdom (0.04)

Industry: Health & Medicine > Therapeutic Area > Neurology (0.68)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.48)
