AITopics | Reinforcement Learning

Collaborating Authors

Reinforcement Learning

"Reinforcement learning is learning what to do – how to map situations to actions – so as to maximize a numerical reward signal. The learner is not told which actions to take, as in most forms of machine learning, but instead must discover which actions yield the most reward by trying them."
– Sutton, Richard S. and Andrew G. Barto. Reinforcement Learning: An Introduction. (1.1). MIT Press, Cambridge, MA, 1998.

News Overviews Instructional Materials AI-Alerts Classics

a39ab46bf619ada0e90ceed846648a81-Supplemental-Conference.pdf

Neural Information Processing SystemsFeb-16-2026, 07:00:29 GMT

artificial intelligence, machine learning, reinforcement learning, (15 more...)

Neural Information Processing Systems

Country:

Europe > Switzerland > Zürich > Zürich (0.04)
North America > United States > Washington > King County > Seattle (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
(2 more...)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.68)

Add feedback

Robust Knowledge Transfer in Tiered RL

Neural Information Processing SystemsFeb-16-2026, 07:00:25 GMT

We identify a natural and necessary condition called the "Optimal

artificial intelligence, machine learning, reinforcement learning, (16 more...)

Neural Information Processing Systems

Country:

Europe > Switzerland > Zürich > Zürich (0.04)
North America > United States > Washington > King County > Seattle (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
(2 more...)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.49)

Add feedback

a3621ee907def47c1b952ade25c67698-Paper-Conference.pdf

Neural Information Processing SystemsFeb-16-2026, 06:37:55 GMT

This paper explores the potential of building scalable techniques to facilitate autonomous cooperation among communicative agents, and provides insight into their "cognitive" processes. To address the challenges of achieving autonomous cooperation, we propose a novel communicative agent framework named roleplaying .

arxiv preprint arxiv, large language model, machine learning, (20 more...)

Neural Information Processing Systems

Country:

North America > United States > Washington > King County > Seattle (0.04)
North America > Dominican Republic (0.04)
Asia > Middle East > UAE > Abu Dhabi Emirate > Abu Dhabi (0.04)
Asia > Middle East > Jordan (0.04)

Industry: Leisure & Entertainment > Games (0.67)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
(3 more...)

Add feedback

Learning to Assist Humans without Inferring Rewards

Neural Information Processing SystemsFeb-16-2026, 06:36:51 GMT

Assistive agents should make humans' lives easier.

artificial intelligence, machine learning, reinforcement learning, (15 more...)

Neural Information Processing Systems

Country: Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)

Genre: Research Report > Experimental Study (0.93)

Industry: Information Technology (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.93)
Information Technology > Artificial Intelligence > Robots (0.69)

Add feedback

A Proof Proof of Proposition 4.2 Proposition 4.2 The performance gap of evaluating policy profile (π, µ) and (π, π

Neural Information Processing SystemsFeb-16-2026, 06:36:44 GMT

Proof of Theorem 4.7 We first prove a Lemma. Theorem A.2. (Theorem 1 in [36]) Let ϵ = max Theorem 4.7 In a two-player game, suppose that According to Theorem A.2, we have J ( π, µ) J ( π, α) E CQL [20] puts regularization on the learning of Q function to penalize out-of-distribution actions. The CSP algorithm is illustrated in Algorithm 1. The proxy model is trained adversarially against our agent, therefore, we set the proxy's reward function to be the negative of our agent's reward. We show experiment details of the Maze example in this section.

artificial intelligence, machine learning, reinforcement learning, (18 more...)

Neural Information Processing Systems

Country: North America > United States > Texas (0.04)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.69)

Add feedback

a31253f4871694f09541122d6b6f5ad1-Paper-Conference.pdf

Neural Information Processing SystemsFeb-16-2026, 06:36:41 GMT

artificial intelligence, machine learning, reinforcement learning, (17 more...)

Neural Information Processing Systems

Country:

North America > United States > Texas (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
North America > United States > Virginia > Arlington County > Arlington (0.04)
(5 more...)

Genre: Research Report > New Finding (0.46)

Industry: Leisure & Entertainment > Games (1.00)

Technology:

Information Technology > Game Theory (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (0.95)

Add feedback

We introduce a novel framework for analyzing reinforcement learning (RL) in continuous state-action spaces, and use it to prove fast rates of convergence in both

Neural Information Processing SystemsFeb-16-2026, 06:18:22 GMT

We argue that these properties are satisfied in many continuous state-action Markov decision processes.

artificial intelligence, machine learning, reinforcement learning, (16 more...)

Neural Information Processing Systems

Country:

Asia > Middle East > Jordan (0.04)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
North America > United States > Florida > Palm Beach County > Boca Raton (0.04)

Genre:

Research Report > Experimental Study (1.00)
Research Report > New Finding (0.67)

Industry:

Information Technology (0.67)
Leisure & Entertainment (0.45)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
(2 more...)

Add feedback

The Ladder in Chaos: Improving Policy Learning by Harnessing the Parameter Evolving Path in A Low-dimensional Space Hongyao Tang

Neural Information Processing SystemsFeb-16-2026, 06:04:18 GMT

Deep Reinforcement Learning (DRL) is far from well understood, although its great potential has been demonstrated with a lot of achievements in different practical problems [Badia et al., 2020, Shah et al., 2022, Fawzi et al., 2022, Degrave et al., 2022, OpenAI, 2022]. Consistent efforts are made to gain a better understanding of the learning dynamics of RL agents.

artificial intelligence, machine learning, reinforcement learning, (16 more...)

Neural Information Processing Systems

Country: