AITopics | Reinforcement Learning

Reinforcement Learning

"Reinforcement learning is learning what to do – how to map situations to actions – so as to maximize a numerical reward signal. The learner is not told which actions to take, as in most forms of machine learning, but instead must discover which actions yield the most reward by trying them."
– Sutton, Richard S. and Andrew G. Barto. Reinforcement Learning: An Introduction. (1.1). MIT Press, Cambridge, MA, 1998.

6dc02cf4905e873ca6fd0dfc7907e230-Paper-Conference.pdf

Neural Information Processing Systems, Aug-15-2025, 16:31:32 GMT

artificial intelligence, machine learning, reinforcement learning, (11 more...)

Country:

North America > United States > Texas > Travis County > Austin (0.04)
North America > United States > Massachusetts > Middlesex County > Belmont (0.04)

Genre: Research Report (0.67)

Industry: Information Technology (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.67)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)

Rewriting History with Inverse RL: Hindsight Inference for Policy Improvement
Benjamin Eysenbach

Neural Information Processing Systems, Aug-15-2025, 16:31:18 GMT

Several prior works have found that relabeling past experience with different reward functions can improve sample efficiency.
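The general idea of relabeling can be illustrated with a minimal sketch. It only shows how stored transitions can be re-scored under a different reward function; it is not the paper's inverse RL procedure, and the names `relabel` and `reach`, the tuple layout, and the toy integer states are all assumptions made for illustration.

```python
def relabel(trajectory, reward_fn):
    """Return a new trajectory whose rewards are recomputed with reward_fn.

    Each transition is a (state, action, reward, next_state) tuple
    (an assumed layout, chosen for this sketch).
    """
    return [
        (s, a, reward_fn(s, a, s_next), s_next)
        for (s, a, _old_reward, s_next) in trajectory
    ]


def reach(goal):
    """Hypothetical sparse reward: 1.0 when the next state equals the goal."""
    return lambda s, a, s_next: 1.0 if s_next == goal else 0.0


# Experience collected for some other task, so every stored reward is 0.0.
trajectory = [(0, 1, 0.0, 1), (1, 1, 0.0, 2), (2, 1, 0.0, 3)]

# The same experience, re-scored as if the task had been "reach state 3".
relabeled = relabel(trajectory, reach(3))
```

The stored states and actions are reused unchanged; only the reward column differs, which is why relabeling needs no extra environment interaction.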

inverse rl, reward function, trajectory, (14 more...)

Country:

North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)
North America > Canada (0.04)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

Evaluation of Human-AI Teams for Learned and Rule-Based Agents in Hanabi
Ho Chit Siu, Jaime D. Peña, Yutai Zhou, Edenna Chen, Victor J. Lopez, Kyle Palko, Kimberlee C. Chang, Ross E. Allen

Neural Information Processing Systems, Aug-15-2025, 16:18:51 GMT

Hanabi is a cooperative card game in which two to five players attempt to stack twenty-five cards into five different fireworks (piles), one for each suit (color) and by ascending rank (number).

agent, participant, teammate, (13 more...)

Country: North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)

Genre:

Research Report > New Finding (1.00)
Questionnaire & Opinion Survey (0.69)
Research Report > Experimental Study > Negative Result (0.46)

Industry:

Leisure & Entertainment > Games > Computer Games (1.00)
Government > Regional Government > North America Government > United States Government (1.00)
Government > Military (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.97)

Learning Diverse Policies in MOBA Games via Macro-Goals

Neural Information Processing Systems, Aug-15-2025, 16:17:33 GMT

Recently, many researchers have made progress in building AI systems that play MOBA games with deep reinforcement learning, for example on Dota 2 and Honor of Kings.

artificial intelligence, machine learning, reinforcement learning, (15 more...)

Country:

Asia > China > Sichuan Province > Chengdu (0.04)
Asia > China > Guangdong Province > Shenzhen (0.04)

Industry: Leisure & Entertainment > Games > Computer Games (0.68)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

6ceb6c2150bbf46fd75528a6cd6be793-Paper-Conference.pdf

Neural Information Processing Systems, Aug-15-2025, 15:58:47 GMT

batch size, behavior policy, proximal policy, (14 more...)

Country: Asia > Middle East > Jordan (0.05)

Genre: Research Report > New Finding (0.68)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (0.69)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.47)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.47)

Wasserstein Flow Meets Replicator Dynamics: A Mean-Field Analysis of Representation Learning in Actor-Critic

Neural Information Processing Systems, Aug-15-2025, 15:38:56 GMT

Actor-critic (AC) algorithms, empowered by neural networks, have had significant empirical success in recent years.

artificial intelligence, machine learning, reinforcement learning, (14 more...)

Country:

North America > United States (0.14)
Asia > Middle East > Jordan (0.04)

Industry: Leisure & Entertainment > Games (0.67)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.93)

85a4413ecea7122bcc399cf0a53bba26-Paper.pdf

Neural Information Processing Systems, Aug-15-2025, 15:38:53 GMT

artificial intelligence, machine learning, reinforcement learning, (15 more...)

Country:

North America > United States (0.14)
Asia > Middle East > Jordan (0.04)

Industry: Leisure & Entertainment > Games (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.98)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.93)

On the Sample Complexity of Stabilizing LTI Systems on a Single Trajectory

Neural Information Processing Systems, Aug-15-2025, 15:38:42 GMT

Stabilizing an unknown dynamical system is one of the central problems in control theory. In this paper, we study the sample complexity of the learn-to-stabilize problem in Linear Time-Invariant (LTI) systems on a single trajectory. Current state-of-the-art approaches require a sample complexity linear in n, the state dimension, which incurs a state norm that blows up exponentially in n. We propose a novel algorithm based on spectral decomposition that only needs to learn "a small part" of the dynamical matrix acting on its unstable subspace. We show that, under proper assumptions, our algorithm stabilizes an LTI system on a single trajectory with O(k log n) samples, where k is the instability index of the system. This represents the first sub-linear sample complexity result for the stabilization of LTI systems under the regime when k = o(n).
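The quantities named in the abstract can be written out as follows. This is a sketch under assumptions: the listing defines neither the system model nor the instability index, so the block assumes the usual discrete-time formulation and takes the instability index to be the number of eigenvalues of the dynamics matrix outside the unit circle. The symbols A, B, x_t, u_t and \lambda_i are notation introduced here.

```latex
% Assumed discrete-time LTI model with state dimension n
x_{t+1} = A x_t + B u_t, \qquad A \in \mathbb{R}^{n \times n}

% Assumed definition of the instability index: number of unstable eigenvalues of A
k = \bigl|\{\, i : |\lambda_i(A)| > 1 \,\}\bigr|

% Sample complexity stated in the abstract, in the regime k = o(n);
% prior approaches are described there as linear in n
O(k \log n)
```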

arxiv preprint arxiv, controller, matrix, (13 more...)

Country:

North America > United States > California (0.04)
North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)
North America > United States > New Jersey (0.04)
(3 more...)

Genre: Research Report (0.66)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Control Systems (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.46)

Contrastive Reinforcement Learning of Symbolic Reasoning Domains

Neural Information Processing Systems, Aug-15-2025, 15:33:23 GMT

Contrastive Policy Learning (ConPoLe) explicitly optimizes the InfoNCE loss, which lower-bounds the mutual information between the current state and the next states that continue on a path to the solution.
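For readers unfamiliar with InfoNCE, here is a minimal sketch of the loss for a single example, assuming each candidate next state has already been given a scalar compatibility score with the current state. The function name `info_nce_loss` and the scoring setup are hypothetical and not taken from the paper; the sketch shows only the generic loss, not the ConPoLe training procedure.

```python
import math


def info_nce_loss(positive_score, negative_scores):
    """InfoNCE loss for one example.

    Computes -log( exp(pos) / sum over all candidates of exp(score) ),
    i.e. the cross-entropy of picking the positive among all candidates.
    """
    scores = [positive_score] + list(negative_scores)
    top = max(scores)  # subtract the max before exponentiating, for stability
    log_denominator = top + math.log(sum(math.exp(s - top) for s in scores))
    return log_denominator - positive_score


# With uninformative (equal) scores, the loss equals log(number of candidates).
uniform_loss = info_nce_loss(0.0, [0.0, 0.0, 0.0])

# A positive scored well above the negatives drives the loss toward zero.
confident_loss = info_nce_loss(10.0, [-10.0, -10.0, -10.0])
```

In the standard analysis of InfoNCE, the log of the number of candidates minus the expected loss is a lower bound on the mutual information, so lowering the loss tightens the bound.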

logic & formal reasoning, machine learning, reinforcement learning, (20 more...)

Country:

North America > United States > California > Santa Clara County > Palo Alto (0.04)
North America > United States > District of Columbia (0.04)
Europe > Belgium > Wallonia > Namur Province > Namur (0.04)

Genre: Instructional Material > Course Syllabus & Notes (0.46)

Industry: Education > Educational Technology > Educational Software > Computer Based Training (0.70)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Logic & Formal Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

When to Ask for Help: Proactive Interventions in Autonomous Reinforcement Learning

Neural Information Processing Systems, Aug-15-2025, 15:32:14 GMT

A long-term goal of reinforcement learning is to design agents that can autonomously interact and learn in the world. A critical challenge to such autonomy is the presence of irreversible states which require external assistance to recover from, such as when a robot arm has pushed an object off of a table.

agent, intervention, irreversible state, (13 more...)

Country:

North America > United States > California > Santa Clara County > Palo Alto (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre: Research Report (0.46)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
