AITopics | Reinforcement Learning

Reinforcement Learning

"Reinforcement learning is learning what to do – how to map situations to actions – so as to maximize a numerical reward signal. The learner is not told which actions to take, as in most forms of machine learning, but instead must discover which actions yield the most reward by trying them."
– Sutton, Richard S. and Andrew G. Barto. Reinforcement Learning: An Introduction. (1.1). MIT Press, Cambridge, MA, 1998.

K-level Reasoning for Zero-Shot Coordination in Hanabi

Neural Information Processing Systems | Aug-14-2025, 06:33:20 GMT

Work done while at Facebook AI Research. 35th Conference on Neural Information Processing Systems (NeurIPS 2021). Figure 1: Visualization of various hierarchical training schemas, including sequential KLR, synchronous KLR, synchronous CH, and our new SyKLRBR for 4 levels.

artificial intelligence, machine learning, reinforcement learning, (18 more...)

Country:

North America > United States > Louisiana > Orleans Parish > New Orleans (0.04)
North America > United States > California > San Diego County > San Diego (0.04)
North America > United States > Arizona > Maricopa County > Phoenix (0.04)
(5 more...)

Genre: Research Report > New Finding (0.68)

Industry: Leisure & Entertainment > Games (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.71)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents > Agent Societies (0.46)

_NeurIPS22CloserLook (1)

yuweifu

Neural Information Processing Systems | Aug-14-2025, 06:19:56 GMT

agent, arxiv preprint arxiv, representation, (12 more...)

Country: North America > Canada > Quebec > Montreal (0.04)

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Robots (0.93)

3819dd04c2c87bf0d1deea1740ef0ad5-Paper-Conference.pdf

Neural Information Processing Systems | Aug-14-2025, 06:06:56 GMT

arxiv preprint arxiv, frequency, reinforcement learning, (9 more...)

Country:

North America > United States > Illinois > Champaign County > Urbana (0.04)
North America > United States > Florida > Broward County > Fort Lauderdale (0.04)
Europe > United Kingdom > England > Greater London > London (0.04)
Europe > France > Occitanie > Haute-Garonne > Toulouse (0.04)

Genre: Research Report (0.46)

Industry:

Health & Medicine (0.46)
Government (0.46)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

372593bd318ad8b34b3a8da77e20272b-Supplemental-Conference.pdf

Neural Information Processing Systems | Aug-14-2025, 05:41:29 GMT

demonstration, exp, lobsdice, (15 more...)

Country: North America > United States > California > Alameda County > Berkeley (0.04)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.94)

LobsDICE: Offline Learning from Observation via Stationary Distribution Correction Estimation

Neural Information Processing Systems | Aug-14-2025, 05:41:25 GMT

We additionally assume that the agent cannot interact with the environment but has access to the action-labeled transition data collected by some agents with unknown qualities.

algorithm, demonstration, stationary distribution, (14 more...)

Country: North America > United States > California > Alameda County > Berkeley (0.04)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Robots (0.96)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.93)

A Provably Efficient Sample Collection Strategy for Reinforcement Learning

Neural Information Processing Systems | Aug-14-2025, 05:18:51 GMT

One of the challenges in online reinforcement learning (RL) is that the agent needs to trade off the exploration of the environment and the exploitation of the samples to optimize its behavior. Whether we optimize for regret, sample complexity, state-space coverage or model estimation, we need to strike a different exploration-exploitation trade-off.
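The exploration-exploitation trade-off mentioned in this abstract can be illustrated with a toy example. The sketch below is not the method of the listed paper; it is a minimal epsilon-greedy agent on a Bernoulli multi-armed bandit, a standard textbook setting. The function name run_epsilon_greedy and its parameters (true_means, epsilon, steps, seed) are hypothetical names chosen for illustration.

```python
import random


def run_epsilon_greedy(true_means, epsilon, steps, seed=0):
    """Play a Bernoulli bandit with an epsilon-greedy rule.

    true_means: assumed success probability of each arm (hidden from the agent).
    epsilon:    probability of picking a uniformly random arm (exploration).
    Returns (estimates, counts, total_reward).
    """
    rng = random.Random(seed)
    n_arms = len(true_means)
    counts = [0] * n_arms        # number of pulls per arm
    estimates = [0.0] * n_arms   # running mean reward per arm
    total_reward = 0.0

    for _ in range(steps):
        if rng.random() < epsilon:
            # Explore: try an arm regardless of current estimates.
            arm = rng.randrange(n_arms)
        else:
            # Exploit: pick the arm with the highest estimate (ties go to the lowest index).
            arm = max(range(n_arms), key=lambda a: estimates[a])

        # Sample a 0/1 reward from the chosen arm.
        reward = 1.0 if rng.random() < true_means[arm] else 0.0

        # Incremental update of the running mean for that arm.
        counts[arm] += 1
        estimates[arm] += (reward - estimates[arm]) / counts[arm]
        total_reward += reward

    return estimates, counts, total_reward


if __name__ == "__main__":
    for eps in (0.0, 0.1, 1.0):
        est, counts, total = run_epsilon_greedy([0.2, 0.8], eps, steps=1000)
        print(f"epsilon={eps}: pulls={counts}, total reward={total}")
```

With epsilon set to 0 and arms [0.0, 1.0], this agent never leaves the first arm, because its estimate never rises above the untried arm's initial value of 0; a nonzero epsilon gives it a chance to discover the better arm at the cost of some random pulls. Different objectives (regret, sample complexity, coverage) would call for different settings of that balance.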

artificial intelligence, machine learning, reinforcement learning, (18 more...)

Country: North America > United States (0.46)

Genre: Instructional Material (0.34)

Industry: Energy > Oil & Gas > Upstream (0.34)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (1.00)

A Provably Efficient Sample Collection Strategy for Reinforcement Learning

Neural Information Processing Systems | Aug-14-2025, 05:18:47 GMT

artificial intelligence, machine learning, reinforcement learning, (11 more...)

Country: North America > United States (0.28)

Genre: Instructional Material (0.34)

Industry: Energy > Oil & Gas > Upstream (0.35)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.69)

CEIP: Combining Explicit and Implicit Priors for Reinforcement Learning with Demonstrations

Neural Information Processing Systems | Aug-14-2025, 04:33:52 GMT

Although reinforcement learning has found widespread use in dense reward settings, training autonomous agents with sparse rewards remains challenging.

dataset, demonstration, task-specific dataset, (14 more...)

Country: North America > United States > Illinois > Champaign County > Urbana (0.04)

Genre: Research Report (0.93)

Technology: