AITopics | Reinforcement Learning

Reinforcement Learning

"Reinforcement learning is learning what to do – how to map situations to actions – so as to maximize a numerical reward signal. The learner is not told which actions to take, as in most forms of machine learning, but instead must discover which actions yield the most reward by trying them."
– Sutton, Richard S. and Andrew G. Barto. Reinforcement Learning: An Introduction. (1.1). MIT Press, Cambridge, MA, 1998.

60106888f8977b71e1f15db7bc9a88d1-Paper.pdf

Neural Information Processing Systems, Aug-14-2025, 19:03:24 GMT

international conference, learning, reinforcement learning, (16 more...)

Country:

North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
North America > United States > California > Santa Clara County > Palo Alto (0.04)
North America > Canada > Quebec > Montreal (0.04)
(2 more...)

Industry: Education (0.68)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.72)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (0.46)

Replacing Rewards with Examples: Example-Based Policy Search via Recursive Classification

Neural Information Processing Systems, Aug-14-2025, 19:03:02 GMT

Reinforcement learning (RL) algorithms assume that users specify tasks by manually writing down a reward function.

classifier, example-based control, reward function, (15 more...)

Country:

North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)
North America > United States > Illinois > Cook County > Chicago (0.04)
Asia > Middle East > Jordan (0.04)

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

5fd2c06f558321eff612bbbe455f6fbd-Paper.pdf

Neural Information Processing Systems, Aug-14-2025, 18:57:27 GMT

arxiv preprint arxiv, learning, reinforcement learning, (13 more...)

Country: North America > United States > California > Santa Clara County > Palo Alto (0.04)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

5f7f02b7e4ade23430f345f954c938c1-Supplemental.pdf

Neural Information Processing Systems, Aug-14-2025, 18:42:44 GMT

action entropy, arxiv preprint arxiv, eddict, (14 more...)

Country: Oceania > Australia > New South Wales > Sydney (0.04)

Industry: Leisure & Entertainment > Games (0.31)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.51)

5f7f02b7e4ade23430f345f954c938c1-Paper.pdf

Neural Information Processing Systems, Aug-14-2025, 18:42:40 GMT

eddict, learning, reinforcement learning, (11 more...)

Genre: Research Report (0.46)

Industry:

Leisure & Entertainment > Games (0.46)
Education (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.93)

Supplementary Material for HandMeThat: Human-Robot Communication in Physical and Social Environments Yanming Wan

Neural Information Processing Systems, Aug-14-2025, 18:21:31 GMT

In Section A, we provide detailed information on HandMeThat data generation and its textual interface. In Section B, we summarize the statistics of the dataset. Recall that HandMeThat uses an object-centric representation for states. "Location" consists of all non-movable entities. Each class (except for "location") is composed of multiple subclasses, and each subclass contains [...] In total, there are 155 object categories. Each object category is also associated with several attributes.

agent, category, dataset, (15 more...)

Country: North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)

Genre: Workflow (0.67)

Industry: Consumer Products & Services (0.68)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Object-Oriented Architecture (0.54)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.46)
Information Technology > Artificial Intelligence > Robots > Humanoid Robots (0.41)

Value Function Decomposition for Iterative Design of Reinforcement Learning Agents

Neural Information Processing Systems, Aug-14-2025, 18:20:49 GMT

Despite these successes, applying RL techniques to complex control problems remains a daunting undertaking, where initial attempts often result in underwhelming performance.

agent, reward component, value decomposition, (13 more...)

Country:

North America > United States > Texas > Travis County > Austin (0.04)
Asia > Japan > Honshū > Chūbu > Toyama Prefecture > Toyama (0.04)

Industry: Leisure & Entertainment (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

Appendix: Continuous Doubly Constrained Batch Reinforcement Learning A Experiment Details Evaluation Procedure

Neural Information Processing Systems, Aug-14-2025, 17:50:09 GMT

Since the Bellman-evaluation operator is also a contraction under standard conditions [3, 8, 31], our overall argument remains otherwise intact. D.2 Proof of Theorem 2.

behavior policy, cdc, dataset, (11 more...)

Country: