AITopics | Reinforcement Learning

Reinforcement Learning

"Reinforcement learning is learning what to do – how to map situations to actions – so as to maximize a numerical reward signal. The learner is not told which actions to take, as in most forms of machine learning, but instead must discover which actions yield the most reward by trying them."
– Sutton, Richard S. and Andrew G. Barto. Reinforcement Learning: An Introduction. (1.1). MIT Press, Cambridge, MA, 1998.

Learning to Dispatch for Job Shop Scheduling via Deep Reinforcement Learning

Cong Zhang, Wen Song

Neural Information Processing Systems, Oct-2-2025, 02:53:11 GMT

In the paper, we adopt the Proximal Policy Optimization (PPO) algorithm [36] to train our agent. Here we provide details of our algorithm in terms of pseudo code, as shown in Algorithm 1. [...] In this section, we show how the baseline PDRs compute the priority index for the operations. Here we present the complete results on Taillard's benchmark. In Table S.1, we report the results of [...] In Table S.2, we report the generalization performance of our policies trained on [...] The "UB" column is the best solution from [...] A similar conclusion can be drawn from the results on the DMU benchmark. In Table S.3, we report results [...] In Table S.4, which focuses on [...] We show training curves for all problems in Figure 1.
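The excerpt names PPO but the paper's Algorithm 1 is not reproduced on this page. As a rough illustration only, the sketch below computes the standard PPO clipped surrogate objective for a batch of samples. The function name `ppo_clip_objective`, its argument names, and the clipping value `clip_eps=0.2` are assumptions made for this example and are not taken from the paper.

```python
import math


def ppo_clip_objective(new_logps, old_logps, advantages, clip_eps=0.2):
    """Mean clipped surrogate objective (to be maximized).

    new_logps / old_logps: log-probabilities of the taken actions under
    the current and the data-collecting policy (hypothetical inputs).
    advantages: advantage estimates for the same samples.
    clip_eps: clipping range; 0.2 is an assumed, commonly used value.
    """
    total = 0.0
    for new_lp, old_lp, adv in zip(new_logps, old_logps, advantages):
        # Probability ratio between the new and the old policy.
        ratio = math.exp(new_lp - old_lp)
        # Ratio restricted to the interval [1 - eps, 1 + eps].
        clipped = max(1.0 - clip_eps, min(1.0 + clip_eps, ratio))
        # Pessimistic choice between unclipped and clipped terms.
        total += min(ratio * adv, clipped * adv)
    return total / len(advantages)


# Identical policies: the objective reduces to the mean advantage.
same = ppo_clip_objective([-1.0, -2.0], [-1.0, -2.0], [1.0, 3.0])
# Ratio of 2 with a positive advantage is capped at 1 + clip_eps.
capped = ppo_clip_objective([math.log(2.0)], [0.0], [1.0])
# Ratio of 0.5 with a negative advantage uses the clipped term.
floored = ppo_clip_objective([math.log(0.5)], [0.0], [-1.0])
```

Taking the minimum of the two terms removes any incentive to move the probability ratio outside the clipping interval, which is what keeps each policy update small.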

artificial intelligence, machine learning, reinforcement learning, cong zhang, (9 more...)

Country: Asia (0.14)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.86)

11958dfee29b6709f48a9ba0387a2431-Paper.pdf

Neural Information Processing Systems, Oct-2-2025, 02:53:05 GMT

artificial intelligence, machine learning, reinforcement learning, (16 more...)

Country: Asia (0.28)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Planning & Scheduling (0.70)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.69)
Information Technology > Artificial Intelligence > Robots (0.68)

Adaptive Auxiliary Task Weighting for Reinforcement Learning

Xingyu Lin, Harjatin Baweja, George Kantor, David Held

Neural Information Processing Systems, Oct-2-2025, 02:20:51 GMT

Neural Information Processing Systems http://nips.cc/

auxiliary task, machine learning, reinforcement learning, (13 more...)

Country: North America > United States (0.68)

Industry: Government (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Generalized Off-Policy Actor-Critic

Shangtong Zhang, Wendelin Boehmer, Shimon Whiteson

Neural Information Processing Systems, Oct-2-2025, 01:50:44 GMT

artificial intelligence, machine learning, reinforcement learning, (12 more...)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

A Unifying View of Optimism in Episodic Reinforcement Learning

Neural Information Processing Systems, Oct-2-2025, 01:41:42 GMT

In this paper, we provide a new framework for studying this class of algorithms. Optimistic algorithms are built upon the principle of "optimism in the face of uncertainty" (OFU).
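The OFU principle is easiest to see in a multi-armed bandit, which is a much simpler setting than the episodic MDPs the paper studies. The sketch below is the classic UCB1 rule, swapped in here purely as an illustration and not the paper's framework. The helper names `ucb1` and `bernoulli`, the arm probabilities, the horizon, and the seed are all assumptions made for this example.

```python
import math
import random


def ucb1(reward_fns, horizon, seed=0):
    """Run UCB1 for `horizon` rounds and return pull counts and mean rewards.

    reward_fns: one callable per arm; each takes a random.Random and
    returns a reward in [0, 1] (hypothetical interface for this sketch).
    """
    rng = random.Random(seed)
    n_arms = len(reward_fns)
    counts = [0] * n_arms
    means = [0.0] * n_arms
    for t in range(1, horizon + 1):
        if t <= n_arms:
            # Pull every arm once so each bonus term is defined.
            arm = t - 1
        else:
            # Optimism: act greedily on mean plus an uncertainty bonus
            # that shrinks as an arm is pulled more often.
            arm = max(
                range(n_arms),
                key=lambda a: means[a] + math.sqrt(2.0 * math.log(t) / counts[a]),
            )
        reward = reward_fns[arm](rng)
        counts[arm] += 1
        # Incremental update of the empirical mean reward.
        means[arm] += (reward - means[arm]) / counts[arm]
    return counts, means


def bernoulli(p):
    """Arm that pays 1.0 with probability p and 0.0 otherwise."""
    return lambda rng: 1.0 if rng.random() < p else 0.0


counts, means = ucb1([bernoulli(0.2), bernoulli(0.5), bernoulli(0.8)], horizon=2000)
print(counts)
```

Arms that have been tried only a few times carry a large bonus, so they are revisited until the data shows they are worse; this is how optimism drives exploration without any explicit randomization in the action choice.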

artificial intelligence, machine learning, reinforcement learning, (12 more...)

Country:

North America > United States (0.46)
Europe > United Kingdom > England (0.28)

Industry: Education (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

Appendix B, we provide sufficient conditions for Assumption 1 that were mentioned in the main

Neural Information Processing Systems, Oct-2-2025, 01:11:31 GMT

In Appendix A we introduce some basic definitions that are needed for our theoretical results. In Appendix C and Appendix D we prove the error bounds for PPI and PQI. [...] All the other dynamics are preserved. Rewards are 0 for the absorbing action and unchanged elsewhere. [...] Algorithm 1 and 2. As some of the notation is actually a function of the MDP, we clarify the usage [...] Recall the definition of the semi-norm of any function of state-action pairs.

artificial intelligence, machine learning, reinforcement learning, (19 more...)

Genre: Workflow (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.46)

Provably Good Batch Reinforcement Learning Without Great Exploration

Neural Information Processing Systems, Oct-2-2025, 01:11:24 GMT

Batch reinforcement learning (RL) is important for applying RL algorithms to many high-stakes tasks.

artificial intelligence, machine learning, reinforcement learning, (17 more...)

Country: North America > Canada (0.28)

Industry: Health & Medicine (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

Language as an Abstraction for Hierarchical Deep Reinforcement Learning

YiDing Jiang, Shixiang (Shane) Gu, Kevin P. Murphy, Chelsea Finn

Neural Information Processing Systems, Oct-2-2025, 01:07:58 GMT

Solving complex, temporally-extended tasks is a long-standing problem in reinforcement learning (RL).

artificial intelligence, machine learning, reinforcement learning, (15 more...)

Country: North America (0.28)

Genre: Research Report (0.68)

Industry: Leisure & Entertainment > Games (0.67)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Conservative Q-Learning for Offline Reinforcement Learning

Aviral Kumar

Neural Information Processing Systems, Oct-2-2025, 00:59:14 GMT

Effectively leveraging large, previously collected datasets in reinforcement learning (RL) is a key challenge for large-scale real-world applications. Offline RL algorithms promise to learn effective policies from previously-collected, static datasets without further interaction.

artificial intelligence, machine learning, reinforcement learning, (15 more...)

Genre: Research Report > New Finding (0.46)

Industry: Leisure & Entertainment > Games > Computer Games (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
