AITopics | Reinforcement Learning

Reinforcement Learning

"Reinforcement learning is learning what to do – how to map situations to actions – so as to maximize a numerical reward signal. The learner is not told which actions to take, as in most forms of machine learning, but instead must discover which actions yield the most reward by trying them."
– Sutton, Richard S. and Andrew G. Barto. Reinforcement Learning: An Introduction. (1.1). MIT Press, Cambridge, MA, 1998.
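
The trial-and-error idea in the quotation can be illustrated with a small sketch. It is not taken from the book: it assumes a multi-armed bandit with Gaussian rewards and an epsilon-greedy rule, and the names `run_bandit`, `true_means`, and `epsilon` are hypothetical, chosen only for this illustration.

```python
import random

def run_bandit(true_means, steps=2000, epsilon=0.1, seed=0):
    """Epsilon-greedy learner on a Gaussian bandit (illustrative assumption).

    The learner is never told which arm is best; it only observes the
    reward of the arm it tried and keeps a running average per arm.
    """
    rng = random.Random(seed)
    n = len(true_means)
    estimates = [0.0] * n   # running average reward per action
    counts = [0] * n        # how often each action was tried
    for _ in range(steps):
        if rng.random() < epsilon:
            action = rng.randrange(n)  # explore: try a random action
        else:
            # exploit: pick the action with the highest current estimate
            action = max(range(n), key=lambda a: estimates[a])
        reward = rng.gauss(true_means[action], 1.0)  # noisy reward signal
        counts[action] += 1
        # incremental update of the sample mean for the chosen action
        estimates[action] += (reward - estimates[action]) / counts[action]
    return estimates, counts

estimates, counts = run_bandit([0.2, 0.5, 1.5])
print("estimated values:", [round(e, 2) for e in estimates])
print("pull counts:", counts)
```

With enough steps, the pull counts typically concentrate on the arm with the highest mean reward, which is the "discover which actions yield the most reward by trying them" behaviour the quotation describes.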

131f383b434fdf48079bff1e44e2d9a5-AuthorFeedback.pdf

Neural Information Processing Systems, Feb-11-2026, 12:47:00 GMT

See Table 1 for the average running time per problem instance. Note that the implementations of Z3 and OR-tools are in C++, while NeuRewriter and the RL baselines are in Python. Still, we can observe that our approach achieves a better balance between time efficiency and result quality. For expression simplification and job scheduling, NeuRewriter is even more time-efficient than Z3 and OR-tools. The region-picker π_ω is parameterized by a Q-function and is similar in spirit to soft-Q learning [2].
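
As a rough illustration of a policy parameterized by a Q-function, the sketch below turns a list of Q-values into selection probabilities with a softmax (Boltzmann) rule, which is the usual form in soft-Q-style methods. The excerpt does not give the exact formula, so the softmax form, the `temperature` parameter, and the names `boltzmann_policy` and `sample_region` are assumptions made for this example, not the authors' implementation.

```python
import math
import random

def boltzmann_policy(q_values, temperature=1.0):
    """Softmax over Q-values: higher Q means higher selection probability.

    Assumed form for illustration; the excerpt only says the policy is
    parameterized by a Q-function.
    """
    m = max(q_values)  # subtract the max for numerical stability
    weights = [math.exp((q - m) / temperature) for q in q_values]
    total = sum(weights)
    return [w / total for w in weights]

def sample_region(q_values, rng, temperature=1.0):
    """Sample the index of one candidate region according to the policy."""
    probs = boltzmann_policy(q_values, temperature)
    return rng.choices(range(len(q_values)), weights=probs, k=1)[0]

# Hypothetical Q-values for three candidate regions.
q = [1.0, 2.0, 3.0]
print(boltzmann_policy(q))
print(sample_region(q, random.Random(0)))
```

Lowering `temperature` makes the choice closer to greedy, while raising it makes the choice closer to uniform.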

artificial intelligence, machine learning, reinforcement learning, (6 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.60)

b5c8c1c117618267944b2617add0a766-Paper-Conference.pdf

Neural Information Processing Systems, Feb-11-2026, 12:17:50 GMT

agent, influence distribution, simulator, (16 more...)

Country:

Europe > Netherlands > South Holland > Delft (0.04)
Asia > Taiwan > Taiwan Province > Taipei (0.04)

Genre: Research Report (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.90)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents > Agent Societies (0.46)

Discovery of Useful Questions as Auxiliary Tasks

Vivek Veeriah, Matteo Hessel, Zhongwen Xu, Janarthanan Rajendran, Richard L. Lewis, Junhyuk Oh, Hado P. van Hasselt, David Silver, Satinder Singh

Neural Information Processing Systems, Feb-11-2026, 12:15:33 GMT

Neural Information Processing Systems http://nips.cc/

agent, auxiliary task, representation, (15 more...)

Country:

North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.14)
North America > Canada > Alberta (0.14)
North America > United States > Michigan > Washtenaw County > Ann Arbor (0.04)
(3 more...)

Genre: Research Report (0.68)

Industry: Leisure & Entertainment > Games (0.94)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.69)

Model-Based Opponent Modeling

Xiaopeng Yu, Jiechuan Jiang, Wanpeng Zhang, Haobin Jiang, Zongqing Lu (School of Computer Science, Peking University)

Neural Information Processing Systems, Feb-11-2026, 12:05:23 GMT

When one agent interacts with a multi-agent environment, it is challenging to deal with various opponents unseen before.

artificial intelligence, machine learning, reinforcement learning, (16 more...)

Country:

Asia > Middle East > Jordan (0.04)
Asia > China (0.04)

Genre: Research Report (0.93)

Industry: Leisure & Entertainment > Games > Computer Games (0.51)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.97)

Oracle Inequalities for Model Selection in Offline Reinforcement Learning

Neural Information Processing Systems, Feb-11-2026, 11:56:56 GMT

Define = log (M2H / ).

artificial intelligence, machine learning, reinforcement learning, (13 more...)

Country: North America > United States > California > Santa Clara County > Palo Alto (0.05)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.50)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.41)

Adaptive Auxiliary Task Weighting for Reinforcement Learning

Xingyu Lin, Harjatin Baweja, George Kantor, David Held

Neural Information Processing Systems, Feb-11-2026, 11:36:25 GMT

Deep reinforcement learning has enjoyed recent success in domains like games [1, 2], robotic manipulation, and locomotion tasks [3, 4].

auxiliary task, machine learning, reinforcement learning, (15 more...)

Country:

North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)
North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.04)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.74)

Provable Model-based Nonlinear Bandit and Reinforcement Learning: Shelve Optimism, Embrace Virtual Curvature

Neural Information Processing Systems, Feb-11-2026, 11:36:04 GMT

A key algorithmic insight is that optimism may lead to over-exploration even for the two-layer neural net model class.

artificial intelligence, machine learning, reinforcement learning, (16 more...)

Country:

Asia > Middle East > Jordan (0.05)
North America > United States > California > Santa Clara County > Palo Alto (0.04)
North America > United States > Washington > King County > Seattle (0.04)
North America > United States > Massachusetts > Suffolk County > Chelsea (0.04)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.67)

Robust Deep Reinforcement Learning through Adversarial Loss

Neural Information Processing Systems, Feb-11-2026, 11:35:22 GMT

Our RADIAL-RL agents consistently outperform prior methods when tested against attacks of varying strength and are more computationally efficient to train. In addition, we propose a new evaluation method called Greedy Worst-Case Reward (GWC) to measure attack-agnostic robustness of deep RL agents. We show that GWC can be evaluated efficiently and is a good estimate of the reward under the worst possible sequence of adversarial attacks.

artificial intelligence, machine learning, reinforcement learning, (19 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
