AITopics | Reinforcement Learning

Collaborating Authors

Reinforcement Learning

"Reinforcement learning is learning what to do – how to map situations to actions – so as to maximize a numerical reward signal. The learner is not told which actions to take, as in most forms of machine learning, but instead must discover which actions yield the most reward by trying them."
– Sutton, Richard S. and Andrew G. Barto. Reinforcement Learning: An Introduction. (1.1). MIT Press, Cambridge, MA, 1998.

News Overviews Instructional Materials AI-Alerts Classics

Adversarial Attacks on Stochastic Bandits

Kwang-Sung Jun, Lihong Li, Yuzhe Ma, Jerry Zhu

Neural Information Processing SystemsFeb-13-2026, 13:36:13 GMT

Neural Information Processing Systems http://nips.cc/

algorithm, alice, cumulative attack cost, (13 more...)

Neural Information Processing Systems

Country:

North America > Canada > Quebec > Montreal (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre: Research Report > New Finding (0.46)

Industry: Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Security & Privacy (1.00)
Information Technology > Data Science > Data Mining > Big Data (0.51)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.46)

Add feedback

ZSC-Eval: An Evaluation Toolkit and Benchmark for Multi-agent Zero-shot Coordination Xihuai Wang

Neural Information Processing SystemsFeb-13-2026, 12:55:22 GMT

The significant difference between the deployment-time partners' distribution and the training partners' distribution determined by the

artificial intelligence, machine learning, reinforcement learning, (17 more...)

Neural Information Processing Systems

Country:

Asia > China > Shanghai > Shanghai (0.04)
Oceania > New Zealand > North Island > Auckland Region > Auckland (0.04)
Europe > Portugal (0.04)

Genre:

Research Report > New Finding (0.68)
Research Report > Experimental Study (0.46)

Industry: Leisure & Entertainment > Games (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents > Agent Societies (0.68)

Add feedback

Finite-time Analysis of Approximate Policy Iteration for the Linear Quadratic Regulator

Karl Krauth, Stephen Tu, Benjamin Recht

Neural Information Processing SystemsFeb-13-2026, 12:37:25 GMT

Neural Information Processing Systems http://nips.cc/

abbasi-y adkori, algorithm, iteration, (13 more...)

Neural Information Processing Systems

Country:

North America > United States > California > Alameda County > Berkeley (0.04)
North America > United States > Massachusetts > Hampshire County > Amherst (0.04)
North America > Canada (0.04)
Asia > Middle East > Jordan (0.04)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

Add feedback

Distributed Multitask Reinforcement Learning with Quadratic Convergence

Rasul Tutunov, Dongho Kim, Haitham Bou Ammar

Neural Information Processing SystemsFeb-13-2026, 11:35:59 GMT

Multitask reinforcement learning (MTRL) suffers from scalability issues when the number of tasks or trajectories grows large.

machine learning, reinforcement learning, trajectory, (17 more...)

Neural Information Processing Systems

Country:

North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
North America > Canada > Quebec > Montreal (0.04)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.50)

Add feedback

6a7c2a320f5f36bb98f8eb878c6f1180-Paper-Conference.pdf

Neural Information Processing SystemsFeb-13-2026, 11:14:51 GMT

artificial intelligence, machine learning, reinforcement learning, (15 more...)

Neural Information Processing Systems

Country: Asia > China > Shanghai > Shanghai (0.04)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.97)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.94)
Information Technology > Artificial Intelligence > Robots (0.68)

Add feedback

Almost Horizon-Free Structure-Aware Best Policy Identification with a Generative Model

Andrea Zanette, Mykel J. Kochenderfer, Emma Brunskill

Neural Information Processing SystemsFeb-13-2026, 10:57:54 GMT

Inparticular,well knownbounds foronline learningscale as a function of the gap between the expected reward of a particular action and the optimalaction [ABF02] and also on the variance ofthe rewards [AMS09].

cisa, machine learning, reinforcement learning, (18 more...)

Neural Information Processing Systems

Country: