AITopics | Reinforcement Learning

Reinforcement Learning

"Reinforcement learning is learning what to do – how to map situations to actions – so as to maximize a numerical reward signal. The learner is not told which actions to take, as in most forms of machine learning, but instead must discover which actions yield the most reward by trying them."
– Sutton, Richard S. and Andrew G. Barto. Reinforcement Learning: An Introduction. (1.1). MIT Press, Cambridge, MA, 1998.
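
The trial-and-error idea in the quotation can be illustrated with a minimal sketch. It is not taken from Sutton and Barto's text: it assumes a single-situation problem (a multi-armed bandit) and an epsilon-greedy learner, and the name `run_bandit`, the reward probabilities, the step count, and the exploration rate are all hypothetical choices made for illustration.

```python
import random


def run_bandit(true_means, steps=5000, epsilon=0.1, seed=0):
    """Epsilon-greedy learner on a Bernoulli multi-armed bandit.

    All parameters are illustrative assumptions, not values from the source.
    Returns the learner's estimated value per action and the pull counts.
    """
    rng = random.Random(seed)
    n_actions = len(true_means)
    estimates = [0.0] * n_actions  # running average reward per action
    counts = [0] * n_actions       # how often each action has been tried

    for _ in range(steps):
        if rng.random() < epsilon:
            # Explore: the learner is not told what to do, so it tries an action.
            action = rng.randrange(n_actions)
        else:
            # Exploit: pick the action that currently looks most rewarding.
            action = max(range(n_actions), key=lambda a: estimates[a])

        # The environment answers with a numerical reward signal (1 or 0).
        reward = 1.0 if rng.random() < true_means[action] else 0.0

        # Incremental sample-average update of the tried action's estimate.
        counts[action] += 1
        estimates[action] += (reward - estimates[action]) / counts[action]

    return estimates, counts


# Three actions whose hidden reward probabilities the learner must discover.
estimates, counts = run_bandit([0.1, 0.5, 0.9])
best_action = max(range(len(estimates)), key=lambda a: estimates[a])
```

Under these assumptions the learner is never told which action is best; it only sees rewards for the actions it tries, and `best_action` is simply the action with the highest estimated reward after training. A full reinforcement learning problem would also condition the choice on the current situation (state), which this sketch leaves out.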

Offline Behavior Distillation

Neural Information Processing Systems | Oct-10-2025, 17:19:25 GMT

Inspired by dataset distillation (DD) [Wang et al., 2018, Zhao et al., ... (Corollary 1). Extensive experiments on nine datasets of the D4RL benchmark [Fu et al., 2020], spanning multiple environments and data qualities, show that our Av-PBC substantially improves OBD performance. Moreover, Av-PBC converges quickly, requiring only a quarter of the distillation steps that DBC and PBC need.

algorithm, dataset, distillation, (14 more...)

Genre:

Research Report > Experimental Study (0.93)
Research Report > New Finding (0.67)

Industry: Information Technology > Security & Privacy (0.93)

Technology:

Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
(4 more...)

BECAUSE: Bilinear Causal Representation for Generalizable Offline Model-based Reinforcement Learning

Neural Information Processing Systems | Oct-10-2025, 17:09:19 GMT

See more details on our project page: https://sites.google.com/view/be-cause.

arxiv preprint arxiv, equation, representation, (10 more...)

Country:

North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)
North America > United States > California (0.04)
Asia > Middle East > Jordan (0.04)

Genre: Research Report > Experimental Study (1.00)

Industry:

Health & Medicine (0.67)
Information Technology (0.46)
Transportation (0.46)
(2 more...)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.93)
(3 more...)

A Study of Plasticity Loss in On-Policy Deep Reinforcement Learning

Neural Information Processing Systems | Oct-10-2025, 17:00:44 GMT

We demonstrate that plasticity loss is pervasive under domain shift in this regime, and that a number of methods developed to resolve it in other settings fail, sometimes even performing worse than applying no intervention at all. In contrast, we find that a class of "regenerative" methods are able to consistently mitigate plasticity loss in a variety of contexts, including in gridworld tasks and

baseline, correspond, plasticity loss, (14 more...)

Country: Asia > Middle East > Jordan (0.04)

Genre:

Research Report > Experimental Study (1.00)
Research Report > New Finding (0.68)

Industry:

Health & Medicine (0.67)
Education (0.67)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Exclusively Penalized Q-learning for Offline Reinforcement Learning

Neural Information Processing Systems | Oct-10-2025, 16:54:18 GMT

Reinforcement learning (RL) is gaining significant attention for solving complex Markov decision process (MDP) tasks.

dataset, learning, penalty, (14 more...)

Country:

Europe > Italy > Lazio > Rome (0.04)
Asia > South Korea > Ulsan > Ulsan (0.04)

Genre: Research Report > Experimental Study (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

cd3b5d2ed967e906af24b33d6a356cac-Paper-Conference.pdf

Neural Information Processing Systems | Oct-10-2025, 16:52:42 GMT

algorithm, architecture, bro, (14 more...)

Country:

Europe > Poland > Masovia Province > Warsaw (0.04)
Asia > Middle East > Jordan (0.04)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.94)

cd1da8043ba5c1c144ab4e10a8de6e53-Paper-Conference.pdf

Neural Information Processing Systems | Oct-10-2025, 16:52:28 GMT

estimation, estimator, policy value, (14 more...)

Country:

North America > United States (0.46)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Asia > India (0.04)

Genre: Research Report > Experimental Study (0.67)

Industry:

Health & Medicine (0.93)
Government > Regional Government (0.45)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

cb03b5108f1c3a38c990ef0b45bc8b31-Paper-Conference.pdf

Neural Information Processing Systems | Oct-10-2025, 16:40:22 GMT

adversary, agent, sleepernet, (15 more...)

Country:

North America > United States (0.14)
Europe > Portugal > Braga > Braga (0.04)

Genre: Research Report > Experimental Study (0.93)

Industry:

Information Technology > Security & Privacy (1.00)
Government (0.68)
Banking & Finance > Trading (0.67)

Technology:

Information Technology > Security & Privacy (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
(3 more...)

SCaR: Refining Skill Chaining for Long-Horizon Robotic Manipulation via Dual Regularization

Neural Information Processing Systems | Oct-10-2025, 16:39:07 GMT

In this paper, we investigate how to achieve stable and smooth skill chaining for long-horizon robotic manipulation tasks.

experiment, regularization, sub-task skill, (14 more...)

Country:

Asia > Japan > Honshū > Chūbu > Ishikawa Prefecture > Kanazawa (0.04)
Asia > China > Jiangsu Province > Nanjing (0.04)

Genre: Research Report > Experimental Study (0.93)

Industry:

Information Technology (0.67)
Leisure & Entertainment (0.45)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.93)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (0.92)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.69)

Fine-Tuning Large Vision-Language Models as Decision-Making Agents via Reinforcement Learning

Neural Information Processing Systems | Oct-10-2025, 16:29:55 GMT

Large vision-language models (VLMs) fine-tuned on specialized visual instruction-following data have exhibited impressive language reasoning capabilities across various scenarios.

arxiv preprint arxiv, cabinet 1, cot reasoning, (14 more...)

Country:

North America > United States > Washington > King County > Seattle (0.04)
North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)
Europe > Sweden > Stockholm > Stockholm (0.04)
(3 more...)

Genre:

Research Report > Experimental Study (1.00)
Research Report > New Finding (0.67)

Industry: Leisure & Entertainment > Games (1.00)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
(2 more...)

Skill-aware Mutual Information Optimisation for Generalisation in Reinforcement Learning

Neural Information Processing Systems | Oct-10-2025, 16:23:37 GMT

Reinforcement Learning (RL) agents often learn policies that do not generalise across tasks in which the environmental features and optimal skills are different [des Combes et al., 2018, Garcin et al., 2024].

infonce, sami, sance, (15 more...)

Country: