AITopics | Reinforcement Learning

Collaborating Authors

Reinforcement Learning

"Reinforcement learning is learning what to do – how to map situations to actions – so as to maximize a numerical reward signal. The learner is not told which actions to take, as in most forms of machine learning, but instead must discover which actions yield the most reward by trying them."
– Sutton, Richard S. and Andrew G. Barto. Reinforcement Learning: An Introduction. (1.1). MIT Press, Cambridge, MA, 1998.

News Overviews Instructional Materials AI-Alerts Classics

349a45f211fb1b3850da1ccd829e869e-Paper-Conference.pdf

Neural Information Processing SystemsOct-9-2025, 22:55:53 GMT

adversary, ardt, trajectory, (12 more...)

Neural Information Processing Systems

Genre: Research Report > Experimental Study (0.93)

Industry:

Leisure & Entertainment > Games (0.68)
Information Technology (0.67)

Technology:

Information Technology > Game Theory (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (0.93)
(3 more...)

Add feedback

333a7697dbb67f09249337f81c27d749-Paper-Conference.pdf

Neural Information Processing SystemsOct-9-2025, 22:48:35 GMT

arxiv preprint arxiv, behavior policy, dataset, (11 more...)

Neural Information Processing Systems

Country: Asia > China > Zhejiang Province > Hangzhou (0.04)

Genre:

Research Report > Experimental Study (0.93)
Research Report > New Finding (0.67)

Industry: Information Technology > Security & Privacy (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.93)
Information Technology > Artificial Intelligence > Robots (0.92)

Add feedback

RL-GPT: Integrating Reinforcement Learning and Code-as-policy

Neural Information Processing SystemsOct-9-2025, 22:43:44 GMT

Additionally, it achieves good performance on designated MineDojo tasks.

agent, arxiv preprint arxiv, language model, (14 more...)

Neural Information Processing Systems

Country:

Asia > China > Hong Kong (0.04)
Asia > China > Beijing > Beijing (0.04)

Genre: Research Report > Experimental Study (0.93)

Industry:

Information Technology > Security & Privacy (1.00)
Leisure & Entertainment > Games > Computer Games (0.47)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

Add feedback

31888563b194f9bb33ce1aebc7e1551c-Paper-Conference.pdf

Neural Information Processing SystemsOct-9-2025, 22:42:23 GMT

agent, algorithm, reinforcement learning, (13 more...)

Neural Information Processing Systems

Country:

Asia > China > Beijing > Beijing (0.04)
Asia > China > Hong Kong (0.04)
South America > Brazil > São Paulo (0.04)
(4 more...)

Genre:

Overview (1.00)
Research Report > Experimental Study (0.93)
Research Report > New Finding (0.93)

Industry:

Information Technology (0.92)
Leisure & Entertainment > Games > Computer Games (0.46)

Technology:

Information Technology > Communications > Networks (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
(4 more...)

Add feedback

317ccced29ed464df181c781cb436180-Paper-Conference.pdf

Neural Information Processing SystemsOct-9-2025, 22:42:05 GMT

experiment, iteration, state space, (16 more...)

Neural Information Processing Systems

Country: Europe > Austria > Styria > Graz (0.04)

Genre: Research Report > Experimental Study (0.93)

Industry: Information Technology (0.93)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.67)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.48)

Add feedback

30df5ecdd245bd2f4b0c5ba48de9674a-Paper-Conference.pdf

Neural Information Processing SystemsOct-9-2025, 22:34:28 GMT

irl, minimaxregret, reward function, (14 more...)

Neural Information Processing Systems

Country:

North America > United States > California > San Francisco County > San Francisco (0.14)
North America > United States > California > Los Angeles County > Los Angeles (0.14)
North America > United States > Massachusetts > Suffolk County > Boston (0.04)
(3 more...)

Genre: Research Report > Experimental Study (0.92)

Industry: Education (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.93)

Add feedback

A theoretical case-study of Scalable Oversight in Hierarchical Reinforcement Learning

Neural Information Processing SystemsOct-9-2025, 22:32:49 GMT

To this end, we study the challenges of scalable oversight in the context of goal-conditioned hierarchical reinforcement learning.

algorithm, arxiv preprint arxiv, low-level feedback, (14 more...)

Neural Information Processing Systems

Country:

North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)
Asia > Middle East > Jordan (0.04)

Genre: Research Report > Experimental Study (0.93)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

Add feedback

Inverse Factorized Soft Q-Learning for Cooperative Multi-agent Imitation Learning

Neural Information Processing SystemsOct-9-2025, 22:26:17 GMT

In a single-agent setting, IL was shown to be done efficiently via an inverse soft-Q learning process.

agent, algorithm, learning, (16 more...)

Neural Information Processing Systems

Country:

North America > United States > Oregon > Lane County > Eugene (0.14)
Asia > Singapore (0.04)
North America > United States > Ohio > Lucas County > Oregon (0.04)
(2 more...)

Genre: Research Report > Experimental Study (0.93)

Industry:

Information Technology (0.67)
Leisure & Entertainment > Games (0.46)
Education (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents > Agent Societies (0.50)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.46)

Add feedback

2f0bb736ccc8551ef5bcc9165c2a4d9e-Paper-Conference.pdf

Neural Information Processing SystemsOct-9-2025, 22:24:10 GMT

algorithm, confidence region, pmevi-dt, (14 more...)

Neural Information Processing Systems

Country:

Europe > France > Auvergne-Rhône-Alpes > Isère > Grenoble (0.04)
Asia > Japan > Honshū > Tōhoku (0.04)
North America > United States > Virginia > Arlington County > Arlington (0.04)

Genre: Research Report > Experimental Study (0.92)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.46)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.46)

Add feedback

Kernel-Based Function Approximation for Average Reward Reinforcement Learning: An Optimist No-Regret Algorithm

Neural Information Processing SystemsOct-9-2025, 22:04:25 GMT

Reinforcement learning utilizing kernel ridge regression to predict the expected value function represents a powerful method with great representational capacity. This setting is a highly versatile framework amenable to analytical results. We consider kernel-based function approximation for RL in the infinite horizon average reward setting, also referred to as the undiscounted setting.

algorithm, assumption, confidence interval, (16 more...)

Neural Information Processing Systems

Country:

Asia > Middle East > Jordan (0.04)
North America > United States > Illinois > Cook County > Chicago (0.04)
Europe > Netherlands > South Holland > Delft (0.04)

Genre: Research Report > Experimental Study (0.93)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Fuzzy Logic (0.61)

Add feedback