AITopics | Reinforcement Learning

Reinforcement Learning

"Reinforcement learning is learning what to do – how to map situations to actions – so as to maximize a numerical reward signal. The learner is not told which actions to take, as in most forms of machine learning, but instead must discover which actions yield the most reward by trying them."
– Sutton, Richard S. and Andrew G. Barto. Reinforcement Learning: An Introduction. (1.1). MIT Press, Cambridge, MA, 1998.
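
The trial-and-error idea in the quotation above can be illustrated with a minimal sketch. It assumes a simple multi-armed bandit with an epsilon-greedy learner, which is only one of many possible settings and is not taken from the quoted book passage. The function name `run_bandit`, its parameters, and the example reward means are hypothetical and chosen purely for illustration.

```python
import random


def run_bandit(true_means, steps=2000, epsilon=0.1, noise=1.0, seed=0):
    """Epsilon-greedy trial and error on a multi-armed bandit (illustrative).

    true_means: hidden average reward of each action (unknown to the learner).
    Returns the learner's reward estimates and how often each action was tried.
    """
    rng = random.Random(seed)
    n_actions = len(true_means)
    estimates = [0.0] * n_actions  # running average reward per action
    counts = [0] * n_actions       # number of times each action was tried

    for _ in range(steps):
        if rng.random() < epsilon:
            # Explore: try a random action to discover its reward.
            action = rng.randrange(n_actions)
        else:
            # Exploit: pick the action that currently looks best.
            action = max(range(n_actions), key=lambda a: estimates[a])

        # The learner only observes a numerical reward, never the "right" action.
        reward = rng.gauss(true_means[action], noise)

        # Incremental update of the sample-average estimate.
        counts[action] += 1
        estimates[action] += (reward - estimates[action]) / counts[action]

    return estimates, counts


if __name__ == "__main__":
    estimates, counts = run_bandit([0.2, 0.5, 1.0])
    print("estimated rewards:", [round(e, 2) for e in estimates])
    print("times each action was tried:", counts)
```

The learner is never told which action is best. It forms estimates only from the rewards it receives, and the `epsilon` parameter controls how often it tries an action other than the one it currently prefers.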

A Stochastic Composite Gradient Method with Incremental Variance Reduction

Junyu Zhang, Lin Xiao

Neural Information Processing Systems | Feb-13-2026, 10:05:56 GMT

Neural Information Processing Systems http://nips.cc/

complexity, optimization, sample complexity, (14 more...)

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.28)
Oceania > Australia > New South Wales > Sydney (0.04)
North America > United States > New York > New York County > New York City (0.04)
(5 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.34)

Policy-Conditioned Uncertainty Sets for Robust Markov Decision Processes

Neural Information Processing Systems | Feb-13-2026, 10:05:30 GMT

What policy should be employed in a Markov decision process with uncertain parameters?

artificial intelligence, machine learning, reinforcement learning, (17 more...)

Country:

North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)
North America > Canada > Quebec > Montreal (0.04)
Europe > United Kingdom > Scotland > City of Edinburgh > Edinburgh (0.04)
(2 more...)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models (0.49)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.47)

Trust Region-Guided Proximal Policy Optimization

Yuhui Wang, Hao He, Xiaoyang Tan, Yaozhong Gan

Neural Information Processing Systems | Feb-13-2026, 10:04:27 GMT

Deep model-free reinforcement learning has achieved great successes in recent years, notably in video games [11], board games [19], robotics [10], and challenging control tasks [17,5].

artificial intelligence, machine learning, reinforcement learning, (18 more...)

Country:

North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.04)
Asia > Middle East > Jordan (0.04)
Asia > China (0.04)

Industry: Leisure & Entertainment > Games (0.68)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.49)

a44ba9086b2b83ccf2baf7c678723449-Paper.pdf

Neural Information Processing Systems | Feb-13-2026, 08:58:47 GMT

bellman operator, operator, sequence, (15 more...)

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.28)
North America > United States (0.14)
North America > Canada (0.04)

Genre: Research Report > New Finding (0.70)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

Better Exploration with Optimistic Actor Critic

Kamil Ciosek, Quan Vuong, Robert Loftin, Katja Hofmann

Neural Information Processing Systems | Feb-13-2026, 08:39:45 GMT

Actor-critic methods, a type of model-free Reinforcement Learning, have been successfully applied to challenging tasks in continuous control, often achieving state-of-the-art performance.

artificial intelligence, machine learning, reinforcement learning, (15 more...)

Country:

North America > United States > Colorado > Denver County > Denver (0.14)
North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.14)
North America > Canada > Quebec > Montreal (0.05)
(11 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.69)

Iterative Value-Aware Model Learning

Amir-massoud Farahmand

Neural Information Processing Systems | Feb-13-2026, 08:39:22 GMT

Neural Information Processing Systems http://nips.cc/

iteration, itervaml, learning, (12 more...)

Country:

North America > Canada > Alberta (0.14)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
North America > United States (0.04)
(2 more...)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

Grounded Answers for Multi-agent Decision-making Problem through Generative World Model

Neural Information Processing Systems | Feb-13-2026, 08:39:07 GMT

The empirical results demonstrate that this framework can improve the answers for multi-agent decision-making problems by showing superior performance on the training and unseen tasks of the StarCraft Multi-Agent Challenge benchmark.

machine learning, natural language, reinforcement learning, (17 more...)

Country:

North America > United States > Illinois > Cook County > Chicago (0.04)
Asia > China > Shaanxi Province > Xi'an (0.04)

Genre: Research Report > New Finding (0.34)

Industry: Leisure & Entertainment > Games > Computer Games (0.34)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.46)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents > Agent Societies (0.34)

Evolved Policy Gradients

Rein Houthooft, Yuhua Chen, Phillip Isola, Bradly Stadie, Filip Wolski, Jonathan Ho, Pieter Abbeel

Neural Information Processing Systems | Feb-13-2026, 08:18:27 GMT

The idea is to evolve a differentiable loss function, such that an agent, which optimizes its policy to minimize this loss, will achieve high rewards.

artificial intelligence, machine learning, reinforcement learning, (14 more...)

Country: North America > Canada > Quebec > Montreal (0.04)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.71)

Learning Latent Process from High-Dimensional Event Sequences via Efficient Sampling

Qitian Wu, Zixuan Zhang, Xiaofeng Gao, Junchi Yan, Guihai Chen

Neural Information Processing Systems | Feb-13-2026, 08:00:41 GMT

There are plenty of previous studies targeting the problem from different aspects. For temporal point processes, a great number of works [3, 13, 15, 16, 28] attempt to model the intensity function from a statistical view, and recent studies harness deep recurrent models [24], generative adversarial networks [23], and reinforcement learning [19, 18] to learn the temporal process. These studies mainly focus on one-dimensional event sequences where each event possesses the same marker.

artificial intelligence, machine learning, reinforcement learning, (19 more...)

Country: North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.04)

Genre: Research Report (0.34)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.49)
