AITopics | Reinforcement Learning

Reinforcement Learning

"Reinforcement learning is learning what to do – how to map situations to actions – so as to maximize a numerical reward signal. The learner is not told which actions to take, as in most forms of machine learning, but instead must discover which actions yield the most reward by trying them."
– Sutton, Richard S. and Andrew G. Barto. Reinforcement Learning: An Introduction. (1.1). MIT Press, Cambridge, MA, 1998.

81e793dc8317a3dbc3534ed3f242c418-Paper.pdf

Neural Information Processing Systems, Oct-3-2025, 09:56:25 GMT

disco, exploration, sample complexity, (12 more...)

Country: North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.04)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.69)

Better Exploration with Optimistic Actor Critic

Kamil Ciosek, Quan Vuong, Robert Loftin, Katja Hofmann

Neural Information Processing Systems, Oct-3-2025, 09:46:20 GMT

Neural Information Processing Systems http://nips.cc/

algorithm, exploration policy, learning, (14 more...)

Country:

North America > United States > Colorado > Denver County > Denver (0.14)
North America > Canada > Quebec > Montreal (0.05)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
(14 more...)

Genre: Research Report (0.68)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.73)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.46)

7ec2442aa04c157590b2fa1a7d093a33-AuthorFeedback.pdf

Neural Information Processing Systems, Oct-3-2025, 08:47:56 GMT

classifier, elicitation, reviewer, (11 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.30)

Regret Minimization for Reinforcement Learning with Vectorial Feedback and Complex Objectives

Wang Chi Cheung

Neural Information Processing Systems, Oct-3-2025, 08:38:07 GMT

Due to state transitions, it is challenging to balance the contribution from each dimension for achieving near-optimality.

agent, proceedings, tfw-ucrl2, (11 more...)

Country:

North America > United States > California > San Francisco County > San Francisco (0.14)
North America > United States > California > Los Angeles County > Long Beach (0.04)
Europe > Spain > Catalonia > Barcelona Province > Barcelona (0.04)
(5 more...)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.65)

Finite-Sample Analysis for SARSA with Linear Function Approximation

Shaofeng Zou, Tengyu Xu, Yingbin Liang

Neural Information Processing Systems, Oct-3-2025, 08:27:49 GMT

SARSA is an on-policy algorithm to learn a Markov decision process policy in reinforcement learning. We investigate the SARSA algorithm with linear function approximation under the non-i.i.d.
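
As a rough illustration of what on-policy SARSA with linear function approximation looks like, here is a minimal sketch. It is not the paper's setup or analysis: the five-state chain environment, the one-hot features, the epsilon-greedy policy with random tie-breaking, and every name and hyperparameter (`sarsa_linear`, `step`, `alpha`, `gamma`, `epsilon`) are assumptions chosen only to keep the example small.

```python
import random


def features(state, action, n_states, n_actions):
    """One-hot feature vector over (state, action) pairs (illustrative choice)."""
    phi = [0.0] * (n_states * n_actions)
    phi[state * n_actions + action] = 1.0
    return phi


def q_value(theta, phi):
    """Linear action value: Q(s, a) = theta . phi(s, a)."""
    return sum(t * f for t, f in zip(theta, phi))


def epsilon_greedy(theta, state, n_states, n_actions, epsilon, rng):
    """Random action with probability epsilon, otherwise greedy with random tie-breaking."""
    if rng.random() < epsilon:
        return rng.randrange(n_actions)
    values = [q_value(theta, features(state, a, n_states, n_actions))
              for a in range(n_actions)]
    best = max(values)
    return rng.choice([a for a, v in enumerate(values) if v == best])


def step(state, action, n_states):
    """Hypothetical toy chain: action 1 moves right, action 0 moves left.

    Reaching the last state ends the episode with reward 1.
    """
    if action == 1:
        next_state = min(state + 1, n_states - 1)
    else:
        next_state = max(state - 1, 0)
    done = next_state == n_states - 1
    return next_state, (1.0 if done else 0.0), done


def sarsa_linear(n_states=5, n_actions=2, episodes=500,
                 alpha=0.1, gamma=0.9, epsilon=0.1, seed=0):
    rng = random.Random(seed)
    theta = [0.0] * (n_states * n_actions)
    for _ in range(episodes):
        state = 0
        action = epsilon_greedy(theta, state, n_states, n_actions, epsilon, rng)
        for _ in range(100):  # cap on episode length
            next_state, reward, done = step(state, action, n_states)
            phi = features(state, action, n_states, n_actions)
            if done:
                target = reward
            else:
                # On-policy: bootstrap from the action the behavior policy
                # actually selects in the next state.
                next_action = epsilon_greedy(
                    theta, next_state, n_states, n_actions, epsilon, rng)
                next_phi = features(next_state, next_action, n_states, n_actions)
                target = reward + gamma * q_value(theta, next_phi)
            td_error = target - q_value(theta, phi)
            # Semi-gradient update: theta <- theta + alpha * td_error * phi(s, a)
            theta = [t + alpha * td_error * f for t, f in zip(theta, phi)]
            if done:
                break
            state, action = next_state, next_action
    return theta


if __name__ == "__main__":
    weights = sarsa_linear()
    print("Q(3, left), Q(3, right):", weights[6], weights[7])
```

Because consecutive samples come from a single trajectory generated by a policy that changes with `theta`, the updates are not drawn independently, which is the kind of data dependence the abstract alludes to.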

algorithm, behavior policy, sarsa algorithm, (13 more...)

Country:

North America > United States > Ohio > Franklin County > Columbus (0.04)
North America > United States > New York > Erie County > Buffalo (0.04)
North America > Canada (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Fuzzy Logic (0.63)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.52)

Reconciling λ-Returns with Experience Replay

Brett Daley, Christopher Amato

Neural Information Processing Systems, Oct-3-2025, 08:18:50 GMT

A unique benefit to this approach is that each transition's TD error can be

arxiv preprint arxiv, learning, reinforcement learning, (13 more...)

Country:

North America > United States > Massachusetts > Suffolk County > Boston (0.04)
North America > Canada (0.04)

Genre: Research Report (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.47)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.46)

Shared Experience Actor-Critic for Multi-Agent Reinforcement Learning

Neural Information Processing Systems, Oct-3-2025, 07:37:52 GMT

Exploration in multi-agent reinforcement learning is a challenging problem, especially in environments with sparse rewards. We propose a general method for efficient exploration by sharing experience amongst agents.

agent, learning, reinforcement learning, (10 more...)

Country: North America > Canada (0.04)

Technology: