AITopics | Reinforcement Learning

Reinforcement Learning

"Reinforcement learning is learning what to do – how to map situations to actions – so as to maximize a numerical reward signal. The learner is not told which actions to take, as in most forms of machine learning, but instead must discover which actions yield the most reward by trying them."
– Sutton, Richard S. and Andrew G. Barto. Reinforcement Learning: An Introduction. (1.1). MIT Press, Cambridge, MA, 1998.

7695ea769f021803c508817dd374bb27-Paper.pdf

Neural Information Processing Systems, Aug-15-2025, 06:28:11 GMT

algorithm, linear function approximation, sample complexity, (12 more...)

Country:

North America > United States > California > Los Angeles County > Los Angeles (0.28)
Europe > United Kingdom > England > Greater London > London (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre: Research Report (0.67)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (0.70)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.31)

970627414218ccff3497cb7a784288f5-Paper.pdf

Neural Information Processing Systems, Aug-15-2025, 06:23:18 GMT

baseline, gcn, potential function, (14 more...)

Country:

North America > United States > California > San Francisco County > San Francisco (0.14)
North America > United States > New York > New York County > New York City (0.04)
North America > Canada > Quebec > Montreal (0.04)
(2 more...)

Genre: Research Report (0.95)

Industry: Leisure & Entertainment (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)

On Gap-dependent Bounds for Offline Reinforcement Learning

Neural Information Processing Systems, Aug-15-2025, 05:35:47 GMT

Instead, we have access to a dataset generated from some past suboptimal policies.

assumption, optimal policy, reinforcement learning, (11 more...)

Genre: Research Report (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

A Colosseum

Neural Information Processing Systems, Aug-15-2025, 05:20:08 GMT

The performances of the agents are in line with the results reported in Osband et al.

agent, psrl, q-learning, (16 more...)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.58)

Hardness in Markov Decision Processes: Theory and Practice

Neural Information Processing Systems, Aug-15-2025, 05:20:03 GMT

Finally, we benchmark five tabular agents in our newly proposed benchmark.

agent, complexity, hardness, (13 more...)

Country:

Europe > United Kingdom > England > Greater London > London (0.04)
Asia > Middle East > Jordan (0.04)

Industry: Leisure & Entertainment > Games > Computer Games (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.84)

Data-Efficient Pipeline for Offline Reinforcement Learning with Limited Data

Neural Information Processing Systems, Aug-15-2025, 05:18:56 GMT

While this is a common approach in supervised learning, to our knowledge, this has not been discussed in detail in the offline RL setting.

algorithm, dataset, selection, (15 more...)

Country:

North America > United States > California > Santa Clara County > Palo Alto (0.04)
Europe > France > Hauts-de-France > Nord > Lille (0.04)
Asia > Middle East > Jordan (0.04)
(2 more...)

Genre:

Research Report > Experimental Study (0.68)
Research Report > New Finding (0.67)

Industry:

Health & Medicine (1.00)
Education (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

A Appendix A.1 Additional Method Justification The key idea of Q

Neural Information Processing Systems, Aug-15-2025, 05:17:29 GMT

This problem has been studied in stochastic optimal control, particularly REPS [Peters et al., 2010]. In our experiments, we use soft actor-critic [Haarnoja et al., 2018] as our base RL algorithm. The policy and critic networks are MLPs with 2 fully-connected hidden layers of size 256. Following [Sharma et al., 2021b], we use a biased TD update, where ... For all experiments using prior data collected through RL, the agent was initialized at test time with the pretrained policy and critic. The details for this environment are in [Sharma et al., 2021b].
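The network shape mentioned in the excerpt (MLPs with 2 fully connected hidden layers of size 256) can be sketched in plain Python as follows. This is only an illustrative sketch, not the authors' implementation: the ReLU activation, the uniform initialization, the observation and action dimensions, and the choice of a mean/log-std policy head are all assumptions, and the helper names `init_layer`, `init_mlp`, and `forward` are hypothetical.

```python
import math
import random

HIDDEN = 256  # hidden layer width stated in the excerpt


def init_layer(n_in, n_out, rng):
    """Return (weights, biases); the uniform init is an assumption."""
    bound = 1.0 / math.sqrt(n_in)
    weights = [[rng.uniform(-bound, bound) for _ in range(n_in)]
               for _ in range(n_out)]
    biases = [0.0] * n_out
    return weights, biases


def init_mlp(n_in, n_out, seed=0):
    """Two fully connected hidden layers of size 256, then a linear output."""
    rng = random.Random(seed)
    sizes = [n_in, HIDDEN, HIDDEN, n_out]
    return [init_layer(a, b, rng) for a, b in zip(sizes[:-1], sizes[1:])]


def forward(params, x):
    """ReLU after each hidden layer (assumed); the output layer stays linear."""
    for i, (weights, biases) in enumerate(params):
        x = [sum(w * v for w, v in zip(row, x)) + b
             for row, b in zip(weights, biases)]
        if i < len(params) - 1:
            x = [max(0.0, v) for v in x]
    return x


# Hypothetical dimensions: 4-dimensional observation, 2-dimensional action.
OBS_DIM, ACT_DIM = 4, 2
critic = init_mlp(OBS_DIM + ACT_DIM, 1)   # maps (state, action) to a scalar Q-value
policy = init_mlp(OBS_DIM, 2 * ACT_DIM)   # assumed head: mean and log-std per action dim

obs, act = [0.1, -0.2, 0.3, 0.0], [0.5, -0.5]
q_value = forward(critic, obs + act)
policy_out = forward(policy, obs)
```

In practice such networks would be built with a deep learning library and trained by gradient descent; the sketch only shows the layer layout and a forward pass.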

additional method justification, agent, online trial, (15 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.34)

You Only Live Once: Single-Life Reinforcement Learning (Annie S. Chen)

Neural Information Processing Systems, Aug-15-2025, 05:17:26 GMT

For example, imagine a disaster relief robot tasked with retrieving an item from a fallen building, where it cannot get direct supervision from humans. It must retrieve this object within one test-time trial, and must do so while tackling unknown obstacles, though it may leverage knowledge it has of the building before the disaster.

agent, arxiv preprint arxiv, learning, (12 more...)

Country:

North America > United States > California > Santa Clara County > Palo Alto (0.04)
North America > United States > Illinois > Cook County > Chicago (0.04)

Genre: Research Report (0.46)

Industry: Education > Educational Setting (0.67)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

Contrastive Active Inference

Neural Information Processing Systems, Aug-15-2025, 04:56:12 GMT

In contrast, reinforcement learning requires human-designed rewards to accomplish any desired outcome.

agent, inference, trajectory, (14 more...)

Country:

North America > United States > Washington > King County > Seattle (0.04)
North America > United States > Louisiana > Orleans Parish > New Orleans (0.04)
Europe > Italy > Sardinia (0.04)
(2 more...)

Genre: Research Report > New Finding (0.68)

Industry: Leisure & Entertainment > Games (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
(3 more...)

Safe Reinforcement Learning by Imagining the Near Future

Neural Information Processing Systems, Aug-15-2025, 04:55:50 GMT

In this work, we focus on the setting where unsafe states can be avoided by planning ahead a short time into the future. In this setting, a model-based agent with a sufficiently accurate model can avoid unsafe states.
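The idea of avoiding unsafe states by planning a short time ahead can be illustrated with a toy example. The sketch below is not the paper's algorithm: it assumes a perfectly accurate, deterministic model of a hypothetical one-dimensional system with momentum, and the names `step`, `is_unsafe`, `can_stay_safe`, and `safe_actions` are invented for illustration.

```python
ACTIONS = (-1, 0, 1)   # hypothetical actions: decelerate, coast, accelerate
CLIFF = 10             # hypothetical boundary: positions >= CLIFF are unsafe


def step(state, action):
    """Assumed dynamics model: the action changes velocity, velocity moves position."""
    pos, vel = state
    vel = vel + action
    return (pos + vel, vel)


def is_unsafe(state):
    """A state is unsafe once the position reaches the cliff."""
    return state[0] >= CLIFF


def can_stay_safe(state, horizon):
    """True if some action sequence of length `horizon` avoids every unsafe state."""
    if is_unsafe(state):
        return False
    if horizon == 0:
        return True
    return any(can_stay_safe(step(state, a), horizon - 1) for a in ACTIONS)


def safe_actions(state, horizon):
    """Actions whose predicted successor can remain safe for the rest of the horizon."""
    return [a for a in ACTIONS if can_stay_safe(step(state, a), horizon - 1)]


state = (7, 2)  # position 7, moving toward the cliff with velocity 2
one_step = safe_actions(state, horizon=1)
three_step = safe_actions(state, horizon=3)
```

With a one-step lookahead, coasting still looks acceptable from this state, but a three-step lookahead shows that only decelerating keeps the agent away from the cliff, which is the kind of situation where short-horizon planning with an accurate model helps.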

algorithm, arxiv preprint arxiv, safety violation, (11 more...)

Country: North America > United States > California > Santa Clara County > Palo Alto (0.04)

Industry: Health & Medicine > Therapeutic Area (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Robots (0.94)