AITopics | Reinforcement Learning

Reinforcement Learning

"Reinforcement learning is learning what to do – how to map situations to actions – so as to maximize a numerical reward signal. The learner is not told which actions to take, as in most forms of machine learning, but instead must discover which actions yield the most reward by trying them."
– Sutton, Richard S. and Andrew G. Barto. Reinforcement Learning: An Introduction, Section 1.1. MIT Press, Cambridge, MA, 1998.

65beb73449888fabcf601b3a3ef4b3a7-Paper-Conference.pdf

Neural Information Processing Systems | Aug-15-2025, 10:09:09 GMT

international conference, learning, trajectory, (13 more...)

Country:

Asia > China > Guangdong Province > Guangzhou (0.04)
Asia > China > Beijing > Beijing (0.04)

Genre: Research Report > New Finding (0.46)

Industry: Energy (0.31)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.50)
Information Technology > Artificial Intelligence > Cognitive Science > Problem Solving (0.46)

6530274c68e81047e1f4a2ceb0b8c0ef-Supplemental-Conference.pdf

Neural Information Processing Systems | Aug-15-2025, 09:33:22 GMT

approximation, assumption, experiment, (17 more...)

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)
North America > United States > New York > Tompkins County > Ithaca (0.04)
Europe > Russia (0.04)
(3 more...)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.46)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.46)

Promoting Stochasticity for Expressive Policies via a Simple and Efficient Regularization Method

Neural Information Processing Systems | Aug-15-2025, 09:33:03 GMT

Based on our regularization, we propose an off-policy actor-critic algorithm.

arxiv preprint, policy architecture, regularization, (14 more...)

Country:

Asia > China (0.04)
North America > Canada (0.04)

Genre: Research Report (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.99)

MAP Propagation Algorithm: Faster Learning with a Team of Reinforcement Learning Agents

Neural Information Processing Systems | Aug-15-2025, 09:12:53 GMT

The high variance stems from the lack of structural credit assignment, i.e. a single scalar

agent, backprop, map propagation, (14 more...)

Country:

North America > United States > Massachusetts > Hampshire County > Amherst (0.14)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Asia > Middle East > Jordan (0.04)

Genre: Research Report (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)

PC-PG: Policy Cover Directed Exploration for Provable Policy Gradient Learning

Neural Information Processing Systems | Aug-15-2025, 08:54:17 GMT

In other words, the assumptions in these works imply that the state space is already well-explored.

algorithm, arxiv preprint arxiv, international conference, (9 more...)

Country:

Asia > Middle East > Jordan (0.04)
North America > United States > New Jersey (0.04)
North America > Canada (0.04)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.97)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)

On the Importance of Exploration for Generalization in Reinforcement Learning

Neural Information Processing Systems | Aug-15-2025, 08:49:23 GMT

We hypothesize that the agent's exploration strategy plays a key role in its ability to generalize to new environments.

artificial intelligence, machine learning, reinforcement learning, (15 more...)

Country: North America > United States (0.14)

Genre: Research Report > New Finding (1.00)

Industry:

Education (0.93)
Energy > Oil & Gas > Upstream (0.88)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.93)

Sequential Causal Imitation Learning with Unobserved Confounders

Neural Information Processing Systems | Aug-15-2025, 08:49:15 GMT

"Monkey see monkey do" is an age-old adage, referring to naïve imitation without a

adjustment, ancestor, imitator, (15 more...)

Country:

North America > United States > California > Los Angeles County > Los Angeles (0.28)
North America > United States > Illinois > Cook County > Chicago (0.04)

Genre: Instructional Material (0.46)

Industry: Transportation (0.46)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.69)

What Matters for Adversarial Imitation Learning?

Neural Information Processing Systems | Aug-15-2025, 08:47:50 GMT

Adversarial imitation learning has become a popular framework for imitation in continuous control.

configuration, percentile, performance score, (13 more...)

Country:

North America > United States > Illinois > Cook County > Chicago (0.04)
North America > United States > California > San Diego County > San Diego (0.04)

Genre: Research Report > New Finding (1.00)

Industry: Leisure & Entertainment > Games (0.46)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

7b647a7d88f4d6319bf0d600d168dbeb-Paper.pdf

Neural Information Processing Systems | Aug-15-2025, 08:47:46 GMT

arxiv preprint arxiv, demonstration, regularizer, (15 more...)

Country:

North America > United States > Illinois > Cook County > Chicago (0.04)
North America > United States > California > San Diego County > San Diego (0.04)

Genre: Research Report > New Finding (1.00)

Industry: Leisure & Entertainment > Games (0.46)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

A Omitted Proofs

Neural Information Processing Systems | Aug-15-2025, 08:16:04 GMT

The proofs of these propositions are extended from Berlekamp (1968). Note that both the oracle's preference feedback and [...] We adopt the environment setting created by Rothfuss et al. (2019): MuJoCo locomotion tasks, where the reward functions are varied to create a multi-task setting. The training and testing tasks are randomly generated from a fixed random seed. During meta-training, the meta-RL algorithm has full access to environmental interaction.

pearl, performance evaluation, trajectory pair, (15 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.48)
