AITopics | Reinforcement Learning

Collaborating Authors

Reinforcement Learning

"Reinforcement learning is learning what to do – how to map situations to actions – so as to maximize a numerical reward signal. The learner is not told which actions to take, as in most forms of machine learning, but instead must discover which actions yield the most reward by trying them."
– Sutton, Richard S. and Andrew G. Barto. Reinforcement Learning: An Introduction. (1.1). MIT Press, Cambridge, MA, 1998.

News Overviews Instructional Materials AI-Alerts Classics

Learning Successor Features the Simple Way Raymond Chua

Neural Information Processing SystemsOct-10-2025, 03:26:01 GMT

In Deep Reinforcement Learning (RL), it is a challenge to learn representations that do not exhibit catastrophic forgetting or interference in non-stationary environments.

agent, basis feature, successor feature, (14 more...)

Neural Information Processing Systems

Country:

North America > Canada > Quebec > Montreal (0.14)
Asia > Middle East > Jordan (0.04)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (0.92)

Industry:

Education > Educational Setting (0.45)
Health & Medicine > Therapeutic Area > Neurology (0.45)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.67)

Add feedback

Discovering Creative Behaviors through DUPLEX: Diverse Universal Features for Policy Exploration

Neural Information Processing SystemsOct-10-2025, 03:24:14 GMT

The ability to approach the same problem from different angles is a cornerstone of human intelligence that leads to robust solutions and effective adaptation to problem variations. In contrast, current RL methodologies tend to lead to policies that settle on a single solution to a given problem, making them brittle to problem variations. Replicating human flexibility in reinforcement learning agents is the challenge that we explore in this work.

diversity, duplex, experiment, (17 more...)

Neural Information Processing Systems

Country:

North America > United States > Massachusetts > Suffolk County > Boston (0.04)
Europe > Sweden > Stockholm > Stockholm (0.04)
Africa > Rwanda > Kigali > Kigali (0.04)
(4 more...)

Genre: Research Report > Experimental Study (1.00)

Industry: Information Technology (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

Add feedback

FlexPlanner: Flexible 3D Floorplanning via Deep Reinforcement Learning in Hybrid Action Space with Multi-Modality Representation

Neural Information Processing SystemsOct-10-2025, 03:17:00 GMT

In the Integrated Circuit (IC) design flow, floorplanning (FP) determines the position and shape of each block.

action space, aspect ratio, representation, (16 more...)

Neural Information Processing Systems

Country:

Asia > China > Shanghai > Shanghai (0.04)
Europe (0.04)
Asia > China > Tianjin Province > Tianjin (0.04)

Genre: Research Report > Experimental Study (0.93)

Industry:

Semiconductors & Electronics (0.86)
Information Technology (0.68)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.67)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.67)

Add feedback

Efficient Recurrent Off-Policy RL Requires a Context-Encoder-Specific Learning Rate Fan-Ming Luo 1,2 Zuolin Tu

Neural Information Processing SystemsOct-10-2025, 03:06:44 GMT

Recent progress has demonstrated that recurrent reinforcement learning (RL), which consists of a context encoder based on recurrent neural networks (RNNs) for unobservable state prediction and a multilayer perceptron (MLP) policy for decision making, can mitigate partial observability and serve as a robust baseline for POMDP tasks.

context encoder, international conference, resel, (15 more...)

Neural Information Processing Systems

Country: