AITopics | Reinforcement Learning

Collaborating Authors

Reinforcement Learning

"Reinforcement learning is learning what to do – how to map situations to actions – so as to maximize a numerical reward signal. The learner is not told which actions to take, as in most forms of machine learning, but instead must discover which actions yield the most reward by trying them."
– Sutton, Richard S. and Andrew G. Barto. Reinforcement Learning: An Introduction. (1.1). MIT Press, Cambridge, MA, 1998.

News Overviews Instructional Materials AI-Alerts Classics

Near-OptimalRegretforAdversarialMDPwith DelayedBanditFeedback

Neural Information Processing SystemsFeb-12-2026, 05:29:41 GMT

The standard assumption in reinforcement learning (RL) is that agents observe feedback for their actions immediately. However, in practice feedback is often observedindelay.

artificial intelligence, machine learning, reinforcement learning, (16 more...)

Neural Information Processing Systems

Country:

Asia > Middle East > Jordan (0.04)
North America > United States > Nevada (0.04)
North America > Canada > Quebec > Montreal (0.04)
(2 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.34)

Add feedback

5a1667459d0cdeb2fe6b2f0dffc5cb9d-Paper-Conference.pdf

Neural Information Processing SystemsFeb-12-2026, 05:21:02 GMT

data mining, machine learning, reinforcement learning, (14 more...)

Neural Information Processing Systems

Country:

Oceania > Australia > New South Wales > Sydney (0.14)
North America > United States > Maryland > Baltimore (0.04)
Europe > Austria (0.04)
(11 more...)

Genre: Research Report (0.67)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Data Science > Data Mining (0.67)

Add feedback

Real-Time Reinforcement Learning

Simon Ramstedt, Chris Pal

Neural Information Processing SystemsFeb-12-2026, 05:20:46 GMT

While it is well suited to describe turn-based decision problems such as board games, this framework is ill suited for real-time applications in which the environment's state continues to evolve while the agent selects an action (Travnik et al., 2018). Nevertheless, this framework hasbeen used forreal-time problems using what areessentially tricks, e.g.

artificial intelligence, machine learning, reinforcement learning, (17 more...)

Neural Information Processing Systems

Country:

North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.04)
Asia > Middle East > Jordan (0.04)

Industry: Leisure & Entertainment > Games (0.89)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.49)

Add feedback

DistributedInverseConstrainedReinforcement LearningforMulti-agentSystems

Neural Information Processing SystemsFeb-12-2026, 05:20:06 GMT

We formally guarantee that the distributed learners asymptotically achieve consensus which belongs to the set of stationary points of the bi-level optimization problem.

artificial intelligence, machine learning, reinforcement learning, (14 more...)

Neural Information Processing Systems

Country: North America > United States (0.14)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (0.95)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.72)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.46)

Add feedback

Uncertainty-based Offline Variational Bayesian Reinforcement Learning for Robustness under Diverse Data Corruptions Rui Y ang

Neural Information Processing SystemsFeb-12-2026, 05:08:54 GMT

These Q values are often objectives that offline algorithms aim to approximate.

artificial intelligence, machine learning, reinforcement learning, (16 more...)

Neural Information Processing Systems

Country:

Asia > China (0.04)
Europe > Portugal (0.04)

Genre:

Research Report > Experimental Study (1.00)
Research Report > New Finding (0.93)

Industry: Information Technology (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.68)

Add feedback

45fc4a0da7e7f6fbabaabe2d20a441d1-Paper-Conference.pdf

Neural Information Processing SystemsFeb-12-2026, 05:02:10 GMT

artificial intelligence, machine learning, reinforcement learning, (16 more...)

Neural Information Processing Systems

Country:

North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)
Asia > Middle East > Jordan (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre: Research Report > Experimental Study (0.92)

Industry: Information Technology (0.92)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (0.94)

Add feedback

Learning to Discover Skills through Guidance Hyunseung Kim,1 Byungkun Lee,1 Hojoon Lee

Neural Information Processing SystemsFeb-12-2026, 05:01:11 GMT

However, we have identified that the effectiveness of these rewards declines as the environmental complexity rises. Therefore, we present a novel USD algorithm, skill disco very with gui dance ( DISCO-DANCE), which (1) selects the guide skill that possesses the highest potential to reach unexplored states, (2) guides other skills to follow guide skill, then (3) the guided skills are dispersed to maximize their discriminability in unexplored states. Empirical evaluation demonstrates that DISCO-DANCE outperforms other USD baselines in challenging environments, including two navigation benchmarks and a continuous control benchmark.

artificial intelligence, machine learning, reinforcement learning, (16 more...)

Neural Information Processing Systems

Genre: Research Report > New Finding (0.67)

Industry: Leisure & Entertainment > Games (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Robots (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.69)
(2 more...)

Add feedback

Addressing Sample Complexity in Visual Tasks Using HER and Hallucinatory GANs

Himanshu Sahni, Toby Buckley, Pieter Abbeel, Ilya Kuzovkin

Neural Information Processing SystemsFeb-12-2026, 04:49:03 GMT

To this end, Andrychowicz et al.[1] introduced Hindsight Experience Replay (HER), which can rapidly train goal-conditioned policies by retroactively imagining failed trajectories as successful ones.

artificial intelligence, machine learning, reinforcement learning, (17 more...)

Neural Information Processing Systems

Country: