AITopics | Reinforcement Learning

Reinforcement Learning

"Reinforcement learning is learning what to do – how to map situations to actions – so as to maximize a numerical reward signal. The learner is not told which actions to take, as in most forms of machine learning, but instead must discover which actions yield the most reward by trying them."
– Sutton, Richard S. and Andrew G. Barto. Reinforcement Learning: An Introduction. (1.1). MIT Press, Cambridge, MA, 1998.

EDGE: Explaining Deep Reinforcement Learning Policies. S1: Additional Technical Details

Neural Information Processing Systems | Feb-9-2026, 02:30:17 GMT

Note that these are two-player games; we select the runner in You-Shall-Not-Pass and the kicker in Kick-And-Defend as our target agents. Section 4 mentioned that we download a well-trained policy for each game.

explanation, machine learning, reinforcement learning, (21 more...)

Neural Information Processing Systems

Country:

North America > United States > North Carolina > Wake County > Raleigh (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre:

Research Report > New Finding (0.47)
Research Report > Experimental Study (0.47)

Industry:

Education (0.34)
Leisure & Entertainment > Games (0.34)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

EDGE: Explaining Deep Reinforcement Learning Policies

Neural Information Processing Systems | Feb-9-2026, 02:30:13 GMT

Deep reinforcement learning has shown great success in automatic policy learning for various sequential decision-making problems, such as training AI agents to defeat professional players in sophisticated games [74, 65, 24, 37] and controlling robots to accomplish complicated tasks [33, 38].

artificial intelligence, machine learning, reinforcement learning, (18 more...)

Neural Information Processing Systems

Country: Asia > Middle East > Jordan (0.04)

Industry: Leisure & Entertainment > Games (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

FACMAC: Factored Multi-Agent Centralised Policy Gradients

Neural Information Processing Systems | Feb-9-2026, 02:15:50 GMT

However, FACMAC learns a centralised but factored critic, which combines per-agent utilities into the joint action-value function via a non-linear monotonic function, as in QMIX, a popular multi-agent Q-learning algorithm. However, unlike QMIX, there are no inherent constraints on factoring the critic. We thus also employ a nonmonotonic factorisation and empirically demonstrate that its increased representational capacity allows it to solve some tasks that cannot be solved with monolithic, or monotonically factored critics.

artificial intelligence, machine learning, reinforcement learning, (16 more...)

Neural Information Processing Systems

Country: Europe > Switzerland (0.04)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.72)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents > Agent Societies (0.47)

559726fdfb19005e368be4ce3d40e3e5-Paper-Conference.pdf

Neural Information Processing Systems | Feb-9-2026, 02:01:06 GMT

arxiv preprint arxiv, optimization, representation, (14 more...)

Neural Information Processing Systems

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Europe > Austria (0.04)

Genre: Research Report (0.68)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.95)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.93)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.68)

Optimization-Based Algebraic Multigrid Coarsening Using Reinforcement Learning

Ali Taghibakhshi, Scott MacLachlan, Luke Olson, Matthew West

Neural Information Processing Systems | Feb-9-2026, 01:59:16 GMT

graph, grid, node, (15 more...)

Neural Information Processing Systems

Country:

North America > United States > Illinois > Champaign County > Urbana (0.04)
North America > United States > New Mexico > Los Alamos County > Los Alamos (0.04)
North America > Canada > Newfoundland and Labrador > Newfoundland > St. John's (0.04)
(2 more...)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)

Contextual Bandits and Imitation Learning with Preference-Based Active Queries

Neural Information Processing Systems | Feb-9-2026, 01:45:44 GMT

We consider the problem of contextual bandits and imitation learning, where the learner lacks direct knowledge of the executed action's reward.

machine learning, natural language, reinforcement learning, (16 more...)

Neural Information Processing Systems

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
North America > United States > Washington > King County > Seattle (0.04)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
Asia > Middle East > Jordan (0.04)

Genre: Research Report (0.46)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
(2 more...)

2561721d0ca69bab22b749cfc4f48f6c-Paper-Conference.pdf

Neural Information Processing Systems | Feb-9-2026, 01:33:35 GMT

artificial intelligence, machine learning, reinforcement learning, (16 more...)

Neural Information Processing Systems

Country:

North America > United States > California > Los Angeles County > Long Beach (0.14)
North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.14)
Europe > Sweden > Stockholm > Stockholm (0.04)
(14 more...)

Genre: Research Report (0.68)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.69)

Emergent Graphical Conventions in a Visual Communication Game

Neural Information Processing Systems | Feb-9-2026, 01:32:51 GMT

Due to its iconic nature (i.e., perceptual resemblance to or natural association with the referent), drawings serve as a powerful tool to communicate concepts transcending language barriers (Fay et al., 2014). In fact, we humans started to use drawings to convey messages dating back to 40,000-60,000 years ago (Hoffmann et al., 2018; Hawkins et al., 2019).

convention, machine learning, reinforcement learning, (20 more...)

Neural Information Processing Systems

Technology: