AITopics | Reinforcement Learning

Collaborating Authors

Reinforcement Learning

"Reinforcement learning is learning what to do – how to map situations to actions – so as to maximize a numerical reward signal. The learner is not told which actions to take, as in most forms of machine learning, but instead must discover which actions yield the most reward by trying them."
– Sutton, Richard S. and Andrew G. Barto. Reinforcement Learning: An Introduction. (1.1). MIT Press, Cambridge, MA, 1998.

News Overviews Instructional Materials AI-Alerts Classics

SPQR: Controlling Q-ensemble Independence with Spiked Random Model for Reinforcement Learning

Neural Information Processing SystemsFeb-17-2026, 04:23:54 GMT

In order to overcome overestimation bias, ensemble methods for Q-learning have been investigated to exploit the diversity of multiple Q-functions. Since network initialization has been the predominant approach to promote diversity in Q-functions, heuristically designed diversity injection methods have been studied in the literature. However, previous studies have not attempted to approach guaranteed independence over an ensemble from a theoretical perspective.

artificial intelligence, machine learning, reinforcement learning, (17 more...)

Neural Information Processing Systems

Country:

North America > United States > California > Alameda County > Berkeley (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Asia > South Korea > Seoul > Seoul (0.04)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.93)

Add feedback

a439259e78294c38d157a51a2c40486b-Paper-Conference.pdf

Neural Information Processing SystemsFeb-17-2026, 04:22:13 GMT

machine learning, natural language, reinforcement learning, (17 more...)

Neural Information Processing Systems

Country: North America > United States > South Carolina (0.04)

Genre: Research Report > Experimental Study (1.00)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.95)
(3 more...)

Add feedback

Diffusion Model is an Effective Planner and Data Synthesizer for Multi-Task Reinforcement Learning

Neural Information Processing SystemsFeb-17-2026, 03:58:37 GMT

The high-capacity generative models trained on large, diverse datasets have demonstrated remarkable success across vision and language tasks.

large language model, machine learning, mtd iff, (16 more...)

Neural Information Processing Systems

Country:

Asia > China > Shanghai > Shanghai (0.04)
Europe > Germany > Bavaria > Upper Bavaria > Munich (0.04)
Europe > France > Hauts-de-France > Nord > Lille (0.04)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Reward Imputation with Sketching for Contextual Batched Bandits

Neural Information Processing SystemsFeb-17-2026, 03:33:16 GMT

Contextual batched bandit (CBB) is a setting where a batch of rewards is observed from the environment at the end of each episode, but the rewards of the non-executed actions are unobserved, resulting in partial-information feedback.

data mining, machine learning, reinforcement learning, (20 more...)

Neural Information Processing Systems

Country:

Asia > China > Beijing > Beijing (0.04)
Asia > Middle East > Jordan (0.04)
Asia > China > Guangdong Province > Shenzhen (0.04)

Industry: Health & Medicine (0.47)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.47)
Information Technology > Data Science > Data Mining > Big Data (0.46)

Add feedback

EDGI: Equivariant Diffusion for Planning with Embodied Agents Johann Brehmer Qualcomm AI Research

Neural Information Processing SystemsFeb-17-2026, 01:58:38 GMT

We integrate this model in a planning loop, where conditioning and classifier guidance let us softly break the symmetry for specific tasks as needed.

artificial intelligence, machine learning, reinforcement learning, (17 more...)

Neural Information Processing Systems

Country: North America > Canada > Quebec > Montreal (0.04)

Genre: Research Report (0.68)

Industry:

Telecommunications (0.41)
Semiconductors & Electronics (0.41)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.68)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (0.50)
(2 more...)

Add feedback

Appendix A Control algorithm The action-value function can be decomposed into two components as: Q (PT) (s, a) = Q (P) (s, a) + Q (T) w

Neural Information Processing SystemsFeb-17-2026, 01:58:25 GMT

We use induction to prove this statement. The penultimate step follows from the induction hypothesis completing the proof. Then, the fixed point of Eq.(5) is the value function of in f M . We focus on permanent value function in the next two theorems. The permanent value function is updated using Eq.

artificial intelligence, machine learning, reinforcement learning, (19 more...)

Neural Information Processing Systems

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.70)

Add feedback

Prediction and Control in Continual Reinforcement Learning

Neural Information Processing SystemsFeb-17-2026, 01:58:21 GMT

Deep reinforcement learning (RL) has achieved remarkable successes in complex tasks, e.g Go [

artificial intelligence, machine learning, reinforcement learning, (17 more...)

Neural Information Processing Systems

Country:

North America > Canada > Quebec > Montreal (0.04)
North America > United States > California > Santa Clara County > San Jose (0.04)
North America > Barbados (0.04)
Europe > Germany > North Rhine-Westphalia > Upper Bavaria > Munich (0.04)

Genre:

Workflow (0.46)
Research Report > New Finding (0.45)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Add feedback

a0da690a47b2f52faa63f6fe054057b5-Paper-Conference.pdf

Neural Information Processing SystemsFeb-17-2026, 01:42:55 GMT

machine learning, natural language, reinforcement learning, (16 more...)

Neural Information Processing Systems

Country:

North America > United States > California > San Diego County > San Diego (0.04)
Asia > Middle East > Jordan (0.04)

Genre: Research Report > Experimental Study (0.93)

Industry: Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Security & Privacy (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
(2 more...)

Add feedback

Language Grounded Multi-agent Reinforcement Learning with Human-interpretable Communication

Neural Information Processing SystemsFeb-17-2026, 01:23:48 GMT

Furthermore, the learned communication protocols exhibit zero-shot generalization capabilities in ad-hoc teamwork scenarios with unseen teammates and novel task states.

large language model, machine learning, reinforcement learning, (19 more...)

Neural Information Processing Systems

Country: North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)

Genre: