AITopics | Reinforcement Learning

Collaborating Authors

Reinforcement Learning

"Reinforcement learning is learning what to do – how to map situations to actions – so as to maximize a numerical reward signal. The learner is not told which actions to take, as in most forms of machine learning, but instead must discover which actions yield the most reward by trying them."
– Sutton, Richard S. and Andrew G. Barto. Reinforcement Learning: An Introduction. (1.1). MIT Press, Cambridge, MA, 1998.

News Overviews Instructional Materials AI-Alerts Classics

ProvablyEfficientReinforcementLearningwith LinearFunctionApproximationunderAdaptivity Constraints

Neural Information Processing SystemsFeb-9-2026, 07:55:19 GMT

Real-world reinforcement learning (RL) applications often come with possibly infinite state and action space, and in such a situation classical RL algorithms developed in the tabular setting are not applicable anymore. A popular approach to overcoming this issue is by applying function approximation techniques to the underlying structures of the Markovdecision processes (MDPs).

artificial intelligence, machine learning, reinforcement learning, (15 more...)

Neural Information Processing Systems

Genre: Research Report > New Finding (0.47)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.72)

Add feedback

Settlingthe Varianceof Multi-Agent Policy Gradients

Neural Information Processing SystemsFeb-9-2026, 07:37:00 GMT

InProceedingsofthe First International Conferenceon Distributed Artificial Intelligence, pages 1-7, 2019.

artificial intelligence, machine learning, reinforcement learning, (14 more...)

Neural Information Processing Systems

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (0.69)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.47)

Add feedback

29906cbd165b78991da2c4dbabc2a04b-Paper-Conference.pdf

Neural Information Processing SystemsFeb-9-2026, 07:34:46 GMT

arxiv preprint arxiv, reinforcement, threshold, (12 more...)

Neural Information Processing Systems

Country:

North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)
North America > United States > Massachusetts (0.04)

Genre: Research Report > New Finding (0.67)

Industry: Energy > Power Industry (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Robots (0.94)

Add feedback

6f5e4e86a87220e5d361ad82f1ebc335-Paper.pdf

Neural Information Processing SystemsFeb-9-2026, 07:26:56 GMT

dimension, eluder dimension, function approximation, (12 more...)

Neural Information Processing Systems

Country:

Asia > Middle East > Jordan (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.91)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.49)

Add feedback

5d2781cc34f459618a9a504761043055-Paper-Conference.pdf

Neural Information Processing SystemsFeb-9-2026, 07:06:54 GMT

Importantly, E2 is bounded independently ofS.

artificial intelligence, machine learning, reinforcement learning, (16 more...)

Neural Information Processing Systems

Country:

North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
North America > Canada > Quebec > Montreal (0.04)
Europe > France > Auvergne-Rhône-Alpes > Isère > Grenoble (0.04)
Asia > Singapore (0.04)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (0.70)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.49)

Add feedback

89b9e0a6f6d1505fe13dea0f18a2dcfa-Paper.pdf

Neural Information Processing SystemsFeb-9-2026, 06:56:56 GMT

Weevaluate PI-SAC agents by comparing against uncompressed PI-SAC agents, other compressed and uncompressed agents, and SAC agents directly trained from pixels.

artificial intelligence, machine learning, reinforcement learning, (17 more...)

Neural Information Processing Systems

Country:

North America > United States > Illinois > Cook County > Chicago (0.04)
North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.04)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.70)

Add feedback

8977ecbb8cb82d77fb091c7a7f186163-Paper.pdf

Neural Information Processing SystemsFeb-9-2026, 06:48:35 GMT

agent, arxiv preprint arxiv, credit assignment, (12 more...)

Neural Information Processing Systems

Country:

Asia > Middle East > Jordan (0.04)
North America > Canada (0.04)

Industry: Transportation (0.68)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.93)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents > Agent Societies (0.82)

Add feedback

OnlineDecisionBasedVisualTrackingvia ReinforcementLearning

Neural Information Processing SystemsFeb-9-2026, 06:34:04 GMT

A deep visual tracker is typically based on either object detection or template matching while each of them is only suitable for a particular group of scenes. It is straightforward to consider fusing them together to pursue more reliable tracking. However, this is not wise as they follow different tracking principles.

machine learning, reinforcement learning, tracker, (14 more...)

Neural Information Processing Systems

Country:

North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.04)
Asia > China > Shandong Province > Jinan (0.04)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.83)

Add feedback

884d247c6f65a96a7da4d1105d584ddd-Paper.pdf

Neural Information Processing SystemsFeb-9-2026, 06:33:31 GMT

DDPG [24]extends Q-learning to continuous control based on the Deterministic Policy Gradient [31] algorithm, which learns a deterministic policyπ(s;φ) parameterized byφto maximize the Q-function to approximate themaxoperator.

artificial intelligence, machine learning, reinforcement learning, (20 more...)

Neural Information Processing Systems

Country: