AITopics | Reinforcement Learning

The Value Equivalence Principle for Model-Based Reinforcement Learning

Neural Information Processing Systems, Feb-8-2026, 03:45:48 GMT

Learning models of the environment from data is often viewed as an essential component to building intelligent reinforcement learning (RL) agents. The common practice is to separate the learning of the model from its use, by constructing a model of the environment's dynamics that correctly predicts the observed state transitions. In this paper we argue that the limited representational resources of model-based RL agents are better used to build models that are directly useful for value-based planning.

artificial intelligence, machine learning, reinforcement learning, (12 more...)

Neural Information Processing Systems

Country:

North America > United States > Michigan (0.04)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
North America > Canada (0.04)

Genre: Research Report (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

Provably Efficient Algorithm for Nonstationary Low-Rank MDPs

Neural Information Processing Systems, Feb-8-2026, 03:45:34 GMT

However, in practice, the environment is typically time-varying and nonstationary.

data mining, machine learning, reinforcement learning, (17 more...)

Neural Information Processing Systems

Country:

North America > United States > Pennsylvania (0.04)
North America > United States > Ohio (0.04)
Asia > Singapore (0.04)

Technology:

Information Technology > Data Science > Data Mining (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models (0.67)
Information Technology > Artificial Intelligence > Robots (0.67)
(2 more...)

145c28cd4b1df9b426990fd68045f4f7-Paper-Conference.pdf

Neural Information Processing Systems, Feb-8-2026, 03:45:31 GMT

equation, transition kernel, variation budget, (12 more...)

Neural Information Processing Systems

Country:

North America > United States > Pennsylvania (0.04)
North America > United States > Ohio (0.04)
Asia > Singapore (0.04)

Technology:

Information Technology > Data Science > Data Mining (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models (0.67)
Information Technology > Artificial Intelligence > Robots (0.67)
(2 more...)

3b2acfe2e38102074656ed938abf4ac3-Supplemental.pdf

Neural Information Processing Systems, Feb-8-2026, 03:45:25 GMT

algorithm, gradient method, stochastic game, (11 more...)

Neural Information Processing Systems

Country:

North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
Asia > Middle East > Jordan (0.04)
North America > Canada (0.04)

Genre:

Overview (0.67)
Research Report (0.46)

Industry: Leisure & Entertainment > Games (1.00)

Technology:

Information Technology > Game Theory (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

Independent Policy Gradient Methods for Competitive Reinforcement Learning

Neural Information Processing Systems, Feb-8-2026, 03:45:17 GMT

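Minimax value. Shapley [63] showed that for any game $G$, there exists $(\pi_1^\star, \pi_2^\star)$ such that $V(\pi_1^\star, \pi_2) \le V(\pi_1^\star, \pi_2^\star) \le V(\pi_1, \pi_2^\star)$ for all $\pi_1, \pi_2$, (1) and in particular $V^\star = \min_{\pi_1} \max_{\pi_2} V(\pi_1, \pi_2) = \max_{\pi_2} \min_{\pi_1} V(\pi_1, \pi_2)$. ... The crux ... x-player ... timescale than the y-player, ... the y-player ... Compared ... [43], which establishes ... y-player ... gradient dominance ... y-player ... of the gradient ... then average using ... Is Q-learning provably efficient? In Advances in Neural Information Processing Systems, pages 4863-4873, 2018.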

machine learning, machine learning research, reinforcement learning, (11 more...)

Neural Information Processing Systems

Country:

North America > United States > Massachusetts > Middlesex County > Cambridge (0.05)
North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.04)
Asia > Middle East > Jordan (0.04)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.65)

2e0f5561c1553a97cee5fa64575358c9-Paper-Conference.pdf

Neural Information Processing Systems, Feb-8-2026, 03:17:00 GMT

international conference, uncertainty parameter, worst-case performance, (15 more...)

Neural Information Processing Systems

Country:

North America > United States > California > Los Angeles County > Long Beach (0.14)
Europe > Switzerland > Zürich > Zürich (0.14)
Asia > Japan > Honshū > Kantō > Ibaraki Prefecture > Tsukuba (0.05)
(22 more...)

Genre: Research Report (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Robots (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.48)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.46)

119b45b5c2020d6bc9bca1e42826a2b3-Paper-Conference.pdf

Neural Information Processing Systems, Feb-8-2026, 02:47:55 GMT

Despite its potential, offline RL faces two significant challenges that impact its performance.

machine learning, pas, reinforcement learning, (18 more...)

Neural Information Processing Systems

Country:

North America > United States > New York > Erie County > Buffalo (0.04)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
North America > United States > Florida > Orange County > Orlando (0.04)

Genre: Research Report (0.67)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.47)

385822e359afa26d52b5b286226f2cea-Paper.pdf

Neural Information Processing Systems, Feb-8-2026, 02:36:58 GMT

In contrast, classical graphical methods like A* search are able to solve long-horizon tasks, but assume that the state space is abstracted away from raw sensory input. Recent works have attempted to combine the strengths of deep learning and classical planning; however, dominant methods in this domain are still quite brittle and scale poorly with the size of the environment.

artificial intelligence, machine learning, reinforcement learning, (20 more...)

Neural Information Processing Systems

Country: