AITopics | kernel loss

Neural Information Processing Systems http://nips.cc/

algorithm, international conference, value function, (14 more...)

Neural Information Processing Systems

Country:

Asia > Middle East > Lebanon > Beqaa Governorate > Zahlé (0.04)
Asia > Middle East > Jordan (0.04)
North America > United States > Rocky Mountains (0.04)
(4 more...)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Mathematical & Statistical Methods (0.82)

A Kernel Loss for Solving the Bellman Equation

Yihao Feng, Lihong Li, Qiang Liu

Neural Information Processing Systems, Oct-2-2025, 16:13:46 GMT

Neural Information Processing Systems http://nips.cc/

Reviews: A Kernel Loss for Solving the Bellman Equation

Neural Information Processing Systems, Jan-23-2025, 11:02:27 GMT

Originality: The derivation of the loss function is original; the resulting loss function has some close similarities with the coupled formulation of LSTD, which should be discussed. Quality: The claims seem to be accurate (I briefly verified the proofs of Theorem 3.1, Proposition 3.3, Proposition 3.4; I did not verify Theorem 3.2 and Corollary 3.5). Clarity: The paper is well-written and clear. Significance: The addressed problem is important; the insights are also useful. SUMMARY: The paper addresses the problem of designing a new loss function for RL.

bellman error, loss function, proposition 3, (16 more...)

Neural Information Processing Systems

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Mathematical & Statistical Methods (0.40)

Reviews: A Kernel Loss for Solving the Bellman Equation

Neural Information Processing Systems, Jan-23-2025, 11:02:16 GMT

There is general consensus that the idea introduced in the paper is novel and interesting. Still, I encourage the authors to read the reviewers' comments carefully and take them into account in the camera-ready version. In particular, the connection with the nested formulation of LSTD should be discussed to better frame the paper's contribution.

bellman equation, kernel loss

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning > Mathematical & Statistical Methods (0.40)

A Kernel Loss for Solving the Bellman Equation

Feng, Yihao, Li, Lihong, Liu, Qiang

Neural Information Processing Systems, Mar-19-2020, 03:03:09 GMT

Value function learning plays a central role in many state-of-the-art reinforcement learning algorithms. Many popular algorithms like Q-learning do not optimize any objective function, but are fixed-point iterations of some variant of the Bellman operator that is not necessarily a contraction. As a result, they may easily lose convergence guarantees, as can be observed in practice. In this paper, we propose a novel loss function, which can be optimized using standard gradient-based methods with guaranteed convergence. The key advantage is that its gradient can be easily approximated using sampled transitions, avoiding the need for double samples required by prior algorithms like residual gradient.

algorithm, bellman equation, kernel loss, (1 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

A Kernel Loss for Solving the Bellman Equation

Feng, Yihao, Li, Lihong, Liu, Qiang

arXiv.org Machine Learning, May-24-2019

Value function learning plays a central role in many state-of-the-art reinforcement-learning algorithms. Many popular algorithms like Q-learning do not optimize any objective function, but are fixed-point iterations of some variant of the Bellman operator that is not necessarily a contraction. As a result, they may easily lose convergence guarantees, as can be observed in practice. In this paper, we propose a novel loss function, which can be optimized using standard gradient-based methods without risking divergence. The key advantage is that its gradient can be easily approximated using sampled transitions, avoiding the need for double samples required by prior algorithms like residual gradient. Our approach may be combined with general function classes such as neural networks, using either on- or off-policy data, and is shown to work reliably and effectively in several benchmarks.
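
The abstract does not spell out the loss itself, so the following is only a minimal sketch of one way a kernel-weighted Bellman loss can be estimated from sampled transitions. It assumes a quadratic form in the temporal-difference residuals, weighted by a Gaussian (RBF) kernel between states, and averages only over pairs of distinct transitions. The function names, the kernel choice, and the bandwidth parameter are illustrative assumptions, not the authors' implementation.

```python
import numpy as np


def rbf_kernel(states, bandwidth=1.0):
    """Gaussian kernel matrix K[i, j] = exp(-||s_i - s_j||^2 / (2 h^2))."""
    diffs = states[:, None, :] - states[None, :, :]
    sq_dists = np.sum(diffs ** 2, axis=-1)
    return np.exp(-sq_dists / (2.0 * bandwidth ** 2))


def kernel_bellman_loss(value_fn, states, rewards, next_states,
                        gamma=0.9, bandwidth=1.0):
    """Pairwise (U-statistic style) estimate of a kernel-weighted Bellman loss.

    value_fn maps an (n, d) array of states to an (n,) array of values.
    Each transition i contributes one residual
        delta_i = r_i + gamma * V(s'_i) - V(s_i),
    and the loss averages delta_i * K(s_i, s_j) * delta_j over pairs i != j.
    """
    states = np.asarray(states, dtype=float)
    next_states = np.asarray(next_states, dtype=float)
    rewards = np.asarray(rewards, dtype=float)
    n = len(rewards)
    if n < 2:
        raise ValueError("need at least two transitions to form pairs")

    delta = rewards + gamma * value_fn(next_states) - value_fn(states)
    kernel = rbf_kernel(states, bandwidth)

    # Sum over all pairs, then drop the i == j terms, which would square
    # a single noisy residual.
    all_pairs = delta @ kernel @ delta
    same_pair = np.sum(delta ** 2 * np.diag(kernel))
    return float((all_pairs - same_pair) / (n * (n - 1)))
```

In this sketch every product combines residuals from two different sampled transitions, so a single next-state sample per transition is enough; no second independent next state from the same state is drawn. With a differentiable `value_fn`, the same expression could be passed to an automatic-differentiation library to obtain a gradient.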

artificial intelligence, machine learning, reinforcement learning, (13 more...)

arXiv.org Machine Learning

1905.10506

Country:

North America > United States (0.46)
North America > Canada > Alberta (0.28)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.66)

Online learning with kernel losses

Pacchiano, Aldo, Chatterji, Niladri S., Bartlett, Peter L.

arXiv.org Machine Learning, Feb-27-2018

We present a generalization of the adversarial linear bandits framework, where the underlying losses are kernel functions (with an associated reproducing kernel Hilbert space) rather than linear functions. We study a version of the exponential weights algorithm and bound its regret in this setting. Under conditions on the eigendecay of the kernel we provide a sharp characterization of the regret for this algorithm. When we have polynomial eigendecay $\mu_j \le \mathcal{O}(j^{-\beta})$, we find that the regret is bounded by $\mathcal{R}_n \le \mathcal{O}(n^{\beta/(2(\beta-1))})$; while under the assumption of exponential eigendecay $\mu_j \le \mathcal{O}(e^{-\beta j })$, we get an even tighter bound on the regret $\mathcal{R}_n \le \mathcal{O}(n^{1/2}\log(n)^{1/2})$. We also study the full information setting when the underlying losses are kernel functions and present an adapted exponential weights algorithm and a conditional gradient descent algorithm.
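
To make the exponential weights idea concrete, here is a deliberately simplified sketch. It assumes a finite grid of actions, full-information feedback, and Gaussian kernel losses with values in (0, 1]. The continuous-action bandit setting analyzed in the abstract is not reproduced, and all names (`kernel_losses`, `exponential_weights`, `regret`) are hypothetical.

```python
import numpy as np


def kernel_losses(actions, centers, bandwidth=0.5):
    """Loss matrix L[t, a] = exp(-(center_t - action_a)^2 / (2 h^2))."""
    diffs = centers[:, None] - actions[None, :]
    return np.exp(-diffs ** 2 / (2.0 * bandwidth ** 2))


def exponential_weights(loss_matrix, eta):
    """Return the distribution played in each round, shape (T, A).

    The round-t distribution is proportional to
    exp(-eta * cumulative loss observed before round t).
    """
    n_rounds, n_actions = loss_matrix.shape
    cumulative = np.zeros(n_actions)
    probs = np.empty((n_rounds, n_actions))
    for t in range(n_rounds):
        logits = -eta * cumulative
        logits -= logits.max()  # shift for numerical stability
        weights = np.exp(logits)
        probs[t] = weights / weights.sum()
        cumulative += loss_matrix[t]  # full information: all losses observed
    return probs


def regret(loss_matrix, probs):
    """Expected cumulative loss minus that of the best fixed action."""
    expected = np.sum(probs * loss_matrix)
    best_fixed = loss_matrix.sum(axis=0).min()
    return float(expected - best_fixed)
```

For losses in [0, 1] and a fixed learning rate `eta`, the standard finite-action analysis bounds the regret of this update by `log(A) / eta + eta * T / 8`, where `A` is the number of actions and `T` the number of rounds. The bounds quoted in the abstract concern the harder kernel setting and depend on the eigendecay of the kernel.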

algorithm, artificial intelligence, machine learning, (18 more...)

arXiv.org Machine Learning

1802.09732

Country: