The Value of Reward Lookahead in Reinforcement Learning

Mar-21-2026, 17:36:22 GMT–Neural Information Processing Systems

In reinforcement learning (RL), agents sequentially interact with changing environments while aiming to maximize the obtained rewards. Usually, rewards are observed only after acting, and so the goal is to maximize the cumulative reward. Yet, in many practical settings, reward information is observed in advance: prices are observed before performing transactions; nearby traffic information is partially known; and goals are oftentimes given to agents prior to the interaction. In this work, we aim to quantifiably analyze the value of such future reward information through the lens of competitive analysis.

artificial intelligence, proceedings, reinforcement learning

Technology:
- Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.67)