Beyond Value-Function Gaps: Improved Instance-Dependent Regret Bounds for Episodic Reinforcement Learning

Nov-21-2025, 14:07:58 GMT–Neural Information Processing Systems

The environment and an agent's interactions are typically modeled as a Markov
