Near-Optimal Dynamic Regret for Adversarial Linear Mixture MDPs

Feb-15-2026, 11:55:51 GMT–Neural Information Processing Systems

The interaction is usually modeled as a Markov Decision Process (MDP). Research on MDPs can be broadly divided into two lines based on the reward generation mechanism. The first line of work [Jaksch et al., 2010; Azar et al., 2013, 2017; He et al., 2021] considers the

dynamic regret, machine learning, reinforcement learning

Country:
- Europe > United Kingdom
  - England > Cambridgeshire > Cambridge (0.04)
- Asia
  - Middle East > Jordan (0.04)
  - China > Jiangsu Province
    - Nanjing (0.04)

Genre:
- Research Report > Experimental Study (0.93)

Industry:
- Education (0.46)

Technology:
- Information Technology > Artificial Intelligence
  - Representation & Reasoning (1.00)
  - Machine Learning
    - Reinforcement Learning (0.47)
    - Learning Graphical Models > Undirected Networks
      - Markov Models (0.34)
