Improved Regret Bounds for Linear Adversarial MDPs via Linear Optimization

Fang Kong, Xiangcheng Zhang, Baoxiang Wang, Shuai Li

Feb-14-2023–arXiv.org Artificial Intelligence

Reinforcement learning (RL) describes the interaction between a learning agent and an unknown environment, where the agent aims to maximize its cumulative reward through trial and error [Sutton and Barto, 2018]. It has achieved great success in many real applications, such as games [Mnih et al., 2013; Silver et al., 2016], robotics [Kober et al., 2013; Lillicrap et al., 2015], autonomous driving [Kiran et al., 2021], and recommendation systems [Afsar et al., 2022; Lin et al., 2021]. The interaction in RL is commonly modeled as a Markov decision process (MDP). Most existing work studies the stochastic setting, where the reward is sampled from a fixed distribution [Azar et al., 2017; Jin et al., 2018; Simchowitz and Jamieson, 2019; Yang et al., 2021]. RL in real applications is generally more challenging than the stochastic setting, because the environment may be non-stationary and the reward function may adapt to the agent's policy. For example, a scheduling algorithm may be deployed among self-interested parties, and a recommendation algorithm may face strategic users. To design robust algorithms that work in non-stationary environments, a line of work focuses on the adversarial setting, where the reward function may be chosen arbitrarily by an adversary [Yu et al., 2009; Rosenberg and Mansour, 2019; Jin et al., 2020a; Chen et al., 2021; Luo et al., 2021a].
