Selective Reviews of Bandit Problems in AI via a Statistical View

Zhou, Pengjie, Wei, Haoyu, Zhang, Huiming

Dec-3-2024–arXiv.org Machine Learning

Introduction

Reinforcement learning (RL) is one of the most prominent and widely discussed methods in artificial intelligence. It studies how an agent learns to make decisions by interacting with an environment so as to maximize cumulative reward [1]. RL has been applied extensively in domains including autonomous driving [2], recommendation systems [3], unmanned aerial vehicles (UAVs) [4], financial trading [5], causal inference [6], and precision medicine [7,8]; see [9,10] for reviews. A classic, simplified RL problem is the stochastic bandit problem. It exemplifies the exploration-exploitation tradeoff: an agent must choose between exploring new options to gather more information and exploiting known options to maximize rewards. Existing reviews of stochastic bandit algorithms highlight applications in areas such as recommendation systems [11-13], experimental design [14], precision medicine [8], and causal inference [15]. Efficient bandit algorithms are designed from a statistical perspective, yet this perspective remains underexplored in existing reviews. This paper aims to address this gap by focusing on the probabilistic and statistical foundations of stochastic bandit algorithms, with particular emphasis on concentration inequalities, minimax rates of regret upper bounds, small-sample statistical inference, linear models, Bayesian optimization, statistical learning theory, design of experiments, the Neyman-Rubin causal model, functional data analysis, robust statistics, and information theory, among other topics.
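To make the exploration-exploitation tradeoff described above concrete, the following is a minimal sketch of an epsilon-greedy strategy on a Bernoulli bandit. Epsilon-greedy is chosen here only for illustration and is not named in the text above; the function name `epsilon_greedy`, the arm means, and the parameter values are all assumptions.

```python
import random


def epsilon_greedy(true_means, horizon, epsilon=0.1, seed=0):
    """Simulate epsilon-greedy on a Bernoulli bandit (illustrative sketch).

    true_means: hypothetical success probability of each arm (unknown to the agent).
    Returns per-arm pull counts, per-arm reward estimates, and realized regret.
    """
    rng = random.Random(seed)
    n_arms = len(true_means)
    counts = [0] * n_arms        # number of pulls per arm
    estimates = [0.0] * n_arms   # running mean reward per arm
    total_reward = 0.0

    for _ in range(horizon):
        if rng.random() < epsilon:
            # Explore: pick an arm uniformly at random.
            arm = rng.randrange(n_arms)
        else:
            # Exploit: pick the arm with the highest estimated mean.
            arm = max(range(n_arms), key=lambda a: estimates[a])

        # Draw a Bernoulli reward from the chosen arm.
        reward = 1.0 if rng.random() < true_means[arm] else 0.0

        # Incremental update of the running mean for that arm.
        counts[arm] += 1
        estimates[arm] += (reward - estimates[arm]) / counts[arm]
        total_reward += reward

    # Realized regret: what the best arm earns in expectation minus what we earned.
    regret = horizon * max(true_means) - total_reward
    return counts, estimates, regret


if __name__ == "__main__":
    counts, estimates, regret = epsilon_greedy([0.2, 0.5, 0.8], horizon=5000)
    print("pulls per arm:", counts)
    print("estimated means:", [round(e, 3) for e in estimates])
    print("realized regret:", round(regret, 1))
```

With a fixed `epsilon`, a constant fraction of pulls is spent exploring, so regret in this sketch grows linearly with the horizon. Sharper regret bounds typically come from confidence-bound or posterior-sampling strategies built on concentration inequalities.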

artificial intelligence, data mining, machine learning


Country:
- Asia > China (0.14)
- North America > United States
  - California (0.28)

Genre:
- Research Report > Experimental Study (1.00)
- Overview (1.00)
- Instructional Material > Course Syllabus & Notes (0.93)

Industry:
- Health & Medicine > Pharmaceuticals & Biotechnology (0.97)
- Information Technology (0.85)
- Energy > Oil & Gas
  - Upstream (0.66)

Technology:
- Information Technology
  - Data Science > Data Mining
    - Big Data (1.00)
  - Artificial Intelligence
    - Machine Learning > Statistical Learning (1.00)
    - Representation & Reasoning > Uncertainty
      - Bayesian Inference (0.45)
