Minimax Rate-Optimal Algorithms for High-Dimensional Stochastic Linear Bandits

May-26-2025–arXiv.org Machine Learning

We study the stochastic linear bandit problem with multiple arms over $T$ rounds, where the covariate dimension $d$ may exceed $T$, but each arm-specific parameter vector is $s$-sparse. We begin by analyzing the sequential estimation problem in the single-arm setting, focusing on cumulative mean-squared error. We show that Lasso estimators are provably suboptimal in the sequential setting, exhibiting suboptimal dependence on $d$ and $T$, whereas thresholded Lasso estimators -- obtained by applying least squares to the support selected by thresholding an initial Lasso estimator -- achieve the minimax rate. Building on these insights, we consider the full linear contextual bandit problem and propose a three-stage arm selection algorithm that uses thresholded Lasso as the main estimation method. We derive an upper bound on the cumulative regret of order $s(\log s)(\log d + \log T)$, and establish a matching lower bound up to a $\log s$ factor, thereby characterizing the minimax regret rate up to a logarithmic term in $s$. Moreover, when a short initial period is excluded from the regret, the proposed algorithm achieves exact minimax optimality.

artificial intelligence, data mining, machine learning, (20 more...)

arXiv.org Machine Learning

May-26-2025

arXiv.org PDF

Add feedback

Country:
- Europe > United Kingdom
  - Scotland > City of Edinburgh
    - Edinburgh (0.04)
  - England > Cambridgeshire
    - Cambridge (0.04)

Genre:
- Research Report > New Finding (0.45)

Technology:
- Information Technology
  - Data Science > Data Mining
    - Big Data (0.86)
  - Artificial Intelligence
    - Representation & Reasoning > Search (1.00)
    - Machine Learning > Statistical Learning (0.92)

Duplicate Docs Excel Report

Title
None found

Similar Docs Excel Report more

Title	Similarity	Source
None found