Stochastic convex optimization with bandit feedback

Agarwal, Alekh, Foster, Dean P., Hsu, Daniel J., Kakade, Sham M., Rakhlin, Alexander

Dec-31-2011–Neural Information Processing Systems

This paper addresses the problem of minimizing a convex, Lipschitz function $f$ over a convex, compact set $X$ under a stochastic bandit feedback model. In this model, the algorithm is allowed to observe noisy realizations of the function value $f(x)$ at any query point $x \in X$. We demonstrate a generalization of the ellipsoid algorithm that incurs $O(\mathrm{poly}(d)\sqrt{T})$ regret. Since any algorithm has regret at least $\Omega(\sqrt{T})$ on this problem, our algorithm is optimal in terms of the scaling with $T$.
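To make the feedback model and the regret measure concrete, here is a minimal Python sketch. It is not the ellipsoid-based algorithm from the paper. It only simulates a noisy function-value oracle and computes cumulative regret, taken here in its usual form $\sum_{t=1}^{T} f(x_t) - T \min_{x \in X} f(x)$, for a naive baseline that queries uniformly random points. The test function $f(x) = |x - 0.3|$ on $X = [0, 1]$, the Gaussian noise with standard deviation 0.1, and all helper names are assumptions chosen for illustration.

```python
import random

def make_noisy_oracle(f, sigma, rng):
    """Return an oracle that reports f(x) plus zero-mean Gaussian noise."""
    def oracle(x):
        return f(x) + rng.gauss(0.0, sigma)
    return oracle

def cumulative_regret(f, queries, f_min):
    """Sum of f(x_t) - min_x f(x) over all queried points x_t."""
    return sum(f(x) - f_min for x in queries)

def uniform_baseline(oracle, lo, hi, T, rng):
    """Naive baseline (assumption, not the paper's method):
    query T points drawn uniformly at random from [lo, hi]."""
    queries = []
    for _ in range(T):
        x = rng.uniform(lo, hi)
        oracle(x)  # the noisy value is observed, but this baseline ignores it
        queries.append(x)
    return queries

rng = random.Random(0)

# Hypothetical test problem: convex, 1-Lipschitz, minimized at x = 0.3.
f = lambda x: abs(x - 0.3)
oracle = make_noisy_oracle(f, sigma=0.1, rng=rng)

T = 1000
queries = uniform_baseline(oracle, 0.0, 1.0, T, rng)
regret = cumulative_regret(f, queries, f_min=0.0)
```

Because this baseline never uses what it observes, its regret grows linearly in $T$. The point of the paper's result is that a method which does exploit the noisy observations can bring this down to the order of $\sqrt{T}$, up to a polynomial factor in the dimension $d$.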

artificial intelligence, data mining, machine learning
