Online Bandit Learning for a Special Class of Non-Convex Losses

Zhang, Lijun (Nanjing University) | Yang, Tianbao (The University of Iowa) | Jin, Rong (Michigan State University) | Zhou, Zhi-Hua (Nanjing University)

Mar-6-2015–AAAI Conferences

In online bandit learning, the learner aims to minimize a sequence of losses, while only observing the value of each loss at a single point. Although various algorithms and theories have been developed for online bandit learning, most of them are limited to convex losses. In this paper, we investigate the problem of online bandit learning with non-convex losses, and develop an efficient algorithm with formal theoretical guarantees. To be specific, we consider a class of losses which is a composition of a non-increasing scalar function and a linear function. This setting models a wide range of supervised learning applications such as online classification with a non-convex loss. Theoretical analysis shows that our algorithm achieves an O(poly(d) T^{2/3}) regret bound when the variation of the loss function is small. To the best of our knowledge, this is the first work in online bandit learning that does not rely on convexity.
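To make the setting concrete, the following is a minimal sketch of the loss class and the bandit feedback protocol described in the abstract. It is not the paper's algorithm: the ramp loss is only one assumed example of a non-increasing, non-convex scalar function, and the names `ramp_loss`, `make_round_loss`, and `play_random_learner` are hypothetical. The learner here picks points at random purely to show what feedback is available.

```python
import random


def ramp_loss(z):
    """Assumed example of a non-increasing, non-convex scalar loss.

    Equals 1 for z <= 0, 0 for z >= 1, and 1 - z in between.
    """
    return min(1.0, max(0.0, 1.0 - z))


def make_round_loss(u):
    """Build f(w) = ramp_loss(<u, w>): a scalar function composed with a linear one."""
    def f(w):
        return ramp_loss(sum(ui * wi for ui, wi in zip(u, w)))
    return f


def play_random_learner(d=5, T=100, seed=0):
    """Simulate T rounds of bandit feedback with a placeholder random learner.

    In each round the learner submits one point w_t and observes only the
    scalar f_t(w_t); it never sees the example, the label, or a gradient.
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(T):
        # Hypothetical classification example (x, y); u = y * x defines the linear part.
        x = [rng.gauss(0.0, 1.0) for _ in range(d)]
        y = 1.0 if x[0] >= 0 else -1.0
        f_t = make_round_loss([y * xi for xi in x])
        # Placeholder decision: a random point, standing in for a real update rule.
        w_t = [rng.gauss(0.0, 1.0) for _ in range(d)]
        total += f_t(w_t)  # the single observed loss value for this round
    return total
```

Calling `play_random_learner()` returns the cumulative loss of the placeholder learner. Regret would compare that total against the cumulative loss of the best fixed point in hindsight, which this sketch does not compute.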

algorithm, artificial intelligence, machine learning
