Beyond Bandit Feedback in Online Multiclass Classification

Neural Information Processing Systems

We study the problem of online multiclass classification in a setting where the learner's feedback is determined by an arbitrary directed graph. While including bandit feedback as a special case, feedback graphs allow a much richer set of applications, including filtering and label efficient classification. We introduce \textproc{Gappletron}, the first online multiclass algorithm that works with arbitrary feedback graphs. For this new algorithm, we prove surrogate regret bounds that hold, both in expectation and with high probability, for a large class of surrogate losses. Our bounds are of order $B\sqrt{\rho KT}$, where $B$ is the diameter of the prediction space, $K$ is the number of classes, $T$ is the time horizon, and $\rho$ is the domination number (a graph-theoretic parameter affecting the amount of exploration). In the full information case, we show that \textproc{Gappletron} achieves a constant surrogate regret of order $B^2K$. We also prove a general lower bound of order $\max\big\{B^2K,\sqrt{T}\big\}$ showing that our upper bounds are not significantly improvable. Experiments on synthetic data show that for various feedback graphs our algorithm is competitive against known baselines.
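The domination number $\rho$ in the abstract above is the size of a smallest set of actions whose outgoing edges cover every action. A minimal sketch of how such a set might be approximated greedily (the function name and the convention that every action observes itself are assumptions for illustration, not taken from the paper):

```python
# Hypothetical sketch: a greedy approximation of a dominating set for a
# directed feedback graph. An edge (u, v) means "playing u reveals the
# loss of v"; self-observation is assumed throughout.
def greedy_dominating_set(K, edges):
    """K actions; edges is a set of directed pairs (u, v)."""
    # actions revealed by playing u (including u itself)
    out = {u: {v for (a, v) in edges if a == u} | {u} for u in range(K)}
    uncovered = set(range(K))
    dom = []
    while uncovered:
        # pick the action covering the most still-uncovered actions
        best = max(range(K), key=lambda u: len(out[u] & uncovered))
        dom.append(best)
        uncovered -= out[best]
    return dom
```

For a full-information graph (all edges present) the set has size 1, while for a pure bandit graph (self-loops only) it has size $K$, matching the intuition that $\rho$ interpolates between the two regimes.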


Exploiting the Surrogate Gap in Online Multiclass Classification

Neural Information Processing Systems

In the full information setting we provide expected mistake bounds for \textproc{Gaptron} with respect to the logistic loss, hinge loss, and the smooth hinge loss with $O(K)$ regret, where the expectation is with respect to the learner's randomness and $K$ is the number of classes. In the bandit classification setting we show that \textproc{Gaptron} is the first linear time algorithm with $O(K\sqrt{T})$ expected regret. Additionally, the expected mistake bound of \textproc{Gaptron} does not depend on the dimension of the feature vector, contrary to previous algorithms with $O(K\sqrt{T})$ regret in the bandit classification setting. We present a new proof technique that exploits the gap between the zero-one loss and surrogate losses rather than exploiting properties such as exp-concavity or mixability, which are traditionally used to prove logarithmic or constant regret bounds.
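The "gap" mentioned above is the pointwise difference between a convex surrogate and the zero-one loss it upper-bounds. A toy illustration, assuming a Crammer-Singer-style multiclass hinge (the specific surrogate definitions used by the paper may differ):

```python
# Illustrative only: the surrogate gap is the slack between the multiclass
# hinge loss and the zero-one loss at a given score vector.
import numpy as np

def zero_one_loss(scores, y):
    return float(np.argmax(scores) != y)

def multiclass_hinge(scores, y):
    # margin violation against each wrong class, clipped at zero
    margins = 1.0 + scores - scores[y]
    margins[y] = 0.0
    return float(max(margins.max(), 0.0))

scores = np.array([0.5, 0.2, -1.0])
gap = multiclass_hinge(scores, y=0) - zero_one_loss(scores, y=0)
# here the prediction is correct (zero-one loss 0) but the margin is small,
# so the hinge loss, and hence the gap, is strictly positive (0.7)
```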


Bandit and Delayed Feedback in Online Structured Prediction

Shibukawa, Yuki, Tsuchiya, Taira, Sakaue, Shinsaku, Yamanishi, Kenji

arXiv.org Machine Learning

Online structured prediction is a task of sequentially predicting outputs with complex structures based on inputs and past observations, encompassing online classification. Recent studies showed that in the full information setup, we can achieve finite bounds on the surrogate regret, i.e., the extra target loss relative to the best possible surrogate loss. In practice, however, full information feedback is often unrealistic as it requires immediate access to the whole structure of complex outputs. Motivated by this, we propose algorithms that work with less demanding forms of feedback: bandit and delayed feedback. For the bandit setting, using a standard inverse-weighted gradient estimator, we achieve a surrogate regret bound of $O(\sqrt{KT})$ for the time horizon $T$ and the size of the output set $K$. However, $K$ can be extremely large when outputs are highly complex, making this result less desirable. To address this, we propose an algorithm that achieves a surrogate regret bound of $O(T^{2/3})$, which is independent of $K$. This is enabled by a carefully designed pseudo-inverse matrix estimator. Furthermore, for the delayed full information feedback setup, we obtain a surrogate regret bound of $O(D^{2/3} T^{1/3})$ for the delay time $D$. We also provide algorithms for the delayed bandit feedback setup. Finally, we numerically evaluate the performance of the proposed algorithms in online classification with bandit feedback.
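The "standard inverse-weighted gradient estimator" mentioned above refers to the usual inverse-propensity trick: only the loss of the sampled action is observed, and dividing it by its sampling probability yields an unbiased estimate of the full loss vector. A minimal sketch (function name and interface are illustrative assumptions):

```python
# Hedged sketch of an inverse-propensity (importance-weighted) loss estimate
# under bandit feedback: in practice only losses[a] would be observed.
import numpy as np

rng = np.random.default_rng(0)

def ipw_loss_estimate(p, losses):
    """p: sampling distribution over K actions; losses: true loss vector."""
    a = rng.choice(len(p), p=p)       # sample an action from p
    est = np.zeros(len(p))
    est[a] = losses[a] / p[a]         # unbiased: E[est] equals losses
    return est
```

Unbiasedness follows because coordinate $a$ contributes $p_a \cdot \ell_a / p_a = \ell_a$ in expectation, at the cost of variance scaling with $1/p_a$, which is where the $\sqrt{K}$ factor in the regret bound comes from.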


Review for NeurIPS paper: Exploiting the Surrogate Gap in Online Multiclass Classification

Neural Information Processing Systems

Summary and Contributions: This paper considers online multiclass classification, in both the full-information and bandit settings, where the objective is to minimize regret with respect to the hinge loss (or logistic loss or smooth hinge loss) of the best linear predictor in hindsight. The loss of the online algorithm is 0/1-loss, so "bandit feedback" (only learning the loss of the action chosen) corresponds to only learning if the prediction made was correct or not. The paper presents a new algorithm that has a particularly good combination of running time and regret bound in the bandit setting. One thing I was looking for but couldn't find (maybe I missed it) is a discussion of what makes multiclass special. The gap between loss functions, e.g., as given in Figure 1, holds for binary too.


Review for NeurIPS paper: Exploiting the Surrogate Gap in Online Multiclass Classification

Neural Information Processing Systems

The paper presents a new algorithm for a problem that is relevant for the conference. The paper is unclear in some places, however. Therefore, we strongly urge you to work to make the presentation simpler (e.g., consider whether the notation can be simplified to make the paper more "broadly accessible," as suggested by reviewer #2).



The Complexity of Sequential Prediction in Dynamical Systems

Raman, Vinod, Subedi, Unique, Tewari, Ambuj

arXiv.org Artificial Intelligence

A discrete-time dynamical system is a mathematical model that describes the evolution of a system over discrete time steps. Formally, a discrete-time dynamical system is a tuple (N, X, f), where N is the set of natural numbers that denote the timesteps, X is a non-empty set called the state space, and f: X → X is a deterministic map that describes the evolution of the state. Dynamical systems have been widely used in practice due to their ability to accurately model natural phenomena. For instance, boolean networks are an important class of discrete-time, discrete-space dynamical systems with widespread applicability to genetic modeling [Kauffman, 1969, Shmulevich et al., 2002].
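The tuple (N, X, f) above can be made concrete with a tiny boolean network, the class of systems the abstract cites. Here X = {0,1}^3 and the update rules are invented purely for illustration:

```python
# A minimal discrete-time dynamical system: a 3-node boolean network.
# The logical update rules below are toy assumptions, not taken from any paper.
def f(state):
    x0, x1, x2 = state
    return (x1 and x2, x0 or x2, not x0)

def trajectory(x0, steps):
    """Iterate the map f from the initial state x0 for `steps` time steps."""
    traj = [x0]
    for _ in range(steps):
        traj.append(f(traj[-1]))
    return traj
```

Since X is finite and f is deterministic, every trajectory must eventually enter a cycle, which is exactly the kind of long-run behavior sequential prediction in dynamical systems has to contend with.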