AITopics | surrogate gap

Collaborating Authors

surrogate gap

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

6ce8d8f3b038f737cefcdafcf3752452-Supplemental.pdf

Neural Information Processing SystemsFeb-8-2026, 19:30:04 GMT

lemma 1, log 2, surrogate gap, (13 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.82)

Add feedback

Exploiting the Surrogate Gap in Online Multiclass Classification

Neural Information Processing SystemsDec-24-2025, 03:56:48 GMT

In the full information setting we provide expected mistake bounds for \textproc{Gaptron} with respect to the logistic loss, hinge loss, and the smooth hinge loss with $O(K)$ regret, where the expectation is with respect to the learner's randomness and $K$ is the number of classes. In the bandit classification setting we show that \textproc{Gaptron} is the first linear time algorithm with $O(K\sqrt{T})$ expected regret. Additionally, the expected mistake bound of \textproc{Gaptron} does not depend on the dimension of the feature vector, contrary to previous algorithms with $O(K\sqrt{T})$ regret in the bandit classification setting. We present a new proof technique that exploits the gap between the zero-one loss and surrogate losses rather than exploiting properties such as exp-concavity or mixability, which are traditionally used to prove logarithmic or constant regret bounds.

name change, online multiclass classification, surrogate gap, (8 more...)

Neural Information Processing Systems

Industry: Education > Educational Setting > Online (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

Add feedback

Non-Stationary Online Structured Prediction with Surrogate Losses

Sakaue, Shinsaku, Bao, Han, Cao, Yuzhou

arXiv.org Artificial IntelligenceOct-9-2025

Online structured prediction, including online classification as a special case, is the task of sequentially predicting labels from input features. Therein the surrogate regret -- the cumulative excess of the target loss (e.g., 0-1 loss) over the surrogate loss (e.g., logistic loss) of the fixed best estimator -- has gained attention, particularly because it often admits a finite bound independent of the time horizon $T$. However, such guarantees break down in non-stationary environments, where every fixed estimator may incur the surrogate loss growing linearly with $T$. We address this by proving a bound of the form $F_T + C(1 + P_T)$ on the cumulative target loss, where $F_T$ is the cumulative surrogate loss of any comparator sequence, $P_T$ is its path length, and $C > 0$ is some constant. This bound depends on $T$ only through $F_T$ and $P_T$, often yielding much stronger guarantees in non-stationary environments. Our core idea is to synthesize the dynamic regret bound of the online gradient descent (OGD) with the technique of exploiting the surrogate gap. Our analysis also sheds light on a new Polyak-style learning rate for OGD, which systematically offers target-loss guarantees and exhibits promising empirical performance. We further extend our approach to a broader class of problems via the convolutional Fenchel--Young loss. Finally, we prove a lower bound showing that the dependence on $F_T$ and $P_T$ is tight.

artificial intelligence, inductive learning, machine learning, (18 more...)

arXiv.org Artificial Intelligence

2510.07086

Country: Asia (0.28)

Genre: Research Report (1.00)

Industry: Education > Educational Setting > Online (0.88)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Supervised Learning (0.72)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.72)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.66)

Add feedback

6ce8d8f3b038f737cefcdafcf3752452-Supplemental.pdf

Neural Information Processing SystemsOct-3-2025, 04:26:33 GMT

artificial intelligence, log 2, machine learning, (15 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.46)

Add feedback

Review for NeurIPS paper: Exploiting the Surrogate Gap in Online Multiclass Classification

Neural Information Processing SystemsJan-25-2025, 11:12:15 GMT

Summary and Contributions: This paper considers online multiclass classification, in both the full-information and bandit settings, where the objective is to minimize regret with respect to the hinge loss (or logistic loss or smooth hinge loss) of the best linear predictor in hindsight. The loss of the online algorithm is 0/1-loss, so "bandit feedback" (only learning the loss of the action chosen) corresponds to only learning if the prediction made was correct or not. The paper presents a new algorithm that has a particularly good combination of running time and regret bound in the bandit setting. One thing I was looking for but couldn't find (maybe I missed it) is a discussion of what makes multiclass special. The gap between loss functions, e.g., as given in Figure 1, holds for binary too.

algorithm, neurips paper, online multiclass classification, (4 more...)

Neural Information Processing Systems

Industry: Education > Educational Setting > Online (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

Add feedback

Review for NeurIPS paper: Exploiting the Surrogate Gap in Online Multiclass Classification

Neural Information Processing SystemsJan-25-2025, 11:12:08 GMT

The paper presents a new algorithm for a problem that is relevant for the conference. The paper is unclear in some places, however. Therefore, we strongly urge you to work to make the presentation simpler (e.g., is it possible to simplify the notation in some way to make the paper more "broadly accessible" as suggested by reviewer #2).

neurips paper, online multiclass classification, surrogate gap, (1 more...)

Neural Information Processing Systems

Industry: Education > Educational Setting > Online (0.85)

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.40)

Add feedback

Meta Curvature-Aware Minimization for Domain Generalization

Chen, Ziyang, Ye, Yiwen, Tang, Feilong, Pan, Yongsheng, Xia, Yong

arXiv.org Artificial IntelligenceDec-16-2024

Domain generalization (DG) aims to enhance the ability of models trained on source domains to generalize effectively to unseen domains. Recently, Sharpness-Aware Minimization (SAM) has shown promise in this area by reducing the sharpness of the loss landscape to obtain more generalized models. However, SAM and its variants sometimes fail to guide the model toward a flat minimum, and their training processes exhibit limitations, hindering further improvements in model generalization. In this paper, we first propose an improved model training process aimed at encouraging the model to converge to a flat minima. To achieve this, we design a curvature metric that has a minimal effect when the model is far from convergence but becomes increasingly influential in indicating the curvature of the minima as the model approaches a local minimum. Then we derive a novel algorithm from this metric, called Meta Curvature-Aware Minimization (MeCAM), to minimize the curvature around the local minima. Specifically, the optimization objective of MeCAM simultaneously minimizes the regular training loss, the surrogate gap of SAM, and the surrogate gap of meta-learning. We provide theoretical analysis on MeCAM's generalization error and convergence rate, and demonstrate its superiority over existing DG methods through extensive experiments on five benchmark DG datasets, including PACS, VLCS, OfficeHome, TerraIncognita, and DomainNet. Code will be available on GitHub.

artificial intelligence, generalization, machine learning, (17 more...)

arXiv.org Artificial Intelligence

2412.11542

Country:

Oceania > Australia (0.04)
Asia > China > Zhejiang Province > Ningbo (0.04)
Asia > China > Jiangsu Province > Yancheng (0.04)
Asia > China > Guangdong Province > Shenzhen (0.04)

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.92)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.68)

Add feedback

Exploiting the Surrogate Gap in Online Multiclass Classification

Neural Information Processing SystemsOct-10-2024, 11:22:48 GMT

In the full information setting we provide expected mistake bounds for \textproc{Gaptron} with respect to the logistic loss, hinge loss, and the smooth hinge loss with O(K) regret, where the expectation is with respect to the learner's randomness and K is the number of classes. In the bandit classification setting we show that \textproc{Gaptron} is the first linear time algorithm with O(K\sqrt{T}) expected regret. Additionally, the expected mistake bound of \textproc{Gaptron} does not depend on the dimension of the feature vector, contrary to previous algorithms with O(K\sqrt{T}) regret in the bandit classification setting. We present a new proof technique that exploits the gap between the zero-one loss and surrogate losses rather than exploiting properties such as exp-concavity or mixability, which are traditionally used to prove logarithmic or constant regret bounds.

gaptron, online multiclass classification, textproc, (5 more...)

Neural Information Processing Systems

Industry: Education > Educational Setting > Online (0.97)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

Add feedback

Online Structured Prediction with Fenchel--Young Losses and Improved Surrogate Regret for Online Multiclass Classification with Logistic Loss

Sakaue, Shinsaku, Bao, Han, Tsuchiya, Taira, Oki, Taihei

arXiv.org Artificial IntelligenceFeb-12-2024

This paper studies online structured prediction with full-information feedback. For online multiclass classification, van der Hoeven (2020) has obtained surrogate regret bounds independent of the time horizon, or \emph{finite}, by introducing an elegant \emph{exploit-the-surrogate-gap} framework. However, this framework has been limited to multiclass classification primarily because it relies on a classification-specific procedure for converting estimated scores to outputs. We extend the exploit-the-surrogate-gap framework to online structured prediction with \emph{Fenchel--Young losses}, a large family of surrogate losses including the logistic loss for multiclass classification, obtaining finite surrogate regret bounds in various structured prediction problems. To this end, we propose and analyze \emph{randomized decoding}, which converts estimated scores to general structured outputs. Moreover, by applying our decoding to online multiclass classification with the logistic loss, we obtain a surrogate regret bound of $O(B^2)$, where $B$ is the $\ell_2$-diameter of the domain. This bound is tight up to logarithmic factors and improves the previous bound of $O(dB^2)$ due to van der Hoeven (2020) by a factor of $d$, the number of classes.

classification, surrogate regret, target loss, (15 more...)

arXiv.org Artificial Intelligence

2402.0818

Country: