AITopics | average top-k loss

average top-k loss

Learning with Average Top-k Loss

Neural Information Processing Systems, Nov-21-2025, 15:33:31 GMT

In this work, we introduce the average top-$k$ (AT$_k$) loss as a new ensemble loss for supervised learning. The AT$_k$ loss provides a natural generalization of the two widely used ensemble losses, namely the average loss and the maximum loss. Furthermore, the AT$_k$ loss combines their advantages and can alleviate their corresponding drawbacks to better adapt to different data distributions. We show that the AT$_k$ loss affords an intuitive interpretation that reduces the penalty of continuous and convex individual losses on correctly classified data. The AT$_k$ loss can lead to convex optimization problems that can be solved effectively with conventional sub-gradient-based methods. We further study the statistical learning theory of MAT$_k$ by establishing its classification calibration and statistical consistency, which provide useful insights on the practical choice of the parameter $k$. We demonstrate the applicability of MAT$_k$ learning combined with different individual loss functions for binary and multi-class classification and regression using synthetic and real datasets.
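As a rough illustration of the aggregate described in this abstract, the sketch below computes the mean of the k largest individual losses with NumPy. The function name `average_top_k_loss` and the sample loss values are made up for this example and are not taken from the paper.

```python
import numpy as np

def average_top_k_loss(losses, k):
    """Mean of the k largest individual losses (illustrative helper)."""
    losses = np.asarray(losses, dtype=float)
    n = losses.size
    if not 1 <= k <= n:
        raise ValueError("k must satisfy 1 <= k <= n")
    # np.partition puts the k largest values in the last k slots (in no particular order).
    top_k = np.partition(losses, n - k)[n - k:]
    return float(top_k.mean())

# Made-up individual losses for five training examples.
losses = [0.1, 0.0, 2.5, 0.4, 1.0]
avg_loss = average_top_k_loss(losses, k=len(losses))  # k = n recovers the average loss
max_loss = average_top_k_loss(losses, k=1)            # k = 1 recovers the maximum loss
at2_loss = average_top_k_loss(losses, k=2)            # mean of the two largest losses
```

With k between 1 and n, the value sits between the average loss and the maximum loss, which is the sense in which the abstract calls it a generalization of both.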

atk loss, average top-k loss, name change, (4 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.98)

Reviews: Learning with Average Top-k Loss

Neural Information Processing Systems, Oct-7-2024, 23:48:32 GMT

This paper investigates a new learning setting: optimizing the average of the k largest (top-k) individual losses for supervised learning. This setting differs from standard empirical risk minimization (ERM), which optimizes the average loss over the dataset. It also differs from the maximum loss (Shalev-Shwartz and Wexler 2016), which optimizes the largest individual loss. The average top-k loss optimized in this paper can be viewed as a natural generalization of both ERM and the maximum loss.
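The setting described in this review can be written compactly as follows. The notation is chosen here for illustration and is not quoted from the review: $\ell_{[i]}$ denotes the $i$-th largest of the $n$ individual losses.

```latex
L_{\text{avg-top-}k}(\ell_1, \dots, \ell_n)
  = \frac{1}{k} \sum_{i=1}^{k} \ell_{[i]},
\qquad
\ell_{[1]} \ge \ell_{[2]} \ge \dots \ge \ell_{[n]},
\quad 1 \le k \le n .
```

Setting $k = n$ gives the average loss minimized by ERM, and setting $k = 1$ gives the maximum loss $\ell_{[1]}$.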

algorithm, average top-k loss, shalev-shwartz and wexler 2016, (8 more...)

Genre:

Research Report (0.62)
Summary/Review (0.41)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

Learning with Average Top-k Loss

Fan, Yanbo, Lyu, Siwei, Ying, Yiming, Hu, Baogang

Neural Information Processing Systems, Feb-14-2020, 05:43:43 GMT

atk loss, average top-k loss, learning, (2 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.64)

A Stochastic First-Order Method for Ordered Empirical Risk Minimization

Kawaguchi, Kenji, Lu, Haihao

arXiv.org Machine Learning, Jul-9-2019

We propose a new stochastic first-order method for empirical risk minimization problems such as those that arise in machine learning. The traditional approaches, such as (mini-batch) stochastic gradient descent (SGD), utilize an unbiased gradient estimator of the empirical average loss. In contrast, we develop a computationally efficient method to construct a gradient estimator that is purposely biased toward those observations with higher current losses, and that itself is an unbiased gradient estimator of an ordered modification of the empirical average loss. On the theory side, we show that the proposed algorithm is guaranteed to converge at a sublinear rate to a global optimum for convex loss and to a critical point for non-convex loss. Furthermore, we prove a new generalization bound for the proposed algorithm. On the empirical side, we present extensive numerical experiments, in which our proposed method consistently improves the test errors compared with the standard mini-batch SGD in various models including SVM, logistic regression, and (non-convex) deep learning problems.
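The abstract does not spell out the estimator, so the following is only a minimal sketch under one assumption: within each mini-batch, the gradient is averaged over the q examples with the highest current loss rather than over the whole batch. The squared-error linear model, the names `top_q_gradient` and `ordered_sgd_step`, and the parameters `q`, `batch_size`, and `lr` are all illustrative choices, not details from the paper.

```python
import numpy as np

def top_q_gradient(w, X, y, q):
    """Gradient of the mean squared-error loss over the q highest-loss rows of (X, y)."""
    residuals = X @ w - y
    losses = 0.5 * residuals ** 2
    # Indices of the q largest current losses in this batch.
    top = np.argsort(losses)[-q:]
    # d/dw of 0.5 * (x.w - y)^2 is (x.w - y) * x; average over the selected rows only.
    return X[top].T @ residuals[top] / q

def ordered_sgd_step(w, X, y, batch_size, q, lr, rng):
    """One update: sample a mini-batch, then step along the top-q gradient (assumed scheme)."""
    batch = rng.choice(len(y), size=batch_size, replace=False)
    return w - lr * top_q_gradient(w, X[batch], y[batch], q)

# Tiny made-up regression problem.
X_demo = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
y_demo = np.array([1.0, 0.0, 3.0])
w0 = np.zeros(2)

g_top1 = top_q_gradient(w0, X_demo, y_demo, q=1)  # driven only by the worst-fit example
g_full = top_q_gradient(w0, X_demo, y_demo, q=3)  # q = batch size: ordinary mini-batch gradient
w1 = ordered_sgd_step(w0, X_demo, y_demo, batch_size=3, q=1, lr=0.1,
                      rng=np.random.default_rng(0))
```

Choosing q equal to the batch size reduces this sketch to the usual mini-batch gradient, while smaller q shifts the update toward the hardest examples in the batch.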

artificial intelligence, deep learning, machine learning, (16 more...)

arXiv:1907.04371

Genre: Research Report > New Finding (0.48)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.88)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.55)

Minimizing Close-k Aggregate Loss Improves Classification

He, Bryan, Zou, James

arXiv.org Artificial Intelligence, Nov-3-2018

In classification, the de facto method for aggregating individual losses is the average loss. When the actual metric of interest is 0-1 loss, it is common to minimize the average surrogate loss for some well-behaved (e.g. convex) surrogate. Recently, several other aggregate losses such as the maximal loss and average top-$k$ loss were proposed as alternative objectives to address shortcomings of the average loss. However, we identify common classification settings (e.g. the data is imbalanced or has too many easy or ambiguous examples) in which average, maximal and average top-$k$ all suffer from suboptimal decision boundaries, even on an infinitely large training set. To address this problem, we propose a new classification objective called the close-$k$ aggregate loss, where we adaptively minimize the loss for points close to the decision boundary. We provide theoretical guarantees for the 0-1 accuracy when we optimize close-$k$ aggregate loss. We also conduct systematic experiments across the PMLB and OpenML benchmark datasets. Close-$k$ achieves significant gains in 0-1 test accuracy, improvements of $\geq 2$% and $p<0.05$, in over 25% of the datasets compared to average, maximal and average top-$k$. In contrast, the previous aggregate losses outperformed close-$k$ in less than 2% of the datasets.
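The abstract does not give a formula for the close-$k$ loss, so the sketch below is only one possible reading of "minimize the loss for points close to the decision boundary" and should not be taken as the authors' definition. It assumes a real-valued classifier score whose sign is the predicted label, measures closeness to the boundary by the absolute score, and averages the hinge loss over the k closest points. The name `close_k_style_loss` and the sample scores are hypothetical.

```python
import numpy as np

def hinge(margins):
    """Hinge surrogate loss for margins y * f(x)."""
    return np.maximum(0.0, 1.0 - margins)

def close_k_style_loss(scores, labels, k):
    """Mean hinge loss over the k examples with the smallest |score| (illustrative reading)."""
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels, dtype=float)  # labels are assumed to be in {-1, +1}
    margins = labels * scores
    # Assumption: |score| is the distance proxy to the decision boundary at score = 0.
    closest = np.argsort(np.abs(scores))[:k]
    return float(hinge(margins[closest]).mean())

# Made-up scores: two confident points and three near the boundary.
scores = np.array([3.0, -0.2, 0.5, -4.0, 0.1])
labels = np.array([1.0, 1.0, 1.0, -1.0, -1.0])
loss_close2 = close_k_style_loss(scores, labels, k=2)
```

Under this reading, the confidently classified points (scores 3.0 and -4.0) do not contribute at all, in contrast to the average loss, which weights every example equally.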

artificial intelligence, dataset, machine learning, (15 more...)

arXiv:1811.00521

Genre: Research Report > New Finding (0.67)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
