AITopics | multiclass

While the optimal sample complexity of binary classification in terms of the VC dimension is well-established, determining the optimal sample complexity of multiclass classification has remained open. The appropriate complexity parameter for multiclass classification is the DS dimension, and despite significant efforts, a gap of $\sqrt{\text{DS}}$ has persisted between the upper and lower bounds on sample complexity. Recent work by Hanneke et al. (2026) shows a novel algebraic characterization of multiclass hypothesis classes in terms of their DS dimension. Building up on this, we show that the maximum hypergraph density of any multiclass hypothesis class is upper-bounded by its DS dimension. This proves a longstanding conjecture of Daniely and Shalev-Shwartz (2014). As a consequence, we determine the optimal dependence of the sample complexity on the DS dimension for multiclass as well as list learning.

artificial intelligence, inductive learning, machine learning, (16 more...)

arXiv.org Machine Learning

2604.24749

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Computational Learning Theory (0.67)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.62)

Add feedback

Multiclass Boosting: Simple and Intuitive Weak Learning Criteria

Neural Information Processing SystemsApr-24-2026, 07:17:08 GMT

We study a generalization of boosting to the multiclass setting. We introduce a weak learning condition for multiclass classification that captures the original notion of weak learnability as being "slightly better than random guessing". We give a simple and efficient boosting algorithm, that does not require realizability assumptions and its sample and oracle complexity bounds are independent of the number of classes. In addition, we utilize our new boosting technique in several theoretical applications within the context of List PACLearning. First, we establish an equivalence to weak PAC learning. Furthermore, we present a new result on boosting for list learners, as well as provide a novel proof for the characterization of multiclass PAC learning and List PAC learning. Notably, our technique gives rise to a simplified analysis, and also implies an improved error bound for large list sizes, compared to previous results.

algorithm, artificial intelligence, machine learning, (17 more...)

Neural Information Processing Systems

Country: Asia (0.28)

Genre: Research Report > New Finding (0.48)

Industry: Education (0.67)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Computational Learning Theory (0.89)

Add feedback

050f8591be3874b52fdac4e1060eeb29-Paper-Conference.pdf

Neural Information Processing SystemsApr-24-2026, 07:17:04 GMT

algorithm, artificial intelligence, machine learning, (16 more...)

Neural Information Processing Systems

Country: Asia (0.28)

Industry: Education (0.67)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

Add feedback

Large Margin Discriminant Dimensionality Reduction in Prediction Space

Mohammad Saberian, Jose Costa Pereira, Nuno Nvasconcelos, Can Xu

Neural Information Processing SystemsMar-23-2026, 11:02:50 GMT

In this paper we establish a duality between boosting and SVM, and use this to derive a novel discriminant dimensionality reduction algorithm. In particular, using the multiclass formulation of boosting and SVM we note that both use a combination of mapping and linear classification to maximize the multiclass margin. In SVM this is implemented using a pre-defined mapping (induced by the kernel) and optimizing the linear classifiers. In boosting the linear classifiers are pre-defined and the mapping (predictor) is learned through a combination of weak learners. We argue that the intermediate mapping, i.e. boosting predictor, is preserving the discriminant aspects of the data and that by controlling the dimension of this mapping it is possible to obtain discriminant low dimensional representations for the data. We use the aforementioned duality and propose a new method, Large Margin Discriminant Dimensionality Reduction (LADDER) that jointly learns the mapping and the linear classifiers in an efficient manner. This leads to a data-driven mapping which can embed data into any number of dimensions. Experimental results show that this embedding can significantly improve performance on tasks such as hashing and image/scene classification.

artificial intelligence, codeword, machine learning, (17 more...)

Neural Information Processing Systems

Country: North America (0.28)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Dimensionality Reduction (0.83)

Add feedback

050f8591be3874b52fdac4e1060eeb29-Supplemental-Conference.pdf

Neural Information Processing SystemsFeb-18-2026, 21:54:26 GMT

We study a generalization of boosting to the multiclass setting. We introduce a weak learning condition for multiclass classification that captures the original notion ofweak learnability asbeing "slightly better than random guessing".

algorithm, artificial intelligence, machine learning, (17 more...)

Neural Information Processing Systems

Country:

Asia > Middle East > Israel (0.04)
North America > United States > New York > Monroe County > Rochester (0.04)
Asia > Afghanistan > Parwan Province > Charikar (0.04)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

Add feedback

050f8591be3874b52fdac4e1060eeb29-Paper-Conference.pdf

Neural Information Processing SystemsFeb-18-2026, 21:54:23 GMT

algorithm, learner, weak learner, (15 more...)

Neural Information Processing Systems

Country:

Asia > Middle East > Israel > Tel Aviv District > Tel Aviv (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
North America > United States > New York > Monroe County > Rochester (0.04)
Asia > Afghanistan > Parwan Province > Charikar (0.04)

Industry: Education (0.67)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Computational Learning Theory (0.46)

Add feedback

The Implicit Bias of Gradient Descent on Separable Multiclass Data

Neural Information Processing SystemsFeb-16-2026, 17:58:57 GMT

However, there is a noticeable gap in analysis for multiclass classification, with only a handful of results which themselves are restricted to the cross-entropy loss.

artificial intelligence, machine learning, vec, (18 more...)

Neural Information Processing Systems

Country: