 margin principle




Ranking with Large Margin Principle: Two Approaches

Neural Information Processing Systems

We discuss the problem of ranking k instances with the use of a "large margin" principle. We introduce two main approaches: the first is the "fixed margin" policy in which the margin of the closest neighboring classes is being maximized - which turns out to be a direct generalization of SVM to ranking learning. The second approach allows for k - 1 different margins where the sum of margins is maximized. This approach is shown to reduce to ν-SVM when the number of classes k = 2. Both approaches are optimal in size of 2l where l is the total number of training examples. Experiments performed on visual classification and "collaborative filtering" show that both approaches outperform existing ordinal regression algorithms applied for ranking and multi-class SVM applied to general multi-class classification.


Margin-Based Regularization and Selective Sampling in Deep Neural Networks

arXiv.org Artificial Intelligence

We derive a new margin-based regularization formulation, termed multi-margin regularization (MMR), for deep neural networks (DNNs). The MMR is inspired by principles that were applied in margin analysis of shallow linear classifiers, e.g., support vector machine (SVM). Unlike SVM, MMR is continuously scaled by the radius of the bounding sphere (i.e., the maximal norm of the feature vector in the data), which is constantly changing during training. We empirically demonstrate that by a simple supplement to the loss function, our method achieves better results on various classification tasks across domains. Using the same concept, we also derive a selective sampling scheme and demonstrate accelerated training of DNNs by selecting samples according to a minimal margin score (MMS). This score measures the minimal amount of displacement an input should undergo until its predicted classification is switched. We evaluate our proposed methods on three image classification tasks and six language text classification tasks. Specifically, we show improved empirical results on CIFAR10, CIFAR100 and ImageNet using state-of-the-art convolutional neural networks (CNNs) and BERT-BASE architecture for the MNLI, QQP, QNLI, MRPC, SST-2 and RTE benchmarks.
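
The abstract describes the minimal margin score as the smallest displacement an input needs before its predicted class switches. The sketch below illustrates that idea for a plain linear classifier only, where the displacement has a closed form: the score gap to a competing class divided by the norm of the difference of the two weight vectors. The function names, the linear-layer setting, and the rule of keeping the lowest-scoring samples are assumptions made for illustration and are not taken from the paper.

```python
import math


def minimal_margin_score(features, weights, biases):
    """Smallest distance `features` must move before a linear classifier
    (one weight vector and bias per class) changes its prediction.
    Illustrative assumption: the classifier is a single linear layer."""
    scores = [sum(w_i * x_i for w_i, x_i in zip(w, features)) + b
              for w, b in zip(weights, biases)]
    top = max(range(len(scores)), key=scores.__getitem__)
    best = math.inf
    for c in range(len(scores)):
        if c == top:
            continue
        diff = [a - b for a, b in zip(weights[top], weights[c])]
        norm = math.sqrt(sum(d * d for d in diff))
        if norm == 0.0:
            continue  # identical weight vectors: no boundary to cross
        best = min(best, (scores[top] - scores[c]) / norm)
    return best


def select_smallest_margin(batch, weights, biases, n):
    """Indices of the n samples closest to a decision boundary
    (hypothetical selection rule, smallest score first)."""
    scored = [(minimal_margin_score(x, weights, biases), i)
              for i, x in enumerate(batch)]
    return [i for _, i in sorted(scored)[:n]]
```

A sample that sits close to a boundary gets a small score, so a selection rule of this kind favors the examples the classifier is least sure about.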


Fast Polynomial Kernel Classification for Massive Data

arXiv.org Machine Learning

In the era of big data, it is highly desirable to develop efficient machine learning algorithms to tackle massive data challenges such as storage bottleneck, algorithmic scalability, and interpretability. In this paper, we develop a novel efficient classification algorithm, called fast polynomial kernel classification (FPC), to conquer the scalability and storage challenges. Our main tools are a suitably selected feature mapping based on polynomial kernels and an alternating direction method of multipliers (ADMM) algorithm for a related non-smooth convex optimization problem. Fast learning rates as well as feasibility verifications including the convergence of ADMM and the selection of center points are established to justify theoretical behaviors of FPC. Our theoretical assertions are verified by a series of simulations and real data applications. The numerical results demonstrate that FPC significantly reduces the computational burden and storage memory of the existing learning schemes such as support vector machines and boosting, without sacrificing their generalization abilities much.


Large Margin Few-Shot Learning

arXiv.org Machine Learning

The key issue of few-shot learning is learning to generalize. In this paper, we propose a large margin principle to improve the generalization capacity of metric based methods for few-shot learning. To realize it, we develop a unified framework to learn a more discriminative metric space by augmenting the softmax classification loss function with a large margin distance loss function for training. Extensive experiments on two state-of-the-art few-shot learning models, graph neural networks and prototypical networks, show that our method can improve the performance of existing models substantially with very little computational overhead, demonstrating the effectiveness of the large margin principle and the potential of our method.
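
One way to picture "augmenting the softmax classification loss function with a large margin distance loss function" is as a weighted sum of two terms. The sketch below uses a triplet hinge on squared Euclidean distances as the distance term; that choice, the weight `lam`, the `margin` value, and all function names are assumptions for illustration, since the abstract does not specify them.

```python
import math


def softmax_cross_entropy(logits, label):
    """Negative log-probability of `label` under a softmax over `logits`."""
    m = max(logits)  # subtract the max for numerical stability
    log_sum = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum - logits[label]


def squared_distance(u, v):
    return sum((a - b) ** 2 for a, b in zip(u, v))


def triplet_hinge(anchor, positive, negative, margin=1.0):
    """Assumed distance loss: zero once the negative is at least `margin`
    farther (in squared distance) from the anchor than the positive."""
    gap = squared_distance(anchor, positive) - squared_distance(anchor, negative)
    return max(0.0, gap + margin)


def combined_loss(logits, label, anchor, positive, negative,
                  lam=0.5, margin=1.0):
    """Softmax loss plus a weighted large margin distance term
    (`lam` is a hypothetical trade-off weight)."""
    return (softmax_cross_entropy(logits, label)
            + lam * triplet_hinge(anchor, positive, negative, margin))
```

The distance term only contributes while same-class and different-class embeddings are not yet separated by the margin, so it adds little cost on top of the classification loss.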


Ranking with Large Margin Principle: Two Approaches

Neural Information Processing Systems

We discuss the problem of ranking k instances with the use of a "large margin" principle. We introduce two main approaches: the first is the "fixed margin" policy in which the margin of the closest neighboring classes is being maximized - which turns out to be a direct generalization of SVM to ranking learning. The second approach allows for k - 1 different margins where the sum of margins is maximized. This approach is shown to reduce to ν-SVM when the number of classes k = 2. Both approaches are optimal in size of 2l where l is the total number of training examples. Experiments performed on visual classification and "collaborative filtering" show that both approaches outperform existing ordinal regression algorithms applied for ranking and multi-class SVM applied to general multi-class classification.

1 Introduction

In this paper we investigate the problem of inductive learning from the point of view of predicting variables of ordinal scale [3, 7, 5], a setting referred to as ranking learning or ordinal regression. We consider the problem of applying the large margin principle used in Support Vector methods [12, 1] to the ordinal regression problem while maintaining an (optimal) problem size linear in the number of training examples.
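
To make the two policies concrete, the sketch below assumes a common ordinal regression setup that the excerpt does not spell out: a single direction `w` and k - 1 ordered thresholds that cut the projection w·x into k ranks. Given a fixed `w` and thresholds, it evaluates the per-boundary margins, whose minimum is the quantity a "fixed margin" policy would look at and whose sum corresponds to the "sum of margins" policy. It is a simplified illustration of those quantities, not the optimization problem from the paper, and all names are hypothetical.

```python
import math


def project(w, x):
    return sum(w_i * x_i for w_i, x_i in zip(w, x))


def predict_rank(w, thresholds, x):
    """Rank in 1..k: one plus the number of thresholds below w.x.
    Assumes `thresholds` is sorted in increasing order."""
    score = project(w, x)
    return 1 + sum(1 for b in thresholds if score > b)


def boundary_margins(w, thresholds, examples):
    """Geometric margin at each of the k - 1 boundaries: the smallest
    distance from threshold j to any example of rank j or rank j + 1.
    `examples` is a list of (x, rank) pairs; every rank is assumed
    to have at least one example."""
    norm = math.sqrt(sum(w_i * w_i for w_i in w))
    margins = []
    for j, b in enumerate(thresholds, start=1):
        below = [b - project(w, x) for x, r in examples if r == j]
        above = [project(w, x) - b for x, r in examples if r == j + 1]
        margins.append(min(below + above) / norm)
    return margins


examples = [([0.5], 1), ([1.0], 1), ([3.0], 2), ([4.5], 2), ([7.0], 3)]
margins = boundary_margins([1.0], [2.0, 5.0], examples)
closest_neighbor_margin = min(margins)  # what a fixed-margin policy maximizes
total_margin = sum(margins)             # what a sum-of-margins policy maximizes
```

With k = 2 there is a single threshold, so the minimum and the sum coincide and the setup is that of an ordinary two-class large margin classifier.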