 minimax probability machine


Minimax Probability Machine

Neural Information Processing Systems

When constructing a classifier, the probability of correct classification of future data points should be maximized. In the current paper this desideratum is translated in a very direct way into an optimization problem, which is solved using methods from convex optimization. We also show how to exploit Mercer kernels in this setting to obtain nonlinear decision boundaries. A worst-case bound on the probability of misclassification of future data is obtained explicitly.
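The abstract does not spell out the optimization problem, so the following is only a minimal sketch under stated assumptions. It assumes the commonly cited linear MPM formulation: minimize sqrt(a^T Sx a) + sqrt(a^T Sy a) subject to a^T (mx - my) = 1, with kappa equal to the reciprocal of the minimum and worst-case accuracy bound alpha = kappa^2 / (1 + kappa^2). The function names (`quad`, `mpm_2d`), the restriction to two dimensions, and the golden-section line search are illustrative choices, not the solver used in the paper.

```python
import math

def quad(a, S):
    """Return a^T S a for a 2-vector a and a 2x2 matrix S."""
    return (a[0] * (S[0][0] * a[0] + S[0][1] * a[1])
            + a[1] * (S[1][0] * a[0] + S[1][1] * a[1]))

def mpm_2d(mean_x, cov_x, mean_y, cov_y, width=100.0, iters=200):
    """Hypothetical 2-D linear MPM sketch (names and solver are assumptions).

    Returns (a, b, kappa, alpha) for the decision rule
    "predict class x when a . z >= b".
    """
    d = (mean_x[0] - mean_y[0], mean_x[1] - mean_y[1])
    dd = d[0] ** 2 + d[1] ** 2
    if dd == 0.0:
        raise ValueError("class means coincide; the constraint is infeasible")
    a0 = (d[0] / dd, d[1] / dd)  # particular solution of a . d = 1
    f = (-d[1], d[0])            # direction orthogonal to d

    def point(t):
        # Every feasible a in 2-D has the form a0 + t * f.
        return (a0[0] + t * f[0], a0[1] + t * f[1])

    def objective(t):
        a = point(t)
        return (math.sqrt(max(quad(a, cov_x), 0.0))
                + math.sqrt(max(quad(a, cov_y), 0.0)))

    # Golden-section search; the objective is convex in t.
    # Assumption: the minimizer lies inside [-width, width].
    lo, hi = -width, width
    g = (math.sqrt(5.0) - 1.0) / 2.0
    for _ in range(iters):
        m1 = hi - g * (hi - lo)
        m2 = lo + g * (hi - lo)
        if objective(m1) < objective(m2):
            hi = m2
        else:
            lo = m1
    t = 0.5 * (lo + hi)

    a = point(t)
    kappa = 1.0 / objective(t)
    b = (a[0] * mean_x[0] + a[1] * mean_x[1]
         - kappa * math.sqrt(max(quad(a, cov_x), 0.0)))
    alpha = kappa ** 2 / (1.0 + kappa ** 2)  # worst-case accuracy bound
    return a, b, kappa, alpha

# Usage with made-up class statistics (not data from the paper).
identity = [[1.0, 0.0], [0.0, 1.0]]
a, b, kappa, alpha = mpm_2d((1.0, 0.0), identity, (-1.0, 0.0), identity)
print(a, b, kappa, alpha)
```

In this sketch the equality constraint is eliminated by parameterizing the feasible set as a line, which only works in two dimensions. In higher dimensions the same problem is usually handed to a second-order cone solver.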


A Minimax Probability Machine for Non-Decomposable Performance Measures

arXiv.org Machine Learning

Imbalanced classification tasks are widespread in many real-world applications. For such classification tasks, in comparison with the accuracy rate, it is usually much more appropriate to use non-decomposable performance measures such as the Area Under the receiver operating characteristic Curve (AUC) and the $F_\beta$ measure as the classification criterion since the label class is imbalanced. On the other hand, the minimax probability machine is a popular method for binary classification problems and aims at learning a linear classifier by maximizing the accuracy rate, which makes it unsuitable to deal with imbalanced classification tasks. The purpose of this paper is to develop a new minimax probability machine for the $F_\beta$ measure, called MPMF, which can be used to deal with imbalanced classification tasks. A brief discussion is also given on how to extend the MPMF model for several other non-decomposable performance measures listed in the paper. To solve the MPMF model effectively, we derive its equivalent form which can then be solved by an alternating descent method to learn a linear classifier. Further, the kernel trick is employed to derive a nonlinear MPMF model to learn a nonlinear classifier. Several experiments on real-world benchmark datasets demonstrate the effectiveness of our new model.
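To illustrate why accuracy can mislead on imbalanced data, here is a small sketch that compares accuracy with the $F_\beta$ measure computed from confusion-matrix counts. The function names and the example counts are made up for illustration and do not come from the paper's experiments.

```python
def f_beta(tp, fp, fn, beta=1.0):
    """F_beta from true positives, false positives and false negatives.

    Uses F_beta = (1 + beta^2) TP / ((1 + beta^2) TP + beta^2 FN + FP).
    Returns 0.0 when the denominator is zero.
    """
    b2 = beta * beta
    denom = (1.0 + b2) * tp + b2 * fn + fp
    return (1.0 + b2) * tp / denom if denom else 0.0

def accuracy(tp, fp, fn, tn):
    """Fraction of correctly classified samples."""
    return (tp + tn) / (tp + fp + fn + tn)

# Hypothetical dataset: 1000 samples, only 10 of them positive.
# Classifier A predicts "negative" for everything.
print(accuracy(0, 0, 10, 990), f_beta(0, 0, 10))
# Classifier B finds 8 of the 10 positives at the cost of 20 false alarms.
print(accuracy(8, 20, 2, 970), f_beta(8, 20, 2))
```

Classifier A has the higher accuracy even though it never detects a positive sample, while its $F_1$ score is zero. This is the kind of mismatch that motivates optimizing $F_\beta$ directly.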


Deep Minimax Probability Machine

arXiv.org Machine Learning

Deep neural networks enjoy a powerful representation and have proven effective in a number of applications. However, recent advances show that deep neural networks are vulnerable to adversarial attacks based on so-called adversarial examples. Although an adversarial example differs only slightly from the original input sample, the neural network assigns it to the wrong class. To alleviate this problem, we propose the Deep Minimax Probability Machine (DeepMPM), which applies MPM to deep neural networks in an end-to-end fashion. In a worst-case scenario, MPM tries to minimize an upper bound on the misclassification probabilities, taking global information into account (i.e., the mean and covariance of each class). DeepMPM can be more robust since it learns the worst-case bound on the probability of misclassification of future data. In experiments on two real-world datasets, DeepMPM achieves classification performance comparable to that of a CNN while being more robust to adversarial attacks.


Minimax Probability Machine

Neural Information Processing Systems

One way to attempt to achieve this is via a generative approach in which one makes distributional assumptions about the class-conditional densities and thereby estimates and controls the relevant probabilities. The need to make distributional assumptions, however, casts doubt on the generality and validity of such an approach, and in discriminative solutions to classification problems it is common to attempt to dispense with class-conditional densities entirely. Rather than avoiding any reference to class-conditional densities, it might be useful to attempt to control misclassification probabilities in a worst-case setting; that is, under all possible choices of class-conditional densities. Such a minimax approach could be viewed as providing an alternative justification for discriminative approaches. In this paper we show how such a minimax programme can be carried out in the setting of binary classification. Our approach involves exploiting the following powerful theorem due to Isii [6], as extended in recent work by Bertsimas - http://robotics.eecs.berkeley.edur
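The excerpt breaks off before the theorem is stated. As a hedged reminder rather than a quotation from the paper, the worst-case half-space bound usually cited in this context can be written as follows, where $\bar{y}$ and $\Sigma_y$ denote the mean and covariance of the random vector $y$, and $a \neq 0$ and $b$ define the half-space (the symbol names are assumptions):

```latex
\sup_{y \sim (\bar{y}, \Sigma_y)} \Pr\{ a^{T} y \ge b \} = \frac{1}{1 + d^{2}},
\qquad
d^{2} = \inf_{a^{T} y \ge b} (y - \bar{y})^{T} \Sigma_y^{-1} (y - \bar{y})
      = \frac{\max\left(b - a^{T}\bar{y},\, 0\right)^{2}}{a^{T} \Sigma_y a}.
```

The supremum is taken over all distributions with the given mean and covariance, which is what makes the bound distribution-free. The closed form for $d^{2}$ assumes $\Sigma_y$ is positive definite.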


Minimax Probability Machine

Neural Information Processing Systems

When constructing a classifier, the probability of correct classification of future data points should be maximized. In the current paper this desideratum is translated in a very direct way into an optimization problem, which is solved using methods from convex optimization. We also show how to exploit Mercer kernels in this setting to obtain nonlinear decision boundaries. A worst-case bound on the probability of misclassification of future data is obtained explicitly.

1 Introduction

Consider the problem of choosing a linear discriminant by minimizing the probabilities that data vectors fall on the wrong side of the boundary. One way to attempt to achieve this is via a generative approach in which one makes distributional assumptions about the class-conditional densities and thereby estimates and controls the relevant probabilities. The need to make distributional assumptions, however, casts doubt on the generality and validity of such an approach, and in discriminative solutions to classification problems it is common to attempt to dispense with class-conditional densities entirely.