Discriminative Densities from Maximum Contrast Estimation

Neural Information Processing Systems 

We propose a framework for classifier design based on discriminative densities, which represent the differences between the class-conditional distributions in a way that is optimal for classification. The densities are selected from a parametrized set by constrained maximization of an objective function that measures the average (bounded) difference, i.e. the contrast, between discriminative densities. We show that maximization of the contrast is equivalent to minimization of an approximation of the Bayes risk. In particular, for a certain parametrization of the density functions we obtain maximum contrast classifiers (MCCs) which have the same functional form as the well-known Support Vector Machines (SVMs). We show that MCC training in general requires nonlinear optimization, but under certain conditions the problem is concave and can be tackled by a single linear program.