Optimal Binary Classifier Aggregation for General Losses

Akshay Balsubramani, Yoav S. Freund

Neural Information Processing Systems 

We address the problem of aggregating an ensemble of predictors with known loss bounds in a semi-supervised binary classification setting, to minimize prediction loss incurred on the unlabeled data. We find the minimax optimal predictions for a very general class of loss functions, including all convex and many non-convex losses, extending a recent analysis of the problem for misclassification error. The result is a family of semi-supervised ensemble aggregation algorithms that are as efficient as linear learning via convex optimization, yet minimax optimal without any relaxations. Their decision rules take a form familiar in decision theory: a sigmoid function applied to a notion of ensemble margin, without the assumptions typically made in margin-based learning.
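
To make the aggregation scheme concrete, here is a minimal sketch of the kind of procedure the abstract describes, under assumed specifics: a matrix F of binary ensemble predictions on unlabeled data, a vector b of known per-classifier loss bounds, and the potential function corresponding to misclassification (0-1) error. The names (slack_fn, sigma_star, g) and the exact form of the objective are illustrative, not the paper's code; for other losses the potential inside the objective and the final link function would change.

```python
# Illustrative sketch only (assumed setup, not the authors' implementation):
# aggregate p binary classifiers on n unlabeled points by solving a convex
# program over nonnegative classifier weights, then predicting with a
# clipped (sigmoid-like) function of the resulting ensemble margin.
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(0)
p, n = 5, 200
F = rng.choice([-1.0, 1.0], size=(p, n))  # ensemble predictions on unlabeled data
b = np.full(p, 0.1)                       # assumed known per-classifier loss bounds

def slack_fn(sigma):
    # Convex objective in the weights sigma: a linear term from the loss
    # bounds plus the average of a potential of the ensemble margin.
    # The max(|.|, 1) potential here is the 0-1-loss case; other losses
    # swap in a different potential.
    margins = F.T @ sigma                 # ensemble margin at each unlabeled point
    return b @ sigma + np.mean(np.maximum(np.abs(margins), 1.0))

# Efficient convex optimization over sigma >= 0 (as cheap as linear learning).
res = minimize(slack_fn, x0=np.ones(p), bounds=[(0.0, None)] * p)
sigma_star = res.x

# Decision rule: a clipped function of the ensemble margin, in [-1, 1];
# sign(g) gives the aggregated binary prediction.
g = np.clip(F.T @ sigma_star, -1.0, 1.0)
```

The design point the abstract emphasizes is visible here: the weights come from an unconstrained-scale convex problem over the unlabeled data, and the prediction is a bounded, sigmoid-shaped transformation of the weighted ensemble margin rather than a hard majority vote.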