Deep Gamblers: Learning to Abstain with Portfolio Theory

Liu, Ziyin; Wang, Zhikang; Liang, Paul Pu; Salakhutdinov, Russ R.; Morency, Louis-Philippe; Ueda, Masahito

Neural Information Processing Systems 

We address the selective classification problem (a supervised learning problem with a rejection option), where the goal is to achieve the best performance at a given level of coverage of the data. We transform the original $m$-class classification problem into an $(m+1)$-class problem, where the $(m+1)$-th class represents the model abstaining from making a prediction because it lacks confidence. Inspired by portfolio theory, we propose a loss function for the selective classification problem based on the doubling rate of gambling. Minimizing this loss function corresponds naturally to maximizing the return of a horse race, where a player aims to balance between betting on an outcome (making a prediction) when confident and reserving one's winnings (abstaining) when not confident. This loss function allows us to train neural networks and characterize the lack of confidence in their predictions in an end-to-end fashion.
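To make the horse-race analogy concrete, the following is a minimal sketch of one plausible form of such a loss. The abstract does not state the formula, so everything here is an assumption for illustration: the function names `softmax` and `gambler_loss`, a single payoff `payoff` (odds $o > 1$) shared by all classes, and the convention that the reserved fraction of the wager is returned unchanged. Under these assumptions, the network's $m+1$ softmax outputs are read as betting fractions, and the per-example loss is the negative log of the resulting wealth.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def gambler_loss(logits, label, payoff):
    """Negative log return for one example (illustrative sketch).

    logits: m + 1 scores; the last one is the abstention (reserve) output.
    label:  index of the true class, in range(m).
    payoff: assumed uniform odds o > 1 paid on a winning bet.
    """
    bets = softmax(logits)
    # Wealth after the race: the bet on the true class pays `payoff`,
    # the reserved fraction is kept as is, and all other bets are lost.
    wealth = bets[label] * payoff + bets[-1]
    return -math.log(wealth)

# Three hypothetical outputs for a 3-class problem (4 logits), true class 0.
confident = gambler_loss([4.0, 0.0, 0.0, 0.0], label=0, payoff=2.0)
abstaining = gambler_loss([0.0, 0.0, 0.0, 4.0], label=0, payoff=2.0)
wrong = gambler_loss([0.0, 4.0, 0.0, 0.0], label=0, payoff=2.0)
print(confident, abstaining, wrong)
```

In this sketch a confident correct bet yields the lowest loss, abstaining yields a loss near zero, and a confident wrong bet is penalized most, which mirrors the balance between betting and reserving described above. The choice of `payoff` controls how attractive abstention is relative to betting.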