Mohri, Christopher
Cardinality-Aware Set Prediction and Top-$k$ Classification
Cortes, Corinna, Mao, Anqi, Mohri, Christopher, Mohri, Mehryar, Zhong, Yutao
We present a detailed study of cardinality-aware top-$k$ classification, a novel approach that aims to learn an accurate top-$k$ set predictor while maintaining a low cardinality. We introduce a new target loss function tailored to this setting that accounts for both the classification error and the cardinality of the set predicted. To optimize this loss function, we propose two families of surrogate losses: cost-sensitive comp-sum losses and cost-sensitive constrained losses. Minimizing these loss functions leads to new cardinality-aware algorithms that we describe in detail in the case of both top-$k$ and threshold-based classifiers. We establish $H$-consistency bounds for our cardinality-aware surrogate loss functions, thereby providing a strong theoretical foundation for our algorithms. We report the results of extensive experiments on CIFAR-10, CIFAR-100, ImageNet, and SVHN datasets demonstrating the effectiveness and benefits of our cardinality-aware algorithms.
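As a rough illustration of a target loss that "accounts for both the classification error and the cardinality of the set predicted", the sketch below assumes an additive form: a 0-1 penalty when the true label falls outside the top-$k$ set, plus a weighted cost that grows with $k$. The abstract does not state the exact form, so the additive structure, the weight `lam`, the logarithmic cost, and all function names are assumptions made for this example only.

```python
import math
from typing import Callable, Sequence


def top_k_set(scores: Sequence[float], k: int) -> list[int]:
    """Return the indices of the k highest-scoring classes."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return order[:k]


def cardinality_aware_loss(
    scores: Sequence[float],
    label: int,
    k: int,
    lam: float = 0.05,                           # hypothetical trade-off weight
    cost: Callable[[float], float] = math.log,   # hypothetical cardinality cost
) -> float:
    """Assumed form: 1[label not in top-k set] + lam * cost(k)."""
    miss = 0.0 if label in top_k_set(scores, k) else 1.0
    return miss + lam * cost(k)


def best_cardinality(
    scores: Sequence[float],
    label: int,
    candidates: Sequence[int] = (1, 2, 4, 8),
    lam: float = 0.05,
) -> int:
    """Pick the candidate k with the smallest assumed loss for one example."""
    return min(candidates, key=lambda k: cardinality_aware_loss(scores, label, k, lam))


if __name__ == "__main__":
    scores = [0.1, 0.5, 0.3, 0.05, 0.05]
    for k in (1, 2, 4):
        print(k, cardinality_aware_loss(scores, label=2, k=k))
    print("best k:", best_cardinality(scores, label=2, candidates=(1, 2, 4)))
```

Under this assumed loss, a larger set is worth its extra cost only when it turns a miss into a hit, which is the trade-off the abstract describes. The surrogate losses and the learned cardinality selector from the paper are not reproduced here.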
Learning to Reject with a Fixed Predictor: Application to Decontextualization
Mohri, Christopher, Andor, Daniel, Choi, Eunsol, Collins, Michael
Large language models, often trained with billions of parameters, have achieved impressive performance in recent years (Raffel et al., 2019) and are used in a wide variety of natural language generation tasks. However, their output is sometimes undesirable, with hallucinated content (Maynez et al., 2020; Filippova, 2020), and much work remains to fully understand their properties. In many applications, such as healthcare, question-answering systems, or customer service, incorrect predictions are particularly costly and must be avoided. This motivates the design of algorithms for large language models and other NLP tasks that achieve high precision on a large fraction of the input set, while abstaining on the rest. How can we devise such accurate models that allow a reject option?
Online Learning Algorithms for Statistical Arbitrage
Mohri, Christopher
Arbitrage is the risk-free practice of profiting from price differences across markets. For example, if a stock trades at a higher price in one market than in another, one could buy it at the lower price in one market and sell it at the higher price in the other, thereby making a profit without taking any risk. With advances in technology and high-frequency trading, these pricing disparities now appear only for very short periods of time and have become increasingly hard to capitalize on. Only those who recognize and act on an arbitrage opportunity first can benefit, which makes this a winner-takes-all situation. As a result, it is difficult to make a consistent profit from price discrepancies: one must recognize them quickly and be the first to exploit them.