mot problem
An Optimal Transport Approach for Computing Adversarial Training Lower Bounds in Multiclass Classification
Trillos, Nicolas Garcia, Jacobs, Matt, Kim, Jakwang, Werenski, Matthew
Despite the success of deep learning-based algorithms, it is widely known that neural networks may fail to be robust. A popular paradigm to enforce robustness is adversarial training (AT), however, this introduces many computational and theoretical difficulties. Recent works have developed a connection between AT in the multiclass classification setting and multimarginal optimal transport (MOT), unlocking a new set of tools to study this problem. In this paper, we leverage the MOT connection to propose computationally tractable numerical algorithms for computing universal lower bounds on the optimal adversarial risk and identifying optimal classifiers. We propose two main algorithms based on linear programming (LP) and entropic regularization (Sinkhorn). Our key insight is that one can harmlessly truncate the higher order interactions between classes, preventing the combinatorial run times typically encountered in MOT problems. We validate these results with experiments on MNIST and CIFAR-$10$, which demonstrate the tractability of our approach.
The Multimarginal Optimal Transport Formulation of Adversarial Multiclass Classification
Trillos, Nicolas Garcia, Jacobs, Matt, Kim, Jakwang
We study a family of adversarial multiclass classification problems and provide equivalent reformulations in terms of: 1) a family of generalized barycenter problems introduced in the paper and 2) a family of multimarginal optimal transport problems where the number of marginals is equal to the number of classes in the original classification problem. These new theoretical results reveal a rich geometric structure of adversarial learning problems in multiclass classification and extend recent results restricted to the binary classification setting. A direct computational implication of our results is that by solving either the barycenter problem and its dual, or the MOT problem and its dual, we can recover the optimal robust classification rule and the optimal adversarial strategy for the original adversarial problem. Examples with synthetic and real data illustrate our results.
Polynomial-time algorithms for Multimarginal Optimal Transport problems with structure
Altschuler, Jason M., Boix-Adsera, Enric
Multimarginal Optimal Transport (MOT) has attracted significant interest due to applications in machine learning, statistics, and the sciences. However, in most applications, the success of MOT is severely limited by a lack of efficient algorithms. Indeed, MOT in general requires exponential time in the number of marginals k and their support sizes n. This paper develops a general theory about what "structure" makes MOT solvable in poly(n,k) time. We develop a unified algorithmic framework for solving MOT in poly(n,k) time by characterizing the "structure" that different algorithms require in terms of simple variants of the dual feasibility oracle. This framework has several benefits. First, it enables us to show that the Sinkhorn algorithm, which is currently the most popular MOT algorithm, requires strictly more structure than other algorithms do to solve MOT in poly(n,k) time. Second, our framework makes it much simpler to develop poly(n,k) time algorithms for a given MOT problem. In particular, it is necessary and sufficient to (approximately) solve the dual feasibility oracle -- which is much more amenable to standard algorithmic techniques. We illustrate this ease-of-use by developing poly(n,k) time algorithms for three general classes of MOT cost structures: (1) graphical structure; (2) set-optimization structure; and (3) low-rank plus sparse structure. For structure (1), we recover the known result that Sinkhorn has poly(n,k) runtime; moreover, we provide the first poly(n,k) time algorithms for computing solutions that are exact and sparse. For structures (2)-(3), we give the first poly(n,k) time algorithms, even for approximate computation. Together, these three structures encompass many -- if not most -- current applications of MOT.