Potluru, Vamsi
Mixup Regularization: A Probabilistic Perspective
El-Laham, Yousef, Dalmasso, Niccolo, Vyetrenko, Svitlana, Potluru, Vamsi, Veloso, Manuela
In recent years, mixup regularization has gained popularity as an effective way to improve the generalization performance of deep learning models by training on convex combinations of training data. While many mixup variants have been explored, the proper adaptation of the technique to conditional density estimation and probabilistic machine learning remains relatively unexplored. This work introduces a novel framework for mixup regularization based on probabilistic fusion that is better suited for conditional density estimation tasks. For data distributed according to a member of the exponential family, we show that likelihood functions can be analytically fused using log-linear pooling. We further propose an extension of probabilistic mixup that allows inputs to be fused at an arbitrary intermediate layer of the neural network. We provide a theoretical analysis comparing our approach to standard mixup variants. Empirical results on synthetic and real datasets demonstrate the benefits of our proposed framework over existing mixup variants.
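As a minimal illustration of the log-linear pooling result for exponential-family likelihoods, the sketch below fuses two Gaussian likelihoods with a shared, known variance; the function and variable names are illustrative assumptions, not taken from the paper. With equal variances, the pooled density is again Gaussian with the mixed mean, which recovers the familiar mixup target.

```python
import numpy as np

def log_linear_pool_gaussian(mu1, mu2, sigma2, lam):
    # Log-linear pooling: p(y) proportional to
    # N(y; mu1, sigma2)^lam * N(y; mu2, sigma2)^(1 - lam).
    # With a shared variance the pooled density is Gaussian with the
    # mixed mean and the same variance (natural parameters of an
    # exponential family mix linearly under log-linear pooling).
    return lam * mu1 + (1.0 - lam) * mu2, sigma2

# Mixup-style step under this view (illustrative data): draw lam from a
# Beta distribution, mix the inputs, and train the conditional density
# model against the fused likelihood.
rng = np.random.default_rng(0)
lam = rng.beta(1.0, 1.0)
x1, y1 = np.array([0.2, 1.0]), 0.5
x2, y2 = np.array([1.5, -0.3]), -1.2
x_mix = lam * x1 + (1.0 - lam) * x2
y_mean, y_var = log_linear_pool_gaussian(y1, y2, sigma2=0.1, lam=lam)
```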
Distributionally and Adversarially Robust Logistic Regression via Intersecting Wasserstein Balls
Selvi, Aras, Kreacic, Eleonora, Ghassemi, Mohsen, Potluru, Vamsi, Balch, Tucker, Veloso, Manuela
Empirical risk minimization often fails to provide robustness against adversarial attacks on test data, leading to poor out-of-sample performance. Adversarially robust optimization (ARO) has thus emerged as the de facto standard for obtaining models that hedge against such attacks. However, while these models are robust against adversarial attacks, they tend to suffer severely from overfitting. To address this issue for logistic regression, we study the Wasserstein distributionally robust (DR) counterpart of ARO and show that this problem admits a tractable reformulation. Furthermore, we develop a framework to reduce the conservatism of this problem by utilizing an auxiliary dataset (e.g., synthetic, external, or out-of-domain data), whenever available, with instances sampled independently from a nonidentical but related ground truth. In particular, we intersect the ambiguity set of the DR problem with another Wasserstein ambiguity set built from the auxiliary dataset. We analyze the properties of the underlying optimization problem, develop efficient solution algorithms, and demonstrate that the proposed method consistently outperforms benchmark approaches on real-world datasets.
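For intuition only, the sketch below implements the classical single-ball Wasserstein DR reformulation of logistic regression (empirical logistic loss plus the ball radius times a dual-norm penalty on the weights), not the paper's intersected-ball model, which additionally requires the auxiliary dataset. All names, the synthetic data, and the smoothing constant are illustrative assumptions.

```python
import numpy as np
from scipy.optimize import minimize

def dr_logistic_objective(w, X, y, radius):
    # Single-ball Wasserstein DR logistic regression under feature-only
    # perturbations measured in the Euclidean norm: the worst-case
    # expected loss equals the empirical logistic loss plus
    # radius * ||w|| (the dual norm of the perturbation norm).
    margins = y * (X @ w)
    loss = np.mean(np.log1p(np.exp(-margins)))
    penalty = radius * np.sqrt(w @ w + 1e-12)  # norm smoothed at the origin
    return loss + penalty

# Illustrative usage on synthetic data.
rng = np.random.default_rng(1)
X = rng.normal(size=(200, 3))
y = np.sign(X @ np.array([1.0, -2.0, 0.5]) + 0.1 * rng.normal(size=200))
res = minimize(dr_logistic_objective, x0=np.zeros(3), args=(X, y, 0.05))
w_robust = res.x
```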
Frugal Coordinate Descent for Large-Scale NNLS
Potluru, Vamsi (University of New Mexico)
The Nonnegative Least Squares (NNLS) formulation arises in many important regression problems. We present a novel coordinate descent method that, unlike previous approaches, does not explicitly maintain complete gradient information. Empirical evidence shows that our approach outperforms a state-of-the-art NNLS solver in computation time when calculating radiation dosage for cancer treatment problems.
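The following sketch shows one frugal flavor of this idea under stated assumptions: rather than keeping the full gradient A^T(Ax - b) up to date, it maintains only the residual and computes each coordinate's partial derivative on demand, one matrix column per update. The paper's exact update rules and schedule may differ.

```python
import numpy as np

def nnls_cd(A, b, n_iters=100):
    # Coordinate descent for min 0.5 * ||Ax - b||^2 subject to x >= 0.
    # Only the residual r = Ax - b is maintained; the j-th gradient
    # component A[:, j] @ r is computed on demand, so the complete
    # gradient vector is never explicitly stored or updated.
    m, n = A.shape
    x = np.zeros(n)
    r = -b.copy()                      # residual Ax - b at x = 0
    col_sq = np.sum(A * A, axis=0)     # diagonal of A^T A
    for _ in range(n_iters):
        for j in range(n):
            if col_sq[j] == 0.0:
                continue
            g_j = A[:, j] @ r          # j-th gradient component, on demand
            x_new = max(0.0, x[j] - g_j / col_sq[j])
            delta = x_new - x[j]
            if delta != 0.0:
                r += delta * A[:, j]   # keep the residual consistent
                x[j] = x_new
    return x

# Illustrative usage on synthetic data.
rng = np.random.default_rng(2)
A = rng.normal(size=(50, 10))
b = A @ np.abs(rng.normal(size=10)) + 0.01 * rng.normal(size=50)
x_hat = nnls_cd(A, b)
```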