Rivera, Eduardo Ochoa
Near Optimal Pure Exploration in Logistic Bandits
Rivera, Eduardo Ochoa, Tewari, Ambuj
Bandit algorithms have garnered significant attention due to their practical applications in real-world scenarios. However, beyond simple settings such as multi-armed or linear bandits, optimal algorithms remain scarce. Notably, no optimal solution exists for pure exploration problems in the context of generalized linear model (GLM) bandits. In this paper, we narrow this gap and develop the first track-and-stop algorithm for general pure exploration problems in the logistic bandit, called Logistic Track-and-Stop (Log-TS). Log-TS is an efficient algorithm that asymptotically matches an approximation of the instance-specific lower bound on the expected sample complexity up to a logarithmic factor.
Conformalized Late Fusion Multi-View Learning
Rivera, Eduardo Ochoa, Patel, Yash, Tewari, Ambuj
Uncertainty quantification for multi-view learning is motivated by the increasing use of multi-view data in scientific problems. A common variant of multi-view learning is late fusion: train separate predictors on individual views and combine them after single-view predictions are available. Existing methods for uncertainty quantification for late fusion often rely on undesirable distributional assumptions for validity. Conformal prediction is one approach that avoids such distributional assumptions. However, naively applying conformal prediction to late-stage fusion pipelines often produces overly conservative and uninformative prediction regions, limiting its downstream utility. We propose a novel methodology, Multi-View Conformal Prediction (MVCP), where conformal prediction is instead performed separately on the single-view predictors and only fused subsequently. Our framework extends the standard scalar formulation of a score function to a multivariate score that produces more efficient downstream prediction regions in both classification and regression settings. We then demonstrate that such improvements can be realized in methods built atop conformalized regressors, specifically in robust predict-then-optimize pipelines.
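To make the conformal prediction step concrete, here is a minimal sketch of standard split conformal prediction for a single regressor. This illustrates only the scalar score function the abstract refers to as the standard formulation, not the multivariate MVCP score or the fusion step; the synthetic data and the simple least-squares predictor are illustrative assumptions, not part of the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic single-view data (illustrative assumption): y = 2x + noise
x = rng.uniform(0, 1, 200)
y = 2 * x + rng.normal(0, 0.1, 200)

# Split into a proper training half and a calibration half
x_tr, y_tr = x[:100], y[:100]
x_cal, y_cal = x[100:], y[100:]

# Fit any predictor on the training half; a line suffices here
slope, intercept = np.polyfit(x_tr, y_tr, 1)

def predict(t):
    return slope * t + intercept

# Scalar nonconformity scores on the calibration set: absolute residuals
scores = np.abs(y_cal - predict(x_cal))

# Conformal quantile at miscoverage level alpha, with the finite-sample
# (n + 1) correction that guarantees marginal coverage of 1 - alpha
alpha = 0.1
n = len(scores)
q = np.quantile(scores, np.ceil((n + 1) * (1 - alpha)) / n, method="higher")

# Distribution-free prediction interval for a new point x0
x0 = 0.5
interval = (predict(x0) - q, predict(x0) + q)
```

Applying this recipe independently to each single-view predictor and intersecting or combining the resulting regions after fusion is what tends to produce the overly conservative sets the abstract describes; MVCP's multivariate score is designed to avoid that conservatism.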
Optimal Thresholding Linear Bandit
Rivera, Eduardo Ochoa, Tewari, Ambuj
The Thresholding Bandit Problem (TBP) (Locatelli et al., 2016; Kano et al., 2019) is a specific combinatorial pure exploration problem in which the goal is to identify all arms whose mean rewards lie above a given threshold. Kano et al. (2019) emphasize that in certain contexts, such as personalized recommendations, this objective is more appropriate than identifying only a single best arm. In our setting, the arms' mean rewards follow a linear model with unknown parameters. We prove an instance-specific lower bound on the expected sample complexity of any correct algorithm.