Goto

Collaborating Authors

 Bayesian Inference



A Jointly Efficient and Optimal Algorithm for Heteroskedastic Generalized Linear Bandits with Adversarial Corruptions

arXiv.org Machine Learning

We consider the problem of heteroskedastic generalized linear bandits (GLBs) with adversarial corruptions, which subsumes various stochastic contextual bandit settings, including heteroskedastic linear bandits and logistic/Poisson bandits. We propose HCW-GLB-OMD, which consists of two components: an online mirror descent (OMD)-based estimator and Hessian-based confidence weights to achieve corruption robustness. This is computationally efficient in that it only requires ${O}(1)$ space and time complexity per iteration. Under the self-concordance assumption on the link function, we show a regret bound of $\tilde{O}\left( d \sqrt{\sum_t g(ฯ„_t) \dotฮผ_{t,\star}} + d^2 g_{\max} ฮบ+ d ฮบC \right)$, where $\dotฮผ_{t,\star}$ is the slope of $ฮผ$ around the optimal arm at time $t$, $g(ฯ„_t)$'s are potentially exogenously time-varying dispersions (e.g., $g(ฯ„_t) = ฯƒ_t^2$ for heteroskedastic linear bandits, $g(ฯ„_t) = 1$ for Bernoulli and Poisson), $g_{\max} = \max_{t \in [T]} g(ฯ„_t)$ is the maximum dispersion, and $C \geq 0$ is the total corruption budget of the adversary. We complement this with a lower bound of $\tildeฮฉ(d \sqrt{\sum_t g(ฯ„_t) \dotฮผ_{t,\star}} + d C)$, unifying previous problem-specific lower bounds. Thus, our algorithm achieves, up to a $ฮบ$-factor in the corruption term, instance-wise minimax optimality simultaneously across various instances of heteroskedastic GLBs with adversarial corruptions.


Bayesian Inference of Contextual Bandit Policies via Empirical Likelihood

arXiv.org Machine Learning

Policy inference plays an essential role in the contextual bandit problem. In this paper, we use empirical likelihood to develop a Bayesian inference method for the joint analysis of multiple contextual bandit policies in finite sample regimes. The proposed inference method is robust to small sample sizes and is able to provide accurate uncertainty measurements for policy value evaluation. In addition, it allows for flexible inferences on policy comparison with full uncertainty quantification. We demonstrate the effectiveness of the proposed inference method using Monte Carlo simulations and its application to an adolescent body mass index data set.


The Memory Perturbation Equation: Understanding Model's Sensitivity to Data Peter Nickl

Neural Information Processing Systems

Understanding model's sensitivity to its training data is crucial but can also be challenging and costly, especially during training. To simplify such issues, we present the Memory-Perturbation Equation (MPE) which relates model's sensitivity to perturbation in its training data.



PAC-Bayes under potentially heavy tails

Neural Information Processing Systems

WederivePAC-Bayesian learning guarantees forheavy-tailed losses, andobtain a novel optimal Gibbs posterior which enjoys finite-sample excess risk bounds atlogarithmic confidence. Ourcoretechnique itselfmakesuseofPAC-Bayesian inequalities in order to derive a robust risk estimator, which by design is easy to compute.





An Algorithm for Learning Switched Linear Dynamics from Data Guillaume Berger Monal Narasimhamurthy

Neural Information Processing Systems

We present an algorithm for learning switched linear dynamical systems in discrete time from noisy observations of the system's full state or output. Switched linear systems use multiple linear dynamical modes to fit the data within some desired tolerance. They arise quite naturally in applications to robotics and cyberphysical systems. Learning switched systems from data is a NP-hard problem that is nearly identical to the k-linear regression problem of fitting k > 1 linear models to the data. A direct mixed-integer linear programming (MILP) approach yields time complexity that is exponential in the number of data points. In this paper, we modify the problem formulation to yield an algorithm that is linear in the size of the data while remaining exponential in the number of state variables and the desired number of modes. To do so, we combine classic ideas from the ellipsoidal method for solving convex optimization problems, and well-known oracle separation results in non-smooth optimization. We demonstrate our approach on a set of microbenchmarks and a few interesting real-world problems. Our evaluation suggests that the benefits of this algorithm can be made practical even against highly optimized off-the-shelf MILP solvers.