Optimization
Logarithmic Regret from Sublinear Hints
Bhaskara, Aditya, Cutkosky, Ashok, Kumar, Ravi, Purohit, Manish
We consider the online linear optimization problem, where at every step the algorithm plays a point $x_t$ in the unit ball, and suffers loss $\langle c_t, x_t\rangle$ for some cost vector $c_t$ that is then revealed to the algorithm. Recent work showed that if an algorithm receives a hint $h_t$ that has non-trivial correlation with $c_t$ before it plays $x_t$, then it can achieve a regret guarantee of $O(\log T)$, improving on the bound of $\Theta(\sqrt{T})$ in the standard setting. In this work, we study the question of whether an algorithm really requires a hint at every time step. Somewhat surprisingly, we show that an algorithm can obtain $O(\log T)$ regret with just $O(\sqrt{T})$ hints under a natural query model; in contrast, we also show that $o(\sqrt{T})$ hints cannot guarantee better than $\Omega(\sqrt{T})$ regret. We give two applications of our result, to the well-studied setting of optimistic regret bounds and to the problem of online learning with abstention.
Misspecified Gaussian Process Bandit Optimization
Bogunovic, Ilija, Krause, Andreas
We consider the problem of optimizing a black-box function based on noisy bandit feedback. Kernelized bandit algorithms have shown strong empirical and theoretical performance for this problem. They heavily rely on the assumption that the model is well-specified, however, and can fail without it. Instead, we introduce a \emph{misspecified} kernelized bandit setting where the unknown function can be $\epsilon$--uniformly approximated by a function with a bounded norm in some Reproducing Kernel Hilbert Space (RKHS). We design efficient and practical algorithms whose performance degrades minimally in the presence of model misspecification. Specifically, we present two algorithms based on Gaussian process (GP) methods: an optimistic EC-GP-UCB algorithm that requires knowing the misspecification error, and Phased GP Uncertainty Sampling, an elimination-type algorithm that can adapt to unknown model misspecification. We provide upper bounds on their cumulative regret in terms of $\epsilon$, the time horizon, and the underlying kernel, and we show that our algorithm achieves optimal dependence on $\epsilon$ with no prior knowledge of misspecification. In addition, in a stochastic contextual setting, we show that EC-GP-UCB can be effectively combined with the regret bound balancing strategy and attain similar regret bounds despite not knowing $\epsilon$.
Nonnegative Tensor Completion via Integer Optimization
Bugg, Caleb, Chen, Chen, Aswani, Anil
Unlike matrix completion, no algorithm for the tensor completion problem has so far been shown to achieve the information-theoretic sample complexity rate. This paper develops a new algorithm for the special case of completion for nonnegative tensors. We prove that our algorithm converges in a linear (in numerical tolerance) number of oracle steps, while achieving the information-theoretic rate. Our approach is to define a new norm for nonnegative tensors using the gauge of a specific 0-1 polytope that we construct. Because the norm is defined using a 0-1 polytope, this means we can use integer linear programming to solve linear separation problems over the polytope. We combine this insight with a variant of the Frank-Wolfe algorithm to construct our numerical algorithm, and we demonstrate its effectiveness and scalability through experiments.
Optimizing Bayesian acquisition functions in Gaussian Processes
Pawar, Ashish Anil, Warbhe, Ujwal
Bayesian optimization is a popular optimization technique for optimizing a black box function especially with high dimensions. For a known objective functions, various optimization functions are readily available to choose from. For a black box function, since the true nature of the objective function is unknown, many available optimization techniques including Gradient Descent cannot be applied. For a black box function, various other optimization techniques are available such as Grid Search and Random Search, however, both of these techniques are extremely inefficient and time consuming specially if the objective function is costly to execute. Instead, Bayesian optimization tries to find the global optimum by using a surrogate function to evaluate the real objective function, thus, making the computation much efficient with respect to time or money.
Practical, Provably-Correct Interactive Learning in the Realizable Setting: The Power of True Believers
Katz-Samuels, Julian, Mason, Blake, Jamieson, Kevin, Nowak, Rob
We consider interactive learning in the realizable setting and develop a general framework to handle problems ranging from best arm identification to active classification. We begin our investigation with the observation that agnostic algorithms \emph{cannot} be minimax-optimal in the realizable setting. Hence, we design novel computationally efficient algorithms for the realizable setting that match the minimax lower bound up to logarithmic factors and are general-purpose, accommodating a wide variety of function classes including kernel methods, H{\"o}lder smooth functions, and convex functions. The sample complexities of our algorithms can be quantified in terms of well-known quantities like the extended teaching dimension and haystack dimension. However, unlike algorithms based directly on those combinatorial quantities, our algorithms are computationally efficient. To achieve computational efficiency, our algorithms sample from the version space using Monte Carlo "hit-and-run" algorithms instead of maintaining the version space explicitly. Our approach has two key strengths. First, it is simple, consisting of two unifying, greedy algorithms. Second, our algorithms have the capability to seamlessly leverage prior knowledge that is often available and useful in practice. In addition to our new theoretical results, we demonstrate empirically that our algorithms are competitive with Gaussian process UCB methods.
The Hardness Analysis of Thompson Sampling for Combinatorial Semi-bandits with Greedy Oracle
Kong, Fang, Yang, Yueran, Chen, Wei, Li, Shuai
Thompson sampling (TS) has attracted a lot of interest in the bandit area. It was introduced in the 1930s but has not been theoretically proven until recent years. All of its analysis in the combinatorial multi-armed bandit (CMAB) setting requires an exact oracle to provide optimal solutions with any input. However, such an oracle is usually not feasible since many combinatorial optimization problems are NP-hard and only approximation oracles are available. An example (Wang and Chen, 2018) has shown the failure of TS to learn with an approximation oracle. However, this oracle is uncommon and is designed only for a specific problem instance. It is still an open question whether the convergence analysis of TS can be extended beyond the exact oracle in CMAB. In this paper, we study this question under the greedy oracle, which is a common (approximation) oracle with theoretical guarantees to solve many (offline) combinatorial optimization problems. We provide a problem-dependent regret lower bound of order $\Omega(\log T/\Delta^2)$ to quantify the hardness of TS to solve CMAB problems with greedy oracle, where $T$ is the time horizon and $\Delta$ is some reward gap. We also provide an almost matching regret upper bound. These are the first theoretical results for TS to solve CMAB with a common approximation oracle and break the misconception that TS cannot work with approximation oracles.
Real-World Mathematical Optimization Problems
In our daily lives, we benefit from the application of Mathematical Optimization algorithms. They are used, for example, by GPS systems, by shipping companies delivering packages to our homes, by financial companies, airline reservations systems, etc. As a thank you - feel free to message me to get one of my other courses for free. Because this course can be very well complemented by my other courses. We are learning the software FICO Xpress Optimization; and this is the most stable software out there.
Optimization for Machine Learning Crash Course
All machine learning models involve optimization. Most likely, we use computational algorithms to optimize. There are many ways to optimize numerically. SciPy has a number of functions handy for this. We can also try to implement the optimization algorithms on our own. In this crash course, you will discover how you can get started and confidently run algorithms to optimize a function with Python in seven days. This is a big and important post. You might want to bookmark it. Optimization for Machine Learning (7-Day Mini-Course) Photo by Brewster Malevich, some rights reserved.
Optimization of the Model Predictive Control Meta-Parameters Through Reinforcement Learning
Bøhn, Eivind, Gros, Sebastien, Moe, Signe, Johansen, Tor Arne
Model predictive control (MPC) is increasingly being considered for control of fast systems and embedded applications. However, the MPC has some significant challenges for such systems. Its high computational complexity results in high power consumption from the control algorithm, which could account for a significant share of the energy resources in battery-powered embedded systems. The MPC parameters must be tuned, which is largely a trial-and-error process that affects the control performance, the robustness and the computational complexity of the controller to a high degree. In this paper, we propose a novel framework in which any parameter of the control algorithm can be jointly tuned using reinforcement learning(RL), with the goal of simultaneously optimizing the control performance and the power usage of the control algorithm. We propose the novel idea of optimizing the meta-parameters of MPCwith RL, i.e. parameters affecting the structure of the MPCproblem as opposed to the solution to a given problem. Our control algorithm is based on an event-triggered MPC where we learn when the MPC should be re-computed, and a dual mode MPC and linear state feedback control law applied in between MPC computations. We formulate a novel mixture-distribution policy and show that with joint optimization we achieve improvements that do not present themselves when optimizing the same parameters in isolation. We demonstrate our framework on the inverted pendulum control task, reducing the total computation time of the control system by 36% while also improving the control performance by 18.4% over the best-performing MPC baseline.
Sampling from Log-Concave Distributions with Infinity-Distance Guarantees and Applications to Differentially Private Optimization
Mangoubi, Oren, Vishnoi, Nisheeth K.
For a $d$-dimensional log-concave distribution $\pi(\theta)\propto e^{-f(\theta)}$ on a polytope $K$, we consider the problem of outputting samples from a distribution $\nu$ which is $O(\varepsilon)$-close in infinity-distance $\sup_{\theta\in K}|\log\frac{\nu(\theta)}{\pi(\theta)}|$ to $\pi$. Such samplers with infinity-distance guarantees are specifically desired for differentially private optimization as traditional sampling algorithms which come with total-variation distance or KL divergence bounds are insufficient to guarantee differential privacy. Our main result is an algorithm that outputs a point from a distribution $O(\varepsilon)$-close to $\pi$ in infinity-distance and requires $O((md+dL^2R^2)\times(LR+d\log(\frac{Rd+LRd}{\varepsilon r}))\times md^{\omega-1})$ arithmetic operations, where $f$ is $L$-Lipschitz, $K$ is defined by $m$ inequalities, is contained in a ball of radius $R$ and contains a ball of smaller radius $r$, and $\omega$ is the matrix-multiplication constant. In particular this runtime is logarithmic in $\frac{1}{\varepsilon}$ and significantly improves on prior works. Technically, we depart from the prior works that construct Markov chains on a $\frac{1}{\varepsilon^2}$-discretization of $K$ to achieve a sample with $O(\varepsilon)$ infinity-distance error, and present a method to convert continuous samples from $K$ with total-variation bounds to samples with infinity bounds. To achieve improved dependence on $d$, we present a "soft-threshold" version of the Dikin walk which may be of independent interest. Plugging our algorithm into the framework of the exponential mechanism yields similar improvements in the running time of $\varepsilon$-pure differentially private algorithms for optimization problems such as empirical risk minimization of Lipschitz-convex functions and low-rank approximation, while still achieving the tightest known utility bounds.