Goto

Collaborating Authors

 Optimization


Learning to Optimize via Information-Directed Sampling

Neural Information Processing Systems

We propose information-directed sampling - a new algorithm for online optimization problems in which a decision-maker must balance between exploration and exploitation while learning from partial feedback. Each action is sampled in a manner that minimizes the ratio between the square of expected single-period regret and a measure of information gain: the mutual information between the optimal action and the next observation. We establish an expected regret bound for information-directed sampling that applies across a very general class of models and scales with the entropy of the optimal action distribution. For the widely studied Bernoulli and linear bandit models, we demonstrate simulation performance surpassing popular approaches, including upper confidence bound algorithms, Thompson sampling, and knowledge gradient. Further, we present simple analytic examples illustrating that informationdirected sampling can dramatically outperform upper confidence bound algorithms and Thompson sampling due to the way it measures information gain.


Efficient learning by implicit exploration in bandit problems with side observations

Neural Information Processing Systems

We consider online learning problems under a a partial observability model capturing situations where the information conveyed to the learner is between full information and bandit feedback. In the simplest variant, we assume that in addition to its own loss, the learner also gets to observe losses of some other actions. The revealed losses depend on the learner's action and a directed observation system chosen by the environment. For this setting, we propose the first algorithm that enjoys near-optimal regret guarantees without having to know the observation system before selecting its actions. Along similar lines, we also define a new partial information setting that models online combinatorial optimization problems where the feedback received by the learner is between semi-bandit and full feedback. As the predictions of our first algorithm cannot be always computed efficiently in this setting, we propose another algorithm with similar properties and with the benefit of always being computationally efficient, at the price of a slightly more complicated tuning mechanism. Both algorithms rely on a novel exploration strategy called implicit exploration, which is shown to be more efficient both computationally and information-theoretically than previously studied exploration strategies for the problem.


Beyond the Birkhoff Polytope: Convex Relaxations for Vector Permutation Problems

Neural Information Processing Systems

We modify the recent convex formulation of the 2-SUM problem introduced by Fogel et al. [2] to use this polytope, and demonstrate how we can attain results of similar quality in significantly less computational time for large n. To our knowledge, this is the first usage of Goemans' compact formulation of the permutahedron in a convex optimization problem. We also introduce a simpler regularization scheme for this convex formulation of the 2-SUM problem that yields good empirical results.


Tight Bounds for Influence in Diffusion Networks and Application to Bond Percolation and Epidemiology

Neural Information Processing Systems

In this paper, we derive theoretical bounds for the long-term influence of a node in an Independent Cascade Model (ICM). We relate these bounds to the spectral radius of a particular matrix and show that the behavior is sub-critical when this spectral radius is lower than 1. More specifically, we point out that, in general networks, the sub-critical regime behaves in O( n) where n is the size of the network, and that this upper bound is met for star-shaped networks. We apply our results to epidemiology and percolation on arbitrary networks, and derive a bound for the critical value beyond which a giant connected component arises. Finally, we show empirically the tightness of our bounds for a large family of networks.


Recursive Inversion Models for Permutations

Neural Information Processing Systems

We develop a new exponential family probabilistic model for permutations that can capture hierarchical structure and that has the Mallows and generalized Mallows models as subclasses. We describe how to do parameter estimation and propose an approach to structure search for this class of models. We provide experimental evidence that this added flexibility both improves predictive performance and enables a deeper understanding of collections of permutations.


Predictive Entropy Search for Efficient Global Optimization of Black-box Functions

Neural Information Processing Systems

We propose a novel information-theoretic approach for Bayesian optimization called Predictive Entropy Search (PES). At each iteration, PES selects the next evaluation point that maximizes the expected information gained with respect to the global maximum. PES codifies this intractable acquisition function in terms of the expected reduction in the differential entropy of the predictive distribution. This reformulation allows PES to obtain approximations that are both more accurate and efficient than other alternatives such as Entropy Search (ES). Furthermore, PES can easily perform a fully Bayesian treatment of the model hyperparameters while ES cannot. We evaluate PES in both synthetic and realworld applications, including optimization problems in machine learning, finance, biotechnology, and robotics. We show that the increased accuracy of PES leads to significant gains in optimization performance.


Export Reviews, Discussions, Author Feedback and Meta-Reviews

Neural Information Processing Systems

This paper considers efficient implementations of non-convex sparse learning formulations. Recent studies have shown that many convex sparse learning formulations are inferior to their non-convex counterparts in both theory and practice. However, it is still quite challenging to efficiently solve non-convex sparse optimization problems for large-scale data. This paper presents a novel algorithm HONOR which is applicable for a wide range of non-convex sparse learning formulations. One of the key ideas in HONOR is to incorporate the second-order information to greatly speed up the convergence, while unlike most existing second-order methods it avoids solving a regularized quadratic programming and only involves matrix-vector multiplications without explicitly forming the inverse Hessian matrix. Thus, HONOR has a low computational complexity at each iteration and it is scalable to large-size problems.


Review for NeurIPS paper: First-Order Methods for Large-Scale Market Equilibrium Computation

Neural Information Processing Systems

Weaknesses: After reading the authors response, I am convinced by the contribution of the authors. I really hope that they will include the answers that they provided to the final version of the paper because I find that it totally changes the reader's understanding of the paper. In particular, regarding the fact that obtaining a linear convergence rate is not obvious and was not done before. First, I found the novelty of the work a bit limited because not enough novelty was given in the "theory" side nor in the "numerical" side. The paper seems to "just" apply convex optimization methods to finding market equilibria and giving theoretical justifications for why this approach is sound.


Review for NeurIPS paper: Learning to Learn with Feedback and Local Plasticity

Neural Information Processing Systems

Weaknesses: As pointed out before, one major advantage of their method is that during learning no backpropagation is required. The authors rightly mention that the same advantage applies to meta-learning in recurrent neural networks (e.g. Not of all their contributions are novel: 1. "We provide support for the idea that the credit assignment problem itself may be viewed as an optimization problem, amenable to solution via meta-learning." This has been shown many times before, e.g. This also has been shown many times before, e.g.


Export Reviews, Discussions, Author Feedback and Meta-Reviews

Neural Information Processing Systems

Overview: This paper studies the benefits of augmenting the linear programming relaxation of the maximum a-posteriori (MAP) inference problem in graphical models with a quadratic term, thereby achieving strong convexity. Such augmented formulations are obtained both from the original primal and dual formulations, and in each case the resulting primal-dual relationship is studied. Prior work has mostly focused on smoothing the LP formulation using a softmax/entropy term, with a few notable exceptions, such as [5], [17] and [18]. Rather than those previous approaches, which employ a quadratic term in the sub-problems of either a *proximal* or a *alternating direction* scheme, in the present manuscript, the quadratic smoothing term is added directly. This can in some way be seen as a naive approach: In comparison to proximal or alternating direction schemes, convergence to the global optimum of the original problem is no longer guaranteed, and the approximation quality directly depends on the strength of the augmentation term.