Optimization
Hybrid Deterministic-Stochastic Methods for Data Fitting
Friedlander, Michael P., Schmidt, Mark
Many structured data-fitting applications require the solution of an optimization problem involving a sum over a potentially large number of measurements. Incremental gradient algorithms offer inexpensive iterations by sampling a subset of the terms in the sum. These methods can make great progress initially, but often slow as they approach a solution. In contrast, full-gradient methods achieve steady convergence at the expense of evaluating the full objective and gradient on each iteration. We explore hybrid methods that exhibit the benefits of both approaches. Rate-of-convergence analysis shows that by controlling the sample size in an incremental gradient algorithm, it is possible to maintain the steady convergence rates of full-gradient methods. We detail a practical quasi-Newton implementation based on this approach. Numerical experiments illustrate its potential benefits.
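The sample-size idea can be illustrated with a minimal sketch. It assumes a one-parameter least-squares fit, a plain gradient step rather than the quasi-Newton implementation the abstract mentions, and a geometric growth schedule. The function name, the `growth` parameter, and the step size are hypothetical choices, not taken from the paper.

```python
import random


def hybrid_gradient_descent(a, b, step=0.5, iters=60, growth=1.5, seed=0):
    """Fit b[i] ~ w * a[i] using gradients computed on a growing sample.

    Early iterations use few terms (cheap, incremental-gradient style);
    later iterations use all terms (full-gradient style).
    """
    rng = random.Random(seed)
    n = len(a)
    w = 0.0
    batch = 1.0  # real-valued sample size, grown geometrically (an assumption)
    for _ in range(iters):
        m = min(n, max(1, int(batch)))
        idx = rng.sample(range(n), m)  # sample m terms without replacement
        # Average gradient of 0.5 * (w * a_i - b_i)^2 over the sampled terms.
        grad = sum((w * a[i] - b[i]) * a[i] for i in idx) / m
        w -= step * grad
        batch *= growth
    return w


# Noise-free synthetic data with true slope 3 (illustrative only).
a = [0.5 + i / 199 for i in range(200)]
b = [3.0 * x for x in a]
w_hat = hybrid_gradient_descent(a, b)
```

With `growth=1.5` and 200 terms, the sample covers the whole sum after about 14 iterations, so the remaining steps are ordinary full-gradient steps.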
Quadratic Basis Pursuit
Ohlsson, Henrik, Yang, Allen Y., Dong, Roy, Verhaegen, Michel, Sastry, S. Shankar
In many compressive sensing problems today, the relationship between the measurements and the unknowns could be nonlinear. The traditional treatment of such nonlinear relationships has been to approximate the nonlinearity via a linear model and to treat the subsequent unmodeled dynamics as noise. The ability to more accurately characterize nonlinear models has the potential to improve the results in both existing compressive sensing applications and those where a linear approximation does not suffice, e.g., phase retrieval. In this paper, we extend the classical compressive sensing framework to a second-order Taylor expansion of the nonlinearity. Using a lifting technique and a method we call quadratic basis pursuit, we show that the sparse signal can be recovered exactly when the sampling rate is sufficiently high. We further present efficient numerical algorithms to recover sparse signals in second-order nonlinear systems, which are considerably more difficult to solve than their linear counterparts in sparse optimization. Introduction. Consider the problem of finding the sparsest signal $x$ satisfying a system of linear equations: (1) $\min_{x \in \mathbb{R}^n} \|x\|_0$ subject to the linear system. One of the most well-known approaches is to relax the zero norm and replace it with the 1-norm: (2) $\min_{x \in \mathbb{R}^n} \|x\|_1$ subject to the same linear system. The ability to recover the optimal solution to (1) is essential in the theory of compressive sensing (CS) [4], [5], and a tremendous amount of work has been dedicated to solving and analyzing the solution of (1) and (2) in the last decade. Today CS is regarded as a powerful tool in signal processing and is widely used in many applications. For a detailed review of the literature, the reader is referred to several recent publications such as [6], [7].
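The abstract does not describe the lifting step in detail. One common form of lifting, assumed here purely for illustration, replaces the vector x with the matrix X = x x^T, so that a quadratic measurement x^T Q x becomes the linear measurement trace(Q X). The helper names and the example matrix below are hypothetical.

```python
def quadratic_form(q, x):
    """Return x^T Q x for a square matrix q (list of rows) and a vector x."""
    n = len(x)
    return sum(x[i] * q[i][j] * x[j] for i in range(n) for j in range(n))


def lift(x):
    """Return the rank-one lifted matrix X = x x^T."""
    return [[xi * xj for xj in x] for xi in x]


def trace_product(q, big_x):
    """Return trace(Q X), which is linear in the entries of X."""
    n = len(q)
    return sum(q[i][j] * big_x[j][i] for i in range(n) for j in range(n))


# A sparse signal and a symmetric measurement matrix (illustrative values).
x = [1.0, 0.0, -2.0]
q = [[1.0, 2.0, 0.0],
     [2.0, 0.0, 1.0],
     [0.0, 1.0, 3.0]]

direct = quadratic_form(q, x)        # nonlinear in x
lifted = trace_product(q, lift(x))   # linear in X = x x^T
```

Both values agree, which is what allows a quadratic measurement model to be handled with tools built for linear measurements, at the cost of working with an n-by-n unknown.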
Exact Sparse Recovery with L0 Projections
Many applications concern sparse signals, for example, detecting anomalies from the differences between consecutive images taken by surveillance cameras. This paper focuses on the problem of recovering a K-sparse signal x in N dimensions. In the mainstream framework of compressed sensing (CS), the vector x is recovered from M non-adaptive linear measurements y = xS, where S (of size N x M) is typically a Gaussian (or Gaussian-like) design matrix, through some optimization procedure such as linear programming (LP). In our proposed method, the design matrix S is generated from an $\alpha$-stable distribution with $\alpha\approx 0$. Our decoding algorithm mainly requires one linear scan of the coordinates, followed by a few iterations on a small number of coordinates which are "undetermined" in the previous iteration. Comparisons with two strong baselines, linear programming (LP) and orthogonal matching pursuit (OMP), demonstrate that our algorithm can be significantly faster in decoding speed and more accurate in recovery quality for the task of exact sparse recovery. Our procedure is robust against measurement noise. Even when there are insufficient measurements, our algorithm can still reliably recover a significant portion of the nonzero coordinates. To provide the intuition for understanding our method, we also analyze the procedure by assuming an idealistic setting. Interestingly, when K=2, the "idealized" algorithm achieves exact recovery with merely 3 measurements, regardless of N. For general K, the required sample size of the "idealized" algorithm is about 5K.
Factoring nonnegative matrices with linear programs
Bittorf, Victor, Recht, Benjamin, Re, Christopher, Tropp, Joel A.
This paper describes a new approach, based on linear programming, for computing nonnegative matrix factorizations (NMFs). The key idea is a data-driven model for the factorization where the most salient features in the data are used to express the remaining features. More precisely, given a data matrix X, the algorithm identifies a matrix C such that X approximately equals CX and C satisfies some linear constraints. The constraints are chosen to ensure that the matrix C selects features; these features can then be used to find a low-rank NMF of X. A theoretical analysis demonstrates that this approach has guarantees similar to those of the recent NMF algorithm of Arora et al. (2012). In contrast with this earlier work, the proposed method extends to more general noise models and leads to efficient, scalable algorithms. Experiments with synthetic and real datasets provide evidence that the new approach is also superior in practice. An optimized C++ implementation can factor a multigigabyte matrix in a matter of minutes.
Flexible and Approximate Computation through State-Space Reduction
In the real world, insufficient information, limited computation resources, and complex problem structures often force an autonomous agent to make a decision in time less than that required to solve the problem at hand completely. Flexible and approximate computations are two approaches to decision making under limited computation resources. Flexible computation helps an agent to flexibly allocate limited computation resources so that the overall system utility is maximized. Approximate computation enables an agent to find the best satisfactory solution within a deadline. In this paper, we present two state-space reduction methods for flexible and approximate computation: quantitative reduction to deal with inaccurate heuristic information, and structural reduction to handle complex problem structures. These two methods can be applied successively to continuously improve solution quality if more computation is available. Our results show that these reduction methods are effective and efficient, finding better solutions with less computation than some existing well-known methods.
Continuous Value Function Approximation for Sequential Bidding Policies
Boutilier, Craig, Goldszmidt, Moises, Sabata, Bikash
Market-based mechanisms such as auctions are being studied as an appropriate means for resource allocation in distributed and multiagent decision problems. When agents value resources in combination rather than in isolation, they must often deliberate about appropriate bidding strategies for a sequence of auctions offering resources of interest. We briefly describe a discrete dynamic programming model for constructing appropriate bidding policies for resources exhibiting both complementarities and substitutability. We then introduce a continuous approximation of this model, assuming that money (or the numeraire good) is infinitely divisible. Though this has the potential to reduce the computational cost of computing policies, value functions in the transformed problem do not have a convenient closed-form representation. We develop grid-based approximations for such value functions, representing them using piecewise linear approximations. We show that these methods can offer significant computational savings with relatively small cost in solution quality.
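A grid-based piecewise linear representation can be sketched as follows. This is a minimal illustration assuming a one-dimensional value function over remaining money, stored at a few grid points and linearly interpolated in between. The function name, the grid, and the stored values are hypothetical and do not come from the paper.

```python
from bisect import bisect_right


def make_piecewise_linear(grid, values):
    """Return v(x) that interpolates linearly between (grid[i], values[i]).

    `grid` must be strictly increasing. Outside the grid, the nearest
    endpoint value is used (a simplifying assumption).
    """
    if len(grid) != len(values) or len(grid) < 2:
        raise ValueError("need at least two grid points with matching values")

    def v(x):
        if x <= grid[0]:
            return values[0]
        if x >= grid[-1]:
            return values[-1]
        j = bisect_right(grid, x)  # grid[j - 1] <= x < grid[j]
        x0, x1 = grid[j - 1], grid[j]
        t = (x - x0) / (x1 - x0)
        return (1.0 - t) * values[j - 1] + t * values[j]

    return v


# Value of holding a given amount of money, stored at three grid points.
value_fn = make_piecewise_linear([0.0, 10.0, 20.0], [0.0, 5.0, 6.0])
```

Only the grid values need to be stored and updated, which is where the computational savings of such a representation would come from. The interpolation error between grid points is the corresponding cost in solution quality.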
Active Learning of Inverse Models with Intrinsically Motivated Goal Exploration in Robots
Baranes, Adrien, Oudeyer, Pierre-Yves
We introduce the Self-Adaptive Goal Generation - Robust Intelligent Adaptive Curiosity (SAGG-RIAC) architecture as an intrinsically motivated goal exploration mechanism which allows active learning of inverse models in high-dimensional redundant robots. This allows a robot to efficiently and actively learn distributions of parameterized motor skills/policies that solve a corresponding distribution of parameterized tasks/goals. The architecture makes the robot actively sample novel parameterized tasks in the task space, based on a measure of competence progress, each of which triggers low-level goal-directed learning of the motor policy parameters that allow it to be solved. For both learning and generalization, the system leverages regression techniques which allow it to infer the motor policy parameters corresponding to a given novel parameterized task, based on the previously learnt correspondences between policy and task parameters. We present experiments with high-dimensional continuous sensorimotor spaces in three different robotic setups: 1) learning the inverse kinematics in a highly-redundant robotic arm, 2) learning omnidirectional locomotion with motor primitives in a quadruped robot, 3) an arm learning to control a fishing rod with a flexible wire. We show that 1) exploration in the task space can be a lot faster than exploration in the actuator space for learning inverse models in redundant robots; 2) selecting goals maximizing competence progress creates developmental trajectories driving the robot to progressively focus on tasks of increasing complexity and is statistically significantly more efficient than selecting tasks randomly, as well as more efficient than different standard active motor babbling methods; 3) this architecture allows the robot to actively discover which parts of its task space it can learn to reach and which parts it cannot.
Efficient Sparse Group Feature Selection via Nonconvex Optimization
Xiang, Shuo, Shen, Xiaotong, Ye, Jieping
Sparse feature selection has been demonstrated to be effective in handling high-dimensional data. While promising, most of the existing works use convex methods, which may be suboptimal in terms of the accuracy of feature selection and parameter estimation. In this paper, we expand a nonconvex paradigm to sparse group feature selection, which is motivated by applications that require identifying the underlying group structure and performing feature selection simultaneously. The main contributions of this article are twofold: (1) statistically, we introduce a nonconvex sparse group feature selection model which can reconstruct the oracle estimator, so that consistent feature selection and parameter estimation can be achieved; (2) computationally, we propose an efficient algorithm that is applicable to large-scale problems. Numerical results suggest that the proposed nonconvex method compares favorably against its competitors on synthetic data and real-world applications, thus achieving the desired goal of delivering high performance.
Financial Portfolio Optimization: Computationally guided agents to investigate, analyse and invest!?
Financial portfolio optimization is a widely studied problem in the mathematical, statistical, financial, and computational literature. It amounts to determining an optimal combination of weights associated with the financial assets held in a portfolio. In practice, it faces challenges by virtue of varying mathematical formulations, parameters, business constraints, and complex financial instruments. The empirical nature of the data is no longer one-sided; it reflects upside and downside trends with repeated yet unidentifiable cyclic behaviours, potentially caused by high-frequency volatile movements in asset trades. Portfolio optimization under such circumstances is theoretically and computationally challenging. This work presents a novel mechanism to reach an optimal solution by encoding a variety of optimal solutions in a solution bank to guide the search process for the global investment objective formulation. It conceptualizes the role of individual solver agents that contribute optimal solutions to a bank of solutions and of a super-agent solver that learns from the solution bank, and thus reflects a knowledge-based, computationally guided agents approach to investigate, analyse, and reach an optimal solution for informed investment decisions. A conceptual understanding of classes of solver agents that represent varying problem formulations is discussed, along with mathematically oriented deterministic solvers and stochastic-search-driven evolutionary and swarm-intelligence-based techniques for optimal weights. An algorithmic implementation is presented through an enhanced neighbourhood generation mechanism in the Simulated Annealing algorithm. A framework for the inclusion of heuristic knowledge and human expertise from the financial literature on investment decision making is reflected via the introduction of controlled perturbation strategies using a decision matrix for neighbourhood generation.
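As a rough illustration of neighbourhood generation inside Simulated Annealing, the sketch below perturbs a weight vector by moving a small amount of weight from one asset to another, which keeps the weights nonnegative and summing to one. It does not reproduce the decision-matrix perturbation strategies described above. The cost function (variance of uncorrelated assets), the variances, and all parameter values are hypothetical.

```python
import math
import random


def neighbour(weights, rng, max_shift=0.05):
    """Move a small amount of weight from one asset to another."""
    w = list(weights)
    i, j = rng.sample(range(len(w)), 2)
    shift = min(w[i], rng.uniform(0.0, max_shift))  # never go below zero
    w[i] -= shift
    w[j] += shift
    return w


def anneal(cost, start, iters=2000, t0=1.0, cooling=0.995, seed=0):
    """Minimise `cost` over portfolio weights with simulated annealing."""
    rng = random.Random(seed)
    current, current_cost = list(start), cost(start)
    best, best_cost = list(current), current_cost
    temp = t0
    for _ in range(iters):
        cand = neighbour(current, rng)
        cand_cost = cost(cand)
        delta = cand_cost - current_cost
        # Always accept improvements; accept worse moves with Boltzmann probability.
        if delta <= 0 or rng.random() < math.exp(-delta / temp):
            current, current_cost = cand, cand_cost
            if current_cost < best_cost:
                best, best_cost = list(current), current_cost
        temp *= cooling  # geometric cooling schedule (an assumption)
    return best, best_cost


# Hypothetical per-asset variances; assets are assumed uncorrelated.
variances = [0.04, 0.09, 0.01]


def portfolio_variance(w):
    return sum(v * x * x for v, x in zip(variances, w))


start = [1.0 / 3.0] * 3
best_weights, best_cost = anneal(portfolio_variance, start)
```

Heuristic knowledge of the kind the abstract mentions could enter through `neighbour`, for example by biasing which pair of assets is chosen. That extension is not shown here.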
Robust PCA and subspace tracking from incomplete observations using L0-surrogates
Hage, Clemens, Kleinsteuber, Martin
Many applications in data analysis rely on the decomposition of a data matrix into a low-rank and a sparse component. Existing methods that tackle this task use the nuclear norm and L1-cost functions as convex relaxations of the rank constraint and the sparsity measure, respectively, or employ thresholding techniques. We propose a method that allows for reconstructing and tracking a subspace of upper-bounded dimension from incomplete and corrupted observations. It does not require any a priori information about the number of outliers. The core of our algorithm is an intrinsic Conjugate Gradient method on the set of orthogonal projection matrices, the so-called Grassmannian. Non-convex sparsity measures are used for outlier detection, which leads to improved performance in terms of robustly recovering and tracking the low-rank matrix. In particular, our approach can cope with more outliers and with an underlying matrix of higher rank than other state-of-the-art methods.