Goto

Collaborating Authors

 Optimization


Deep One-bit Compressive Autoencoding

arXiv.org Machine Learning

Parameterized mathematical models play a central role in understanding and design of complex information systems. However, they often cannot take into account the intricate interactions innate to such systems. On the contrary, purely data-driven approaches do not need explicit mathematical models for data generation and have a wider applicability at the cost of interpretability. In this paper, we consider the design of a one-bit compressive autoencoder, and propose a novel hybrid model-based and data-driven methodology that allows us to not only design the sensing matrix for one-bit data acquisition, but also allows for learning the latent-parameters of an iterative optimization algorithm specifically designed for the problem of one-bit sparse signal recovery. Our results demonstrate a significant improvement compared to state-of-the-art model-based algorithms.



Analysis of the Optimization Landscapes for Overcomplete Representation Learning

arXiv.org Machine Learning

We study nonconvex optimization landscapes for learning overcomplete representations, including learning (i) sparsely used overcomplete dictionaries and (ii) convolutional dictionaries, where these unsupervised learning problems find many applications in high-dimensional data analysis. Despite the empirical success of simple nonconvex algorithms, theoretical justifications of why these methods work so well are far from satisfactory. In this work, we show these problems can be formulated as $\ell^4$-norm optimization problems with spherical constraint, and study the geometric properties of their nonconvex optimization landscapes. For both problems, we show the nonconvex objectives have benign (global) geometric structures, in the sense that every local minimizer is close to one of the target solutions and every saddle point exhibits negative curvature. This discovery enables the development of guaranteed global optimization methods using simple initializations. For both problems, we show the nonconvex objectives have benign geometric structures -- every local minimizer is close to one of the target solutions and every saddle point exhibits negative curvature -- either in the entire space or within a sufficiently large region. This discovery ensures local search algorithms (such as Riemannian gradient descent) with simple initializations approximately find the target solutions. Finally, numerical experiments justify our theoretical discoveries.


Entropy Regularization with Discounted Future State Distribution in Policy Gradient Methods

arXiv.org Artificial Intelligence

The policy gradient theorem is defined based on an objective with respect to the initial distribution over states. In the discounted case, this results in policies that are optimal for one distribution over initial states, but may not be uniformly optimal for others, no matter where the agent starts from. Furthermore, to obtain unbiased gradient estimates, the starting point of the policy gradient estimator requires sampling states from a normalized discounted weighting of states. However, the difficulty of estimating the normalized discounted weighting of states, or the stationary state distribution, is quite well-known. Additionally, the large sample complexity of policy gradient methods is often attributed to insufficient exploration, and to remedy this, it is often assumed that the restart distribution provides sufficient exploration in these algorithms. In this work, we propose exploration in policy gradient methods based on maximizing entropy of the discounted future state distribution. The key contribution of our work includes providing a practically feasible algorithm to estimate the normalized discounted weighting of states, i.e, the \textit{discounted future state distribution}. We propose that exploration can be achieved by entropy regularization with the discounted state distribution in policy gradients, where a metric for maximal coverage of the state space can be based on the entropy of the induced state distribution. The proposed approach can be considered as a three time-scale algorithm and under some mild technical conditions, we prove its convergence to a locally optimal policy. Experimentally, we demonstrate usefulness of regularization with the discounted future state distribution in terms of increased state space coverage and faster learning on a range of complex tasks.


Exploiting Model Sparsity in Adaptive MPC: A Compressed Sensing Viewpoint

arXiv.org Machine Learning

This paper proposes an Adaptive Stochastic Model Predictive Control (MPC) strategy for stable linear time-invariant systems in the presence of bounded disturbances. We consider multi-input, multi-output systems that can be expressed by a Finite Impulse Response (FIR) model. The parameters of the FIR model corresponding to each output are unknown but assumed sparse. We estimate these parameters using the Recursive Least Squares algorithm. The estimates are then improved using set-based bounds obtained by solving the Basis Pursuit Denoising [1] problem. Our approach is able to handle hard input constraints and probabilistic output constraints. Using tools from distributionally robust optimization, we reformulate the probabilistic output constraints as tractable convex second-order cone constraints, which enables us to pose our MPC design task as a convex optimization problem. The efficacy of the developed algorithm is highlighted with a thorough numerical example, where we demonstrate performance gain over the counterpart algorithm of [2], which does not utilize the sparsity information of the system impulse response parameters during control design.


Risk-Averse Trust Region Optimization for Reward-Volatility Reduction

arXiv.org Machine Learning

In real-world decision-making problems, for instance in the fields of finance, robotics or autonomous driving, keeping uncertainty under control is as important as maximizing expected returns. Risk aversion has been addressed in the reinforcement learning literature through risk measures related to the variance of returns. However, in many cases, the risk is measured not only on a long-term perspective, but also on the step-wise rewards (e.g., in trading, to ensure the stability of the investment bank, it is essential to monitor the risk of portfolio positions on a daily basis). In this paper, we define a novel measure of risk, which we call reward volatility, consisting of the variance of the rewards under the state-occupancy measure. We show that the reward volatility bounds the return variance so that reducing the former also constrains the latter. We derive a policy gradient theorem with a new objective function that exploits the mean-volatility relationship, and develop an actor-only algorithm. Furthermore, thanks to the linearity of the Bellman equations defined under the new objective function, it is possible to adapt the well-known policy gradient algorithms with monotonic improvement guarantees such as TRPO in a risk-averse manner. Finally, we test the proposed approach in two simulated financial environments.


A case study of Consistent Vehicle Routing Problem with Time Windows

arXiv.org Artificial Intelligence

We develop a heuristic solution method for the Consistent Vehicle Routing Problem with Time Windows (ConVRPTW), motivated by a real-world application at a distribution center of a food company. Additional to standard VRPTW restrictions, ConVRP assigns to each customer just one fixed driver to fulfill their orders during the complete multi-period planning horizon. For each driver and day of the planning horizon, a route has to be determined to serve all their assigned customers with positive demand. The customers do not buy every day and the frequency with which they do so is irregular. Moreover, the quantities ordered change from one order to another. This causes difficulties in the daily routing, negatively impacting the service level of the company. Unlike the previous works on ConVRP, where the number of drivers is fixed a priori and only the total travel time is minimized, we give priority to minimizing the number of drivers. To evaluate the performance of the heuristic, we compare the solution of the heuristic with the routing plan in use by the food company. The results show significant improvements, with a lower number of trucks and a higher rate of orders delivered within the prescribed time window.


Risk-Aware MMSE Estimation

arXiv.org Machine Learning

Despite the simplicity and intuitive interpretation of Minimum Mean Squared Error (MMSE) estimators, their effectiveness in certain scenarios is questionable. Indeed, minimizing squared errors on average does not provide any form of stability, as the volatility of the estimation error is left unconstrained. When this volatility is statistically significant, the difference between the average and realized performance of the MMSE estimator can be drastically different. To address this issue, we introduce a new risk-aware MMSE formulation which trades between mean performance and risk by explicitly constraining the expected predictive variance of the involved squared error. We show that, under mild moment boundedness conditions, the corresponding risk-aware optimal solution can be evaluated explicitly, and has the form of an appropriately biased nonlinear MMSE estimator. We further illustrate the effectiveness of our approach via several numerical examples, which also showcase the advantages of risk-aware MMSE estimation against risk-neutral MMSE estimation, especially in models involving skewed, heavy-tailed distributions.


Optimization algorithms inspired by the geometry of dissipative systems

arXiv.org Machine Learning

Optimization algorithms inspired by the geometry of dissipative systems Alessandro Bravetti 1, Maria L. Daza-Torres 2, Hugo Flores-Arguedas 3, and Michael Betancourt 4 1 Instituto de Investigaciones en Matemáticas Aplicadas y en Sistemas (IIMAS), Universidad Nacional Autónoma de México, A.P. 70-543, 04510 Ciudad de México, México alessandro.bravetti@iimas.unam.mx 2 Universidad de Guadalajara, Guadalajara, México, mdazatorres@cimat.mx 3 Centro de Investigación en Matemáticas (CIMAT), Guanajuato, México, hugo.flores@cimat.mx 4 Symplectomorphic LLC, New York, USA, betan@symplectomorphic.com Abstract Accelerated gradient methods are a powerful optimization tool in machine learning and statistics but their development has traditionally been driven by heuristic motivations. Recent research, however, has demonstrated that these methods can be derived as discretizations of dynamical systems, which in turn has provided a basis for more systematic investigations, especially into the structure of those dynamical systems and their structure preserving discretizations. In this work we introduce dynamical systems defined through a contact geometry which are not only naturally suited to the optimization goal but also subsume all previous methods based on geometric dynamical systems. These contact dynamical systems also admit a natural, robust discretization through geometric contact integrators. We demonstrate these features in paradigmatic examples which show that we can indeed obtain optimization algorithms that achieve oracle lower bounds on convergence rates while also improving on previous proposals in terms of stability. Keywords: optimization, accelerated gradient, geometric integrators, contact geometry 1 arXiv:1912.02928v1 Despite their practical utility and explicit convergence bounds, accelerated gradient methods have long been difficult to motivate from a fundamental theory.


A Clustering Approach to Edge Controller Placement in Software Defined Networks with Cost Balancing

arXiv.org Machine Learning

A Clustering Approach to Edge Controller Placement in Software Defined Networks with Cost Balancing Reza Soleymanifar, Amber Srivastava, Carolyn Beck, Srinivasa Salapaka Abstract -- In this work we introduce two novel deterministic annealing based clustering algorithms to address the problem of Edge Controller Placement (ECP) in wireless edge networks. These networks lie at the core of the fifth generation (5G) wireless systems and beyond. These algorithms, ECP-LL and ECP-LB, address the dominant leader-less and leader-based controller placement topologies and have linear computational complexity in terms of network size, maximum number of clusters and dimensionality of data. Each algorithm tries to place controllers close to edge node clusters and not far away from other controllers to maintain a reasonable balance between synchronization and delay costs. While the ECP problem can be conveniently expressed as a multi-objective mixed integer nonlinear program (MINLP), our algorithms outperform state of art MINLP solver, BARON both in terms of accuracy and speed. Our proposed algorithms have the competitive edge of avoiding poor local minima through a Shannon entropy term in the clustering objective function. Most ECP algorithms are highly susceptible to poor local minima and greatly depend on initialization. Keywords: Clustering, deterministic annealing, 5G networks, software defined networks, wireless edge networks, edge controller placement I.