Optimization


Stochastic Approximation for Risk-aware Markov Decision Processes

arXiv.org Artificial Intelligence

The analysis of complex systems such as inventory control, financial markets, waste-to-energy plants and computer networks is difficult because of the inherent uncertainties in these systems. Risk-aware optimization offers a possible remedy by giving stronger reliability guarantees than the risk-neutral case. Furthermore, it allows expression of the risk attitude of the decision maker. Risk awareness is especially important in sequential decision making because of the dynamic nature of the uncertainty. Markov decision processes (MDPs), introduced by Bellman in [10], provide a mathematical framework for modeling sequential decision making in situations where outcomes are partly random and partly under the control of the decision maker. However, in many cases the exact model of the underlying Markov decision process is not known and one can only observe the trajectory of states, actions, and rewards/costs.


Market Self-Learning of Signals, Impact and Optimal Trading: Invisible Hand Inference with Free Energy

arXiv.org Artificial Intelligence

We present a simple model of a non-equilibrium self-organizing market where asset prices are partially driven by investment decisions of a bounded-rational agent. The agent acts in a stochastic market environment driven by various exogenous "alpha" signals, the agent's own actions (via market impact), and noise. Unlike traditional agent-based models, our agent aggregates all traders in the market, rather than being a representative agent. Therefore, it can be identified with a bounded-rational component of the market itself, providing a particular implementation of an Invisible Hand market mechanism. In such a setting, market dynamics are modeled as a fictitious self-play of this bounded-rational market agent in its adversarial stochastic environment. Because the rewards obtained by this self-playing market agent are not observed from market data, we formulate and solve a simple model of such market dynamics based on a neuroscience-inspired Bounded Rational Information Theoretic Inverse Reinforcement Learning (BRIT-IRL). This results in effective asset price dynamics with a non-linear mean reversion, which in our model is generated dynamically rather than being postulated. We argue that our model can be used in a similar way to the Black-Litterman model. In particular, it represents, in a simple modeling framework, market views of common predictive signals, market impacts and implied optimal dynamic portfolio allocations, and can be used to assess values of private signals. Moreover, it allows one to quantify a "market-implied" optimal investment strategy, along with a measure of market rationality. Our approach is numerically light, and can be implemented using standard off-the-shelf software such as TensorFlow.


Efficient end-to-end learning for quantizable representations

arXiv.org Machine Learning

Embedding representation learning via neural networks is at the core of modern similarity-based search. While much effort has been put into developing algorithms for learning binary Hamming code representations for search efficiency, this still requires a linear scan of the entire dataset for each query and trades off search accuracy through binarization. To this end, we consider the problem of directly learning a quantizable embedding representation and the sparse binary hash code end-to-end. These can be used to construct an efficient hash table that not only significantly reduces the number of data points searched but also achieves state-of-the-art search accuracy, outperforming previous state-of-the-art deep metric learning methods. We also show that finding the optimal sparse binary hash code in a mini-batch can be computed exactly in polynomial time by solving a minimum cost flow problem. Our results on the Cifar-100 and ImageNet datasets show state-of-the-art search accuracy in precision@k and NMI metrics while providing up to 98X and 478X search speedup, respectively, over exhaustive linear search.


Local Saddle Point Optimization: A Curvature Exploitation Approach

arXiv.org Machine Learning

Gradient-based optimization methods are the most popular choice for finding local optima for classical minimization and saddle point problems. Here, we highlight a systemic issue of gradient dynamics that arises for saddle point problems, namely the presence of undesired stable stationary points that are not local optima. We propose a novel optimization approach that exploits curvature information in order to escape from these undesired stationary points. We prove that different optimization methods, including the gradient method and Adagrad, equipped with curvature exploitation can escape non-optimal stationary points. We also provide empirical results on common saddle point problems which confirm the advantage of using curvature exploitation.


Graph Signal Sampling via Reinforcement Learning

arXiv.org Artificial Intelligence

Modern information processing systems generate massive datasets which are often strongly heterogeneous, e.g., partially labeled mixtures of different media (audio, video, text). A quite successful approach to such datasets is based on representing the data as networks or graphs. In particular, we represent datasets by graph signals defined over an underlying graph, which reflects similarities between individual data points. The graph signal values encode label information which often conforms to a clustering hypothesis, i.e., the signal values (labels) of close-by nodes (similar data points) are similar. Two core problems considered within graph signal processing (GSP) are (i) how to sample them, i.e., which signal values provide the most information about the entire dataset, and (ii) how to recover the entire graph signal from these few signal values (samples). These problems have been studied in [1]-[6] which discussed convex optimization methods for recovering a graph signal from a small number of signal values observed on the nodes belonging to a given (small) sampling set. Sufficient conditions on the sampling set and clustering structure such that these convex methods are successful have been discussed in [4], [7].


The Global Optimization Geometry of Shallow Linear Neural Networks

arXiv.org Machine Learning

We examine the squared error loss landscape of shallow linear neural networks. By utilizing a regularizer on the training samples, we show, with significantly milder assumptions than previous works, that the corresponding optimization problems have benign geometric properties: there are no spurious local minima and the Hessian at every saddle point has at least one negative eigenvalue. This means that at every saddle point there is a directional negative curvature which algorithms can utilize to further decrease the objective value. These geometric properties imply that many local search algorithms (including gradient descent, which is widely utilized for training neural networks) can provably solve the training problem with global convergence. The additional regularizer has no effect on the global minimum value; rather, it plays a useful role in shrinking the set of critical points. Experiments show that this additional regularizer also speeds the convergence of iterative algorithms for solving the training optimization problem in certain cases.
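
To make the "negative eigenvalue at a saddle point" property concrete, the following is a minimal sketch on a toy problem that is not taken from the paper: a scalar two-layer linear network with a single training sample (x = 1, y = 1) and no regularizer. The functions `loss` and `hessian` are hypothetical names chosen for illustration. At the origin the gradient vanishes, the Hessian has eigenvalues -1 and +1, and a small step along the negative-curvature eigenvector lowers the loss.

```python
import numpy as np

def loss(w):
    # Toy squared error for a scalar "network" w2 * w1 * x with x = 1, y = 1.
    # This is an illustrative assumption, not the regularized loss from the paper.
    w1, w2 = w
    return 0.5 * (w1 * w2 - 1.0) ** 2

def hessian(w):
    # Analytic Hessian of the toy loss above.
    w1, w2 = w
    off_diag = 2.0 * w1 * w2 - 1.0
    return np.array([[w2 ** 2, off_diag],
                     [off_diag, w1 ** 2]])

saddle = np.zeros(2)                 # gradient is zero here, loss is 0.5
eigvals, eigvecs = np.linalg.eigh(hessian(saddle))  # eigenvalues in ascending order
descent_dir = eigvecs[:, 0]          # eigenvector of the most negative eigenvalue

step = saddle + 0.5 * descent_dir    # move along the negative-curvature direction
print("eigenvalues at saddle:", eigvals)
print("loss at saddle:", loss(saddle), "loss after step:", loss(step))
```

The sign of the eigenvector returned by `np.linalg.eigh` is arbitrary, but in this toy problem either sign decreases the loss, since the loss depends only on the product of the two weights.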


Online Bandit Linear Optimization: A Study

arXiv.org Artificial Intelligence

This article introduces the concepts around Online Bandit Linear Optimization and explores an efficient setup called SCRiBLe (Self-Concordant Regularization in Bandit Learning) created by Abernethy et al. \cite{abernethy}. The SCRiBLe setup and algorithm yield an $O(\sqrt{T})$ regret bound and a run time complexity bound that is polynomial in the dimension of the input space. In this article we build up to the bandit linear optimization case and study SCRiBLe.


On the Direction of Discrimination: An Information-Theoretic Analysis of Disparate Impact in Machine Learning

arXiv.org Machine Learning

In the context of machine learning, disparate impact refers to a form of systematic discrimination whereby the output distribution of a model depends on the value of a sensitive attribute (e.g., race or gender). In this paper, we propose an information-theoretic framework to analyze the disparate impact of a binary classification model. We view the model as a fixed channel, and quantify disparate impact as the divergence in output distributions over two groups. Our aim is to find a correction function that can perturb the input distributions of each group to align their output distributions. We present an optimization problem that can be solved to obtain a correction function that will make the output distributions statistically indistinguishable. We derive closed-form expressions to efficiently compute the correction function, and demonstrate the benefits of our framework on a recidivism prediction problem based on the ProPublica COMPAS dataset.
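
As a rough illustration of quantifying disparate impact as a divergence between output distributions, the sketch below compares the positive-prediction rates of a binary classifier on two groups using the Kullback-Leibler divergence between Bernoulli distributions. The abstract does not say which divergence the paper uses, so KL is an assumption here, and the prediction lists and function names are hypothetical.

```python
import math

def positive_rate(preds):
    # Fraction of positive (1) predictions in a list of binary outputs.
    return sum(preds) / len(preds)

def bernoulli_kl(p, q, eps=1e-12):
    # KL divergence D(Bernoulli(p) || Bernoulli(q)), with clipping to avoid log(0).
    p = min(max(p, eps), 1.0 - eps)
    q = min(max(q, eps), 1.0 - eps)
    return p * math.log(p / q) + (1.0 - p) * math.log((1.0 - p) / (1.0 - q))

# Hypothetical model outputs for two groups defined by a sensitive attribute.
preds_group_a = [1, 1, 0, 1, 0, 1, 1, 0]
preds_group_b = [0, 1, 0, 0, 0, 1, 0, 0]

rate_a = positive_rate(preds_group_a)
rate_b = positive_rate(preds_group_b)
gap = bernoulli_kl(rate_a, rate_b)
print(f"positive rate A={rate_a:.3f}, B={rate_b:.3f}, KL divergence={gap:.4f}")
```

A divergence of zero would mean the two output distributions coincide; a correction function of the kind described above would aim to drive this quantity toward zero.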


Human-Machine Collaborative Optimization via Apprenticeship Scheduling

arXiv.org Artificial Intelligence

Coordinating agents to complete a set of tasks with intercoupled temporal and resource constraints is computationally challenging, yet human domain experts can solve these difficult scheduling problems using paradigms learned through years of apprenticeship. A process for manually codifying this domain knowledge within a computational framework is necessary to scale beyond the "single-expert, single-trainee" apprenticeship model. However, human domain experts often have difficulty describing their decision-making processes, causing the codification of this knowledge to become laborious. We propose a new approach for capturing domain-expert heuristics through a pairwise ranking formulation. Our approach is model-free and does not require enumerating or iterating through a large state space. We empirically demonstrate that this approach accurately learns multifaceted heuristics on a synthetic data set incorporating job-shop scheduling and vehicle routing problems, as well as on two real-world data sets consisting of demonstrations of experts solving a weapon-to-target assignment problem and a hospital resource allocation problem. We also demonstrate that policies learned from human scheduling demonstration via apprenticeship learning can substantially improve the efficiency of a branch-and-bound search for an optimal schedule. We employ this human-machine collaborative optimization technique on a variant of the weapon-to-target assignment problem. We demonstrate that this technique generates solutions substantially superior to those produced by human domain experts at a rate up to 9.5 times faster than an optimization approach and can be applied to optimally solve problems twice as complex as those solved by a human demonstrator.


Synthesizing Efficient Solutions for Patrolling Problems in the Internet Environment

arXiv.org Artificial Intelligence

We propose an algorithm for constructing efficient patrolling strategies in the Internet environment, where the protected targets are nodes connected to the network and the patrollers are software agents capable of detecting/preventing undesirable activities on the nodes. The algorithm is based on a novel compositional principle designed for a special class of strategies, and it can quickly construct (sub)optimal solutions even if the number of targets reaches hundreds of millions.