Optimization
Can rationality be measured?
This paper studies whether rationality can be computed. Rationality is defined as the use of complete information, which is processed with a perfect biological or physical brain, in an optimized fashion. To compute rationality one needs to quantify how complete is the information, how perfect is the physical or biological brain and how optimized is the entire decision making system. The rationality of a model (i.e. physical or biological brain) is measured by the expected accuracy of the model. The rationality of the optimization procedure is measured as the ratio of the achieved objective (i.e. utility) to the global objective. The overall rationality of a decision is measured as the product of the rationality of the model and the rationality of the optimization procedure. The conclusion reached is that rationality can be computed for convex optimization problems.
Joint Embedding Learning and Low-Rank Approximation: A Framework for Incomplete Multi-view Learning
Tao, Hong, Hou, Chenping, Yi, Dongyun, Zhu, Jubo
In real-world applications, not all instances in multi-view data are fully represented. To deal with incomplete multi-view data, traditional multi-view algorithms usually throw away the incomplete instances, resulting in loss of available information. To overcome this loss, Incomplete Multi-view Learning (IML) has become a hot research topic. In this paper, we propose a general IML framework for unifying existing IML methods and gaining insight into IML. The proposed framework jointly performs embedding learning and low-rank approximation. Concretely, it approximates the incomplete data by a set of low-rank matrices and learns a full and common embedding by linear transformation. Several existing IML methods can be unified as special cases of the framework. More interestingly, some linear transformation based full-view methods can be adapted to IML directly with the guidance of the framework. This bridges the gap between full multi-view learning and IML. Moreover, the framework can provide guidance for developing new algorithms. For illustration, within the framework, we propose a specific method, termed as Incomplete Multi-view Learning with Block Diagonal Representation (IML-BDR). Based on the assumption that the sampled examples have approximate linear subspace structure, IML-BDR uses the block diagonal structure prior to learn the full embedding, which would lead to more correct clustering. A convergent alternating iterative algorithm with the Successive Over-Relaxation (SOR) optimization technique is devised for optimization. Experimental results on various datasets demonstrate the effectiveness of IML-BDR.
TD-Regularized Actor-Critic Methods
Parisi, Simone, Tangkaratt, Voot, Peters, Jan, Khan, Mohammad Emtiyaz
Actor-critic methods can achieve incredible performance on difficult reinforcement learning problems, but they are also prone to instability. This is partly due to the interaction between the actor and critic during learning, e.g., an inaccurate step taken by one of them might adversely affect the other and destabilize the learning. To avoid such issues, we propose to regularize the learning objective of the actor by penalizing the temporal difference (TD) error of the critic. This improves stability by avoiding large steps in the actor update whenever the critic is highly inaccurate. The resulting method, which we call the TD-regularized actor-critic method, is a simple plug-and-play approach to improve stability and overall performance of the actor-critic methods. Evaluations on standard benchmarks confirm this.
Reinforcement Learning for Adaptive Caching with Dynamic Storage Pricing
Sadeghi, Alireza, Sheikholeslami, Fatemeh, Marques, Antonio G., Giannakis, Georgios B.
Small base stations (SBs) of fifth-generation (5G) cellular networks are envisioned to have storage devices to locally serve requests for reusable and popular contents by \emph{caching} them at the edge of the network, close to the end users. The ultimate goal is to shift part of the predictable load on the back-haul links, from on-peak to off-peak periods, contributing to a better overall network performance and service experience. To enable the SBs with efficient \textit{fetch-cache} decision-making schemes operating in dynamic settings, this paper introduces simple but flexible generic time-varying fetching and caching costs, which are then used to formulate a constrained minimization of the aggregate cost across files and time. Since caching decisions per time slot influence the content availability in future slots, the novel formulation for optimal fetch-cache decisions falls into the class of dynamic programming. Under this generic formulation, first by considering stationary distributions for the costs and file popularities, an efficient reinforcement learning-based solver known as value iteration algorithm can be used to solve the emerging optimization problem. Later, it is shown that practical limitations on cache capacity can be handled using a particular instance of the generic dynamic pricing formulation. Under this setting, to provide a light-weight online solver for the corresponding optimization, the well-known reinforcement learning algorithm, $Q$-learning, is employed to find optimal fetch-cache decisions. Numerical tests corroborating the merits of the proposed approach wrap up the paper.
Low-rank Approximation of Linear Maps
This work provides closed-form solutions and minimal achie vable errors for a large class of low-rank approximation problems in Hilbert spaces . The proposed theorem generalizes to the case of linear bounded operators andp-th Schatten norms previous results obtained in the finite dimensional case for the Frobenius norm. The theorem is illu strated in various settings, including low-rank approximation problems with respect to the trace n orm, the 2-induced norm or the Hilbert-Schmidt norm. The theorem provides also the basics for the de sign of tractable algorithms for kernel-based or continuous DMD.
Primal path algorithm for compositional data analysis
Jeon, Jong-June, Kim, Yongdai, Won, Sungho, Choi, Hosik
In modern regression analysis, it is frequently observed that regression predictors consist of the proportions or relative ratios of certain values rather than absolute values. For example, in analyzing air pollution data, the percentages of chemicals in the air are considered relevant predictors to identify the source of a pollutant (Lee et al., 2007). These types of proportional data, typically called compositional data, are widely used in geoscience (Buccianti et al., 2006), microbiology (Montassier et al., 2016), and nutritional biochemistry (Leite, 2016). By the definition of compositional data, all compositional predictors lie on the simplex and are thus linearly dependent. Aitchison and Bacon-shone (1984) proposed a regression model for compositional data as follows.
A Primal-dual Learning Algorithm for Personalized Dynamic Pricing with an Inventory Constraint
Chen, Ningyuan, Gallego, Guillermo
A firm is selling a product to different types (based on the features such as education backgrounds, ages, etc.) of customers over a finite season with non-replenishable initial inventory. The type label of an arriving customer can be observed but the demand function associated with each type is initially unknown. The firm sets personalized prices dynamically for each type and attempts to maximize the revenue over the season. We provide a learning algorithm that is near-optimal when the demand and capacity scale in proportion. The algorithm utilizes the primal-dual formulation of the problem and learns the dual optimal solution explicitly. It allows the algorithm to overcome the curse of dimensionality (the rate of regret is independent of the number of types) and sheds light on novel algorithmic designs for learning problems with resource constraints.
Low-rank Interaction with Sparse Additive Effects Model for Large Data Frames
Robin, Geneviรจve, Wai, Hoi-To, Josse, Julie, Klopp, Olga, Moulines, รric
Many applications of machine learning involve the analysis of large data frames-matrices collecting heterogeneous measurements (binary, numerical, counts, etc.) across samples-with missing values. Low-rank models, as studied by Udell et al. [30], are popular in this framework for tasks such as visualization, clustering and missing value imputation. Yet, available methods with statistical guarantees and efficient optimization do not allow explicit modeling of main additive effects such as row and column, or covariate effects. In this paper, we introduce a low-rank interaction and sparse additive effects (LORIS) model which combines matrix regression on a dictionary and low-rank design, to estimate main effects and interactions simultaneously. We provide statistical guarantees in the form of upper bounds on the estimation error of both components. Then, we introduce a mixed coordinate gradient descent (MCGD) method which provably converges sub-linearly to an optimal solution and is computationally efficient for large scale data sets. We show on simulated and survey data that the method has a clear advantage over current practices, which consist in dealing separately with additive effects in a preprocessing step.
Derivative-Free Methods for Policy Optimization: Guarantees for Linear Quadratic Systems
Malik, Dhruv, Pananjady, Ashwin, Bhatia, Kush, Khamaru, Koulik, Bartlett, Peter L., Wainwright, Martin J.
We study derivative-free methods for policy optimization over the class of linear policies. We focus on characterizing the convergence rate of these methods when applied to linear-quadratic systems, and study various settings of driving noise and reward feedback. We show that these methods provably converge to within any pre-specified tolerance of the optimal policy with a number of zero-order evaluations that is an explicit polynomial of the error tolerance, dimension, and curvature properties of the problem. Our analysis reveals some interesting differences between the settings of additive driving noise and random initialization, as well as the settings of one-point and two-point reward feedback. Our theory is corroborated by extensive simulations of derivative-free methods on these systems. Along the way, we derive convergence rates for stochastic zero-order optimization algorithms when applied to a certain class of non-convex problems.
A Novel Large-scale Ordinal Regression Model
Shi, Yong, Wang, Huadong, Shen, Xin, Niu, Lingfeng
Ordinal regression (OR) is a special multiclass classification problem where an order relation exists among the labels. Recent years, people share their opinions and sentimental judgments conveniently with social networks and E-Commerce so that plentiful large-scale OR problems arise. However, few studies have focused on this kind of problems. Nonparallel Support Vector Ordinal Regression (NPSVOR) is a SVM-based OR model, which learns a hyperplane for each rank by solving a series of independent sub-optimization problems and then ensembles those learned hyperplanes to predict. The previous studies are focused on its nonlinear case and got a competitive testing performance, but its training is time consuming, particularly for large-scale data. In this paper, we consider NPSVOR's linear case and design an efficient training method based on the dual coordinate descent method (DCD). To utilize the order information among labels in prediction, a new prediction function is also proposed. Extensive contrast experiments on the text OR datasets indicate that the carefully implemented DCD is very suitable for training large data.