Optimization
Clustering Markov Decision Processes For Continual Transfer
Mahmud, M. M. Hassan, Hawasly, Majd, Rosman, Benjamin, Ramamoorthy, Subramanian
We present algorithms to effectively represent a set of Markov decision processes (MDPs), whose optimal policies have already been learned, by a smaller source subset for lifelong, policy-reuse-based transfer learning in reinforcement learning. This is necessary when the number of previous tasks is large and the cost of measuring similarity counteracts the benefit of transfer. The source subset forms an `$\epsilon$-net' over the original set of MDPs, in the sense that for each previous MDP $M_p$, there is a source $M^s$ whose optimal policy has $<\epsilon$ regret in $M_p$. Our contributions are as follows. We present EXP-3-Transfer, a principled policy-reuse algorithm that optimally reuses a given source policy set when learning for a new MDP. We present a framework to cluster the previous MDPs to extract a source subset. The framework consists of (i) a distance $d_V$ over MDPs to measure policy-based similarity between MDPs; (ii) a cost function $g(\cdot)$ that uses $d_V$ to measure how good a particular clustering is for generating useful source tasks for EXP-3-Transfer and (iii) a provably convergent algorithm, MHAV, for finding the optimal clustering. We validate our algorithms through experiments in a surveillance domain.
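The `$\epsilon$-net' condition in the abstract above can be illustrated with a small sketch. It assumes a hypothetical, precomputed matrix `regret[p][s]` holding the regret of the optimal policy of MDP `s` when executed in MDP `p`. The greedy cover below is a stand-in chosen for brevity; it is not the MHAV algorithm, and it does not use the distance $d_V$ or the cost $g(\cdot)$ from the paper.

```python
from typing import List, Sequence


def is_epsilon_net(regret: Sequence[Sequence[float]],
                   sources: Sequence[int], eps: float) -> bool:
    """True if every previous MDP p has some source s with regret[p][s] < eps."""
    return all(any(regret[p][s] < eps for s in sources)
               for p in range(len(regret)))


def greedy_sources(regret: Sequence[Sequence[float]], eps: float) -> List[int]:
    """Greedy set cover over MDPs (illustrative only, NOT the paper's MHAV)."""
    n = len(regret)
    uncovered = set(range(n))
    chosen: List[int] = []
    while uncovered:
        # Pick the candidate whose policy covers the most uncovered MDPs.
        best = max(range(n),
                   key=lambda s: sum(1 for p in uncovered if regret[p][s] < eps))
        covered = {p for p in uncovered if regret[p][best] < eps}
        if not covered:
            raise ValueError("eps is too small: some MDP cannot be covered")
        chosen.append(best)
        uncovered -= covered
    return chosen


# Hypothetical regret matrix for four previously solved MDPs.
R = [
    [0.0, 0.1, 0.9, 0.8],
    [0.2, 0.0, 0.7, 0.9],
    [0.9, 0.8, 0.0, 0.05],
    [0.7, 0.9, 0.1, 0.0],
]
sources = greedy_sources(R, eps=0.3)
```

With this made-up matrix, two sources are enough to cover all four MDPs at $\epsilon = 0.3$, which is the kind of reduction the abstract motivates when measuring similarity against every previous task is too costly.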
Computational Cost Reduction in Learned Transform Classifications
Machado, Emerson Lopes, Miosso, Cristiano Jacques, von Borries, Ricardo, Coutinho, Murilo, Berger, Pedro de Azevedo, Marques, Thiago, Jacobi, Ricardo Pezzuol
We present a theoretical analysis and empirical evaluations of a novel set of techniques for reducing the computational cost of classifiers that are based on learned transforms and soft-thresholding. By modifying the optimization procedures for dictionary and classifier training, as well as the resulting dictionary entries, our techniques make it possible to reduce the bit precision and to replace each floating-point multiplication by a single integer bit shift. We also show how the optimization algorithms in some dictionary training methods can be modified to penalize higher-energy dictionaries. We applied our techniques to the classifier Learning Algorithm for Soft-Thresholding, testing on the datasets used in its original paper. Our results indicate that it is feasible to use solely sums and bit shifts of integers to classify at test time with a limited reduction in classification accuracy. These low-power operations offer a valuable trade-off in FPGA implementations, as they increase the classification throughput while decreasing both energy consumption and manufacturing cost.
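A minimal sketch of the idea of replacing multiplications with bit shifts, assuming each dictionary entry is rounded to a signed power of two. The helper names (`nearest_shift`, `shift_mul`, `shift_feature`) are hypothetical, the rounding rule is only one possible choice, and the two-sided soft-threshold used here is the textbook form rather than necessarily the variant used in the paper.

```python
import math
from typing import List, Tuple


def nearest_shift(w: float) -> Tuple[int, int]:
    """Approximate w by sign * 2**shift; a zero weight maps to sign 0."""
    if w == 0:
        return 0, 0
    sign = 1 if w > 0 else -1
    return sign, round(math.log2(abs(w)))


def shift_mul(x: int, sign: int, shift: int) -> int:
    """Compute x * sign * 2**shift with integer shifts only.

    Negative shifts use a right shift, which floors and so loses precision.
    """
    if sign == 0:
        return 0
    y = x << shift if shift >= 0 else x >> -shift
    return sign * y


def soft_threshold(z: int, t: int) -> int:
    """Standard two-sided soft-threshold on integers."""
    if z > t:
        return z - t
    if z < -t:
        return z + t
    return 0


def shift_feature(x: List[int], atom: List[Tuple[int, int]], t: int) -> int:
    """One feature: shift-and-add inner product followed by soft-threshold."""
    z = sum(shift_mul(xi, s, k) for xi, (s, k) in zip(x, atom))
    return soft_threshold(z, t)


atom = [nearest_shift(4.0), nearest_shift(-0.5)]   # [(1, 2), (-1, -1)]
feature = shift_feature([5, 8], atom, t=10)        # (5 << 2) - (8 >> 1) = 16, then 16 - 10
```

Only integer additions, comparisons, and shifts appear at test time in this sketch, which is the property the abstract argues is attractive for FPGA implementations.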
A Smart Database for a New Age of Enterprise Apps - DATAVERSITY
Do you remember what it was like the first time you got your hands on an iPhone? When you realized that all the things that you used to have to do on separate devices now could be accomplished on one single device? Well, the minds behind LogicBlox would like you to feel the same way about its foundational technology that collapses multiple technology stacks into a unified smart database environment that aims to enable enterprises to create sophisticated and easily iterated applications in one place. Transaction and analytics co-exist in the platform, with the system utilizing a single declarative language with extensions, such as Machine Learning capabilities and statistical relational models, to support prescriptive and predictive analytics. Users can leverage LogicBlox's full-blown database functionality to train Machine Learning models, for use in solving forecasting or optimization problems, for instance.
A Probabilistic Adaptive Search System for Exploring the Face Space
Abad, Andres G., Castro, Luis I. Reyes
Face recall is a basic human cognitive process performed routinely, e.g., when meeting someone and determining if we have met that person before. Assisting a subject during face recall by suggesting candidate faces can be challenging. One of the reasons is that the search space - the face space - is quite large and lacks structure. A commercial application of face recall is facial composite systems - such as Identikit, PhotoFIT, and CD-FIT - where a witness searches for an image of a face that resembles their memory of a particular offender. The inherent uncertainty and cost in the evaluation of the objective function, the large size and lack of structure of the search space, and the unavailability of a gradient make this problem unsuitable for traditional optimization methods. In this paper we propose a novel evolutionary approach for searching the face space that can be used as a facial composite system. The approach is inspired by methods of Bayesian optimization and differs from other applications in the use of the skew-normal distribution as its acquisition function. This choice of acquisition function provides greater granularity, with regularized, conservative, and realistic results.
Sequential Bayesian optimal experimental design via approximate dynamic programming
Huan, Xun, Marzouk, Youssef M.
The design of multiple experiments is commonly undertaken via suboptimal strategies, such as batch (open-loop) design that omits feedback or greedy (myopic) design that does not account for future effects. This paper introduces new strategies for the optimal design of sequential experiments. First, we rigorously formulate the general sequential optimal experimental design (sOED) problem as a dynamic program. Batch and greedy designs are shown to result from special cases of this formulation. We then focus on sOED for parameter inference, adopting a Bayesian formulation with an information theoretic design objective. To make the problem tractable, we develop new numerical approaches for nonlinear design with continuous parameter, design, and observation spaces. We approximate the optimal policy by using backward induction with regression to construct and refine value function approximations in the dynamic program. The proposed algorithm iteratively generates trajectories via exploration and exploitation to improve approximation accuracy in frequently visited regions of the state space. Numerical results are verified against analytical solutions in a linear-Gaussian setting. Advantages over batch and greedy design are then demonstrated on a nonlinear source inversion problem where we seek an optimal policy for sequential sensing.
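For orientation, a generic finite-horizon Bellman recursion of the kind the abstract above alludes to can be written as follows. The symbols are assumptions introduced here, not taken from the abstract: state $x_k$, design $d_k$, observation $y_k$, stage reward $g_k$, terminal reward $g_N$, and state transition $F_k$.

```latex
\begin{align}
  J_N(x_N) &= g_N(x_N), \\
  J_k(x_k) &= \max_{d_k} \;
    \mathbb{E}_{y_k \mid x_k, d_k}
    \Big[ g_k(x_k, d_k, y_k) + J_{k+1}\big(F_k(x_k, d_k, y_k)\big) \Big],
  \qquad k = 0, \dots, N-1 .
\end{align}
```

In this notation, a greedy (myopic) design roughly corresponds to dropping the future-value term $J_{k+1}$ when choosing $d_k$, while a batch (open-loop) design fixes all of $d_0, \dots, d_{N-1}$ before any observation $y_k$ is seen.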
Influence Maximization with Bandits
Vaswani, Sharan, Lakshmanan, Laks. V. S., Schmidt, Mark
We consider the problem of \emph{influence maximization}, the problem of maximizing the number of people that become aware of a product by finding the `best' set of `seed' users to expose the product to. Most prior work on this topic assumes that we know the probability of each user influencing each other user, or we have data that lets us estimate these influences. However, this information is typically not initially available or is difficult to obtain. To avoid this assumption, we adopt a combinatorial multi-armed bandit paradigm that estimates the influence probabilities as we sequentially try different seed sets. We establish bounds on the performance of this procedure under the existing edge-level feedback as well as a novel and more realistic node-level feedback. Beyond our theoretical results, we describe a practical implementation and experimentally demonstrate its efficiency and effectiveness on four real datasets.
Optimization as Estimation with Gaussian Processes in Bandit Settings
Wang, Zi, Zhou, Bolei, Jegelka, Stefanie
Recently, there has been rising interest in Bayesian optimization -- the optimization of an unknown function with assumptions usually expressed by a Gaussian Process (GP) prior. We study an optimization strategy that directly uses an estimate of the argmax of the function. This strategy offers both practical and theoretical advantages: no tradeoff parameter needs to be selected, and, moreover, we establish close connections to the popular GP-UCB and GP-PI strategies. Our approach can be understood as automatically and adaptively trading off exploration and exploitation in GP-UCB and GP-PI. We illustrate the effects of this adaptive tuning via bounds on the regret as well as an extensive empirical evaluation on robotics and vision tasks, demonstrating the robustness of this strategy for a range of performance criteria.
Learning Concept Graphs from Online Educational Data
Liu, Hanxiao, Ma, Wanli, Yang, Yiming, Carbonell, Jaime
This paper addresses an open challenge in educational data mining, i.e., the problem of automatically mapping online courses from different providers (universities, MOOCs, etc.) onto a universal space of concepts, and predicting latent prerequisite dependencies (directed links) among both concepts and courses. We propose a novel approach for inference within and across course-level and concept-level directed graphs. In the training phase, our system projects partially observed course-level prerequisite links onto directed concept-level links; in the testing phase, the induced concept-level links are used to infer the unknown course-level prerequisite links. Whereas courses may be specific to one institution, concepts are shared across different providers. The bi-directional mappings enable our system to perform interlingua-style transfer learning, e.g. treating the concept graph as the interlingua and transferring the prerequisite relations across universities via the interlingua. Experiments on our newly collected datasets of courses from MIT, Caltech, Princeton and CMU show promising results.
When Are Nonconvex Problems Not Scary?
Sun, Ju, Qu, Qing, Wright, John
In this note, we focus on smooth nonconvex optimization problems that obey: (1) all local minimizers are also global; and (2) around any saddle point or local maximizer, the objective has a negative directional curvature. Concrete applications such as dictionary learning, generalized phase retrieval, and orthogonal tensor decomposition are known to induce such structures. We describe a second-order trust-region algorithm that provably converges to a global minimizer efficiently, without special initializations. Finally, we highlight alternatives and open problems in this direction.
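One common way to formalize condition (2), stated here as an assumption about notation rather than as the authors' exact definition, is through the smallest eigenvalue of the Hessian at critical points that are not local minimizers:

```latex
\begin{align}
  \nabla f(x) = 0 \ \text{and $x$ is not a local minimizer}
  \quad &\Longrightarrow \quad
  \lambda_{\min}\big(\nabla^2 f(x)\big) < 0, \\
  f(x + t v) &= f(x) + \tfrac{t^2}{2}\, v^\top \nabla^2 f(x)\, v + o(t^2) < f(x)
  \quad \text{for small } t \neq 0,
\end{align}
```

where $v$ is a unit eigenvector associated with $\lambda_{\min}$. The second line is the second-order Taylor expansion at a critical point; it shows why a method that uses curvature information can always find a descent direction at such points, whereas a purely gradient-based step sees a zero gradient there.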
Fast Proximal Linearized Alternating Direction Method of Multiplier with Parallel Splitting
Lu, Canyi (National University of Singapore) | Li, Huan (Peking University) | Lin, Zhouchen (Peking University) | Yan, Shuicheng (National University of Singapore)
The Augmented Lagrangian Method (ALM) and Alternating Direction Method of Multiplier (ADMM) have been powerful optimization methods for general convex programming subject to linear constraints. We consider the convex problem whose objective consists of a smooth part and a nonsmooth but simple part. We propose the Fast Proximal Augmented Lagrangian Method (Fast PALM), which achieves a convergence rate of $O(1/K^2)$, compared with $O(1/K)$ for the traditional PALM. In order to further reduce the per-iteration complexity and handle the multi-block problem, we propose the Fast Proximal ADMM with Parallel Splitting (Fast PL-ADMM-PS) method. It also partially improves the rate related to the smooth part of the objective function. Experimental results on both synthesized and real-world data demonstrate that our fast methods significantly improve on the previous PALM and ADMM.
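As background for the abstract above, the textbook augmented Lagrangian and the basic ALM iteration can be written as follows. The symbols $g$ (smooth part), $h$ (nonsmooth but simple part), $A$, $b$, multiplier $\lambda$, and penalty $\beta > 0$ are notation assumed here; this is the classical scheme, not the accelerated Fast PALM update proposed in the paper.

```latex
\begin{align}
  &\min_{x} \; f(x) = g(x) + h(x) \quad \text{s.t.} \quad A x = b, \\
  &L_\beta(x, \lambda) = f(x) + \langle \lambda,\, A x - b \rangle
     + \tfrac{\beta}{2} \, \| A x - b \|^2, \\
  &x^{k+1} = \operatorname*{arg\,min}_{x} \; L_\beta(x, \lambda^{k}), \qquad
   \lambda^{k+1} = \lambda^{k} + \beta \, \big( A x^{k+1} - b \big).
\end{align}
```

Proximal variants typically replace the exact $x$-minimization with a cheaper step that linearizes the smooth terms and handles $h$ through its proximal operator, which is where the per-iteration savings mentioned in the abstract come from.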