Optimization
Hierarchical Modeling with Tensor Inputs
Zhu, Yada (IBM T.J. Watson Research Center) | He, Jingrui (IBM T.J. Watson Research Center) | Lawrence, Rick (IBM T.J. Watson Research Center)
In many real applications, the input data are naturally expressed as tensors, such as virtual metrology in semiconductor manufacturing, face recognition and gait recognition in computer vision, etc. In this paper, we propose a general optimization framework for dealing with tensor inputs. Most existing methods for supervised tensor learning use only rank-one weight tensors in the linear model and cannot readily incorporate domain knowledge. In our framework, we obtain the weight tensor in a hierarchical way — we first approximate it by a low-rank tensor, and then estimate the low-rank approximation using the prior knowledge from various sources, e.g., different domain experts. This is motivated by wafer quality prediction in semiconductor manufacturing. Furthermore, we propose an effective algorithm named H-MOTE for solving this framework, which is guaranteed to converge. The time complexity of H-MOTE is linear with respect to the number of examples as well as the size of the weight tensor. Experimental results show the superiority of H-MOTE over state-of-the-art techniques on both synthetic and real data sets.
Time-Critical Influence Maximization in Social Networks with Time-Delayed Diffusion Process
Chen, Wei (Microsoft Research Asia) | Lu, Wei (University of British Columbia) | Zhang, Ning (University of Science and Technology of China)
Influence maximization is a problem of finding a small set of highly influential users in a social network such that the spread of influence under certain propagation models is maximized. In this paper, we consider time-critical influence maximization, in which one wants to maximize influence spread within a given deadline. Since timing is considered in the optimization, we also extend the Independent Cascade (IC) model to incorporate the time delay aspect of influence diffusion in social networks. We show that time-critical influence maximization under the time-delayed IC model maintains desired properties such as submodularity, which allows a greedy algorithm to achieve an approximation ratio of 1-1/e, to circumvent the NP-hardness of the problem. To overcome the inefficiency of the approximation algorithm, we design two heuristic algorithms: the first one is based on a dynamic programming procedure that computes exact influence in tree structures, while the second one converts the problem to one in the original IC model and then applies existing fast heuristics to it. Our simulation results demonstrate that our heuristics achieve the same level of influence spread as the greedy algorithm while running a few orders of magnitude faster, and they also outperform existing algorithms that disregard the deadline constraint and delays in diffusion.
A Metric Scale for 'Abstractness' of the Word Meaning
Samsonovich, Alexei V. (George Mason University)
Web personalization involves automated content analysis of text, and modern technologies of semantic analysis of text rely on a number of scales. Among them is the abstractness of meaning, which is not captured by more traditional measures of sentiment, such as valence, arousal and dominance. The present work introduces a physics-inspired approach to constructing the abstractness scale based on databases of hypernym-hyponym relations, e.g., WordNet 3.0. The idea is to define an energy as a function of word coordinates that are distributed in one dimension, and then to find a global minimum of this energy function by relocating words in this dimension. The result is a one-dimensional distribution that assigns "abstractness" values to words. While positions of individual words on this scale are subject to noise, the entire distribution globally defines the universal semantic dimension associated with the notion of hypernym-hyponym relations, called here "abstractness".
A Theoretical Framework of the Graph Shift Algorithm
Fan, Xuhui (University of Technology, Sydney) | Cao, Longbing (University of Technology, Sydney)
Since no theoretical foundations for proving the convergence of Graph Shift Algorithm have been reported, we provide a generic framework consisting of three key GS components to fit the Zangwill’s convergence theorem. We show that the sequence set generated by the GS procedures always terminates at a local maximum, or at worst, contains a subsequence which converges to a local maximum of the similarity measure function. What is more, a theoretical framework is proposed to apply our proof to a more general case.
Improving Convergence of CMA-ES Through Structure-Driven Discrete Recombination
Brys, Tim (Vrije Universiteit Brussel) | Nowé, Ann (Vrije Universiteit Brussel)
Evolutionary Strategies (ES) are a class of continuous optimization algorithms that have proven to perform very well on hard optimization problems. Whereas in earlier literature, both intermediate and discrete recombination operators were used, we now see that most ES, e.g. CMA-ES, use only intermediate recombination. While CMA-ES is considered state-of-the-art in continuous optimization, we believe that reintroducing discrete recombination can improve the algorithms' ability to escape local optima. Specifically, we look at using information on the problem's structure to create building blocks for recombination.
Model Learning and Real-Time Tracking Using Multi-Resolution Surfel Maps
Stückler, Jörg (University of Bonn) | Behnke, Sven (University of Bonn)
For interaction with its environment, a robot is required to learn models of objects and to perceive these models in the livestreams from its sensors. In this paper, we propose a novel approach to model learning and real-time tracking. We extract multi-resolution 3D shape and texture representations from RGB-D images at high frame-rates. An efficient variant of the iterative closest points algorithm allows for registering maps in real-time on a CPU. Our approach learns full-view models of objects in a probabilistic optimization framework in which we find the best alignment between multiple views. Finally, we track the pose of the camera with respect to the learned model by registering the current sensor view to the model. We evaluate our approach on RGB-D benchmarks and demonstrate its accuracy, efficiency, and robustness in model learning and tracking. We also report on the successful public demonstration of our approach in a mobile manipulation task.
Efficient Optimization of Control Libraries
Dey, Debadeepta (Carnegie Mellon University) | Liu, Tian Yu (Carnegie Mellon University) | Sofman, Boris (Carnegie Mellon University) | Bagnell, James Andrew (Carnegie Mellon University)
A popular approach to high dimensional control problems in robotics uses a library of candidate “maneuvers” or “trajectories”. The library is either evaluated on a fixed number of candidate choices at runtime (e.g. path set selection for planning) or by iterating through a sequence of feasible choices until success is achieved (e.g. grasp selection). The performance of the library relies heavily on the content and order of the sequence of candidates. We propose a provably efficient method to optimize such libraries, leveraging recent advances in optimizing submodular functions of sequences. This approach is demonstrated on two important problems: mobile robot navigation and manipulator grasp set selection. In the first case, performance can be improved by choosing a subset of candidates which optimizes the metric under consideration (cost of traversal). In the second case, performance can be optimized by minimizing the depth in the list that is searched before a successful candidate is found. Our method can be used in both on-line and batch settings with provable performance guarantees, and can be run in an anytime manner to handle real-time constraints.
Time-Consistency of Optimization Problems
Osogami, Takayuki (IBM Research - Tokyo) | Morimura, Tetsuro (IBM Research - Tokyo)
We study time-consistency of optimization problems, where we say that an optimization problem is time-consistent if its optimal solution, or the optimal policy for choosing actions, does not depend on when the optimization problem is solved. Time-consistency is a minimal requirement on an optimization problem for the decisions made based on its solution to be rational. We show that the return that we can gain by taking "optimal" actions selected by solving a time-inconsistent optimization problem can be surely dominated by that we could gain by taking "suboptimal" actions. We establish sufficient conditions on the objective function and on the constraints for an optimization problem to be time-consistent. We also show when the sufficient conditions are necessary. Our results are relevant in stochastic settings particularly when the objective function is a risk measure other than expectation or when there is a constraint on a risk measure.
Symbolic Dynamic Programming for Continuous State and Action MDPs
Zamani, Zahra (ANU - NICTA The Australian National University National ICT of Australia) | Sanner, Scott (NICTA and ANU) | Fang, Cheng (Department of Aeronautics and Astronautics, MIT)
Many real-world decision-theoretic planning problemsare naturally modeled using both continuous state andaction (CSA) spaces, yet little work has provided ex-act solutions for the case of continuous actions. Inthis work, we propose a symbolic dynamic program-ming (SDP) solution to obtain the optimal closed-formvalue function and policy for CSA-MDPs with mul-tivariate continuous state and actions, discrete noise,piecewise linear dynamics, and piecewise linear (or re-stricted piecewise quadratic) reward. Our key contribu-tion over previous SDP work is to show how the contin-uous action maximization step in the dynamic program-ming backup can be evaluated optimally and symboli-cally — a task which amounts to symbolic constrainedoptimization subject to unknown state parameters; wefurther integrate this technique to work with an efficientand compact data structure for SDP — the extendedalgebraic decision diagram (XADD). We demonstrateempirical results on a didactic nonlinear planning exam-ple and two domains from operations research to showthe first automated exact solution to these problems.
Stochastic Safest and Shortest Path Problems
Teichteil-Königsbuch, Florent (ONERA)
Optimal solutions to Stochastic Shortest Path Problems (SSPs) usually require that there exists at least one policy that reaches the goal with probability 1 from the initial state. This condition is very strong and prevents from solving many interesting problems, for instance where all possible policies reach some dead-end states with a positive probability. We introduce a more general and richer dual optimization criterion, which minimizes the average (undiscounted) cost of only paths leading to the goal among all policies that maximize the probability to reach the goal. We present policy update equations in the form of dynamic programming for this new dual criterion, which are different from the standard Bellman equations, but produce the same solution if there exists one policy leading to the goal with probability 1 from the initial state. We demonstrate that our equations converge in infinite horizon without any condition on the structure of the problem or on its policies, which actually extends the class of SSPs that can be solved. We experimentally show that our dual criterion provides well-founded solutions to SSPs that can not be solved by the standard criterion, and that using a discount factor with the latter certainly provides solution policies but which are not optimal considering our well-founded criterion.