Asia
Strong Temporal Planning with Uncontrollable Durations: A State-Space Approach
Cimatti, Alessandro (Fondazione Bruno Kessler) | Micheli, Andrea (Fondazione Bruno Kessler) | Roveri, Marco (Fondazione Bruno Kesslerr)
In many practical domains, planning systems are required to reason about durative actions. A common assumption in the literature is that the executor is allowed to decide the duration of each action. However, this assumption may be too restrictive for applications. In this paper, we tackle the problem of temporal planning with uncontrollable action durations. We show how to generate robust plans,that guarantee goal achievement despite the uncontrollability of the actual duration of the actions. We extend the state-space temporalplanning framework, integrating recent techniques for solving temporalproblems under uncertainty. We discuss different ways of lifting the total order plans generated by the heuristic search to partial orderplans, showing (in)completeness results for each of them. We implemented our approach on top of COLIN, a state-of-the-art planner. An experimental evaluation over several benchmark problems shows the practical feasibility of the proposed approach.
Some Fixed Parameter Tractability Results for Planning with Non-Acyclic Domain-Transition Graphs
Bรคckstrรถm, Christer (Linkรถping University, Linkรถping, Sweden)
Bรคckstrรถm studied the parameterised complexity of planning when the domain-transition graphs (DTGs) are acyclic. He used the parameters d (domain size), k (number of paths in the DTGs) and w (treewidth of the causal graph), and showed that planning is fixed-parameter tractable (fpt) in these parameters, and fpt in only parameter k if the causal graph is a polytree. We continue this work by considering some additional cases of non-acyclic DTGs. In particular, we consider the case where each strongly connected component (SCC) in a DTG must be a simple cycle, and we show that planning is fpt for this case if the causal graph is a polytree. This is done by first preprocessing the instance to construct an equivalent abstraction and then apply Bรคckstrรถms technique to this abstraction. We use the parameters d and k , reinterpreting this as the number of paths in the condensation of a DTG, and the two new parameters c (the number of contracted cycles along a path) and p max (an upper bound for walking around cycles, when not unbounded).
10,000+ Times Accelerated Robust Subset Selection
Zhu, Feiyun (Institute of Automation, Chinese Academy of Sciences) | Fan, Bin (Institute of Automation, Chinese Academy of Sciences) | Zhu, Xinliang (Institute of Automation, Chinese Academy of Sciences) | Wang, Ying (Institute of Automation, Chinese Academy of Sciences) | Xiang, Shiming (Institute of Automation, Chinese Academy of Sciences) | Pan, Chunhong (Institute of Automation, Chinese Academy of Sciences)
Subset selection from massive data with noised information is increasingly popular for various applications. This problem is still highly challenging as current methods are generally slow in speed and sensitive to outliers. To address the above two issues, we propose an accelerated robust subset selection (ARSS) method. Extensive experiments on ten benchmark datasets verify that our method not only outperforms state of the art methods, but also runs 10,000+ times faster than the most related method.
A Reduction of the Elastic Net to Support Vector Machines with an Application to GPU Computing
Zhou, Quan (Tsinghua University) | Chen, Wenlin (Washington University in St. Louis) | Song, Shiji (Tsinghua University) | Gardner, Jacob R. (Washington University in St. Louis) | Weinberger, Kilian Q. (Washington University in St. Louis) | Chen, Yixin (Washington University in St. Louis)
Algorithmic reductions are one of the corner stones of theoretical computer science. Surprisingly, to-date, they have only played a limited role in machine learning. In this paper we introduce a formal and practical reduction between two of the most widely used machine learning algorithms: from the Elastic Net (and the Lasso as a special case) to the Support Vector Machine. First, we derive the reduction and summarize it in only 11 lines of MATLAB. Then, we demonstrate its high impact potential by translating recent advances in parallelizing SVM solvers directly to the Elastic Net. The resulting algorithm is a parallel solver for the Elastic Net (and Lasso) that naturally utilizes GPU and multi-core CPUs. We evaluate it on twelve real world data sets, and show that it yields identical results as the popular (and highly optimized) glmnet implementation but is up-to two orders of magnitude faster.
Self-Paced Learning for Matrix Factorization
Zhao, Qian (Xi'an Jiaotong University) | Meng, Deyu (Xi'an Jiaotong University) | Jiang, Lu (Carnegie Mellon University) | Xie, Qi (Xi'an Jiaotong University) | Xu, Zongben (Xi'an Jiaotong University) | Hauptmann, Alexander G. (Carnegie Mellon University)
Matrix factorization (MF) has been attracting much attention due to its wide applications. However, since MF models are generally non-convex, most of the existing methods are easily stuck into bad local minima, especially in the presence of outliers and missing data. To alleviate this deficiency, in this study we present a new MF learning methodology by gradually including matrix elements into MF training from easy to complex. This corresponds to a recently proposed learning fashion called self-paced learning (SPL), which has been demonstrated to be beneficial in avoiding bad local minima. We also generalize the conventional binary (hard) weighting scheme for SPL to a more effective real-valued (soft) weighting manner. The effectiveness of the proposed self-paced MF method is substantiated by a series of experiments on synthetic, structure from motion and background subtraction data.
SoF: Soft-Cluster Matrix Factorization for Probabilistic Clustering
Zhao, Han (University of Waterloo) | Poupart, Pascal (University of Waterloo) | Zhang, Yongfeng (Tsinghua University) | Lysy, Martin (University of Waterloo)
We propose SoF (Soft-cluster matrix Factorization), a probabilistic clustering algorithm which softly assigns each data point into clusters. Unlike model-based clustering algorithms, SoF does not make assumptions about the data density distribution. Instead, we take an axiomatic approach to define 4 properties that the probability of co-clustered pairs of points should satisfy. Based on the properties, SoF utilizes a distance measure between pairs of points to induce the conditional co-cluster probabilities. The objective function in our framework establishes an important connection between probabilistic clustering and constrained symmetric Nonnegative Matrix Factorization (NMF), hence providing a theoretical interpretation for NMF-based clustering algorithms. To optimize the objective, we derive a sequential minimization algorithm using a penalty method. Experimental results on both synthetic and real-world datasets show that SoF significantly outperforms previous NMF-based algorithms and that it is able to detect non-convex patterns as well as cluster boundaries.
Multi-Task Learning and Algorithmic Stability
Zhang, Yu (Hong Kong Baptist University)
In this paper, we study multi-task algorithms from the perspective of the algorithmic stability. We give a definition of the multi-task uniform stability, a generalization of the conventional uniform stability, which measures the maximum difference between the loss of a multi-task algorithm trained on a data set and that of the multi-task algorithm trained on the same data set but with a data point removed in each task. In order to analyze multi-task algorithms based on multi-task uniform stability, we prove a generalized McDiarmid's inequality which assumes the difference bound condition holds by changing multiple input arguments instead of only one in the conventional McDiarmid's inequality. By using the generalized McDiarmid's inequality as a tool, we can analyze the generalization performance of general multi-task algorithms in terms of the multi-task uniform stability. Moreover, as applications, we prove generalization bounds of several representative regularized multi-task algorithms.
Constrained NMF-Based Multi-View Clustering on Unmapped Data
Zhang, Xianchao (Dalian University of Technology) | Zong, Linlin (Dalian University of Technology) | Liu, Xinyue (Dalian University of Technology) | Yu, Hong (Dalian University of Technology)
We use the disagreement between the Multi-view clustering gains increasing attention in the past views to guide the factorization of the matrices. The overall decade (Bickel and Scheffer 2004) (Kumar and III 2011) objective of our algorithm is to minimize the loss function of (Kumar, Rai, and III 2011) (Liu et al. 2013) (Blaschko and NMF in each view as well as the disagreement between each Lampert 2008) (Chaudhuri et al. 2009) (Tzortzis and Likas pair of views. Experimental results show that, with a small 2012). Most existing multi-view clustering algorithms require number of constraints, the proposed CMVNMF (Constrained that the data is completely mapped, i.e., every object Multi-View clustering based on NMF) algorithm gets good has representations in all the views, representations from different performance on unmapped data, and outperforms existing views representing a same object are exactly known, algorithms on partially mapped data and completely mapped and the representations of the same object have the same data.
Online Dictionary Learning on Symmetric Positive Definite Manifolds with Vision Applications
Zhang, Shengping (Harbin Institute of Technology at Weihai) | Kasiviswanathan, Shiva (General Electric Global Research) | Yuen, Pong C. (Hong Kong Baptist University) | Harandi, Mehrtash (NICTA and Australian National University)
Symmetric Positive Definite (SPD) matrices in the form of region covariances are considered rich descriptors for images and videos. Recent studies suggest that exploiting the Riemannian geometry of the SPD manifolds could lead to improved performances for vision applications. For tasks involving processing large-scale and dynamic data in computer vision, the underlying model is required to progressively and efficiently adapt itself to the new and unseen observations. Motivated by these requirements, this paper studies the problem of online dictionary learning on the SPD manifolds. We make use of the Stein divergence to recast the problem of online dictionary learning on the manifolds to a problem in Reproducing Kernel Hilbert Spaces, for which, we develop efficient algorithms by taking into account the geometric structure of the SPD manifolds. To our best knowledge, our work is the first study that provides a solution for online dictionary learning on the SPD manifolds. Empirical results on both large-scale image classification task and dynamic video processing tasks validate the superior performance of our approach as compared to several state-of-the-art algorithms.
Online Bandit Learning for a Special Class of Non-Convex Losses
Zhang, Lijun (Nanjing University) | Yang, Tianbao (The University of Iowa) | Jin, Rong (Michigan State University) | Zhou, Zhi-Hua (Nanjing University)
In online bandit learning, the learner aims to minimize a sequence of losses, while only observing the value of each loss at a single point. Although various algorithms and theories have been developed for online bandit learning, most of them are limited to convex losses. In this paper, we investigate the problem of online bandit learning with non-convex losses, and develop an efficient algorithm with formal theoretical guarantees. To be specific, we consider a class of losses which is a composition of a non-increasing scalar function and a linear function. This setting models a wide range of supervised learning applications such as online classification with a non-convex loss. Theoretical analysis shows that our algorithm achieves an O(poly(d)T2/3) regret bound when the variation of the loss function is small. To the best of our knowledge, this is the first work in online bandit learning that does not rely on convexity.