Statistical Learning
Learning Deep Representations for Graph Clustering
Tian, Fei (University of Science and Technology of China) | Gao, Bin (Microsoft Research) | Cui, Qing (Tsinghua University) | Chen, Enhong (University of Science and Technology of China) | Liu, Tie-Yan (Microsoft Research)
Recently deep learning has been successfully adopted in many applications such as speech recognition and image classification. In this work, we explore the possibility of employing deep learning in graph clustering. We propose a simple method, which first learns a nonlinear embedding of the original graph by stacked autoencoder, and then runs $k$-means algorithm on the embedding to obtain the clustering result. We show that this simple method has solid theoretical foundation, due to the similarity between autoencoder and spectral clustering in terms of what they actually optimize. Then, we demonstrate that the proposed method is more efficient and flexible than spectral clustering. First, the computational complexity of autoencoder is much lower than spectral clustering: the former can be linear to the number of nodes in a sparse graph while the latter is super quadratic due to eigenvalue decomposition. Second, when additional sparsity constraint is imposed, we can simply employ the sparse autoencoder developed in the literature of deep learning; however, it is non-straightforward to implement a sparse spectral method. The experimental results on various graph datasets show that the proposed method significantly outperforms conventional spectral clustering which clearly indicates the effectiveness of deep learning in graph clustering.
Generalized Higher-Order Tensor Decomposition via Parallel ADMM
Shang, Fanhua (The Chinese University of Hong Kong) | Liu, Yuanyuan (The Chinese University of Hong Kong) | Cheng, James (The Chinese University of Hong Kong)
Higher-order tensors are becoming prevalent in many scientific areas such as computer vision, social network analysis, data mining and neuroscience. Traditional tensor decomposition approaches face three major challenges: model selecting, gross corruptions and computational efficiency. To address these problems, we first propose a parallel trace norm regularized tensor decomposition method, and formulate it as a convex optimization problem. This mehtod does not require the rank of each mode to be specified beforehand, and can automaticaly determine the number of factors in each mode through our optimization scheme. By considering the low-rank structure of the observed tensor, we analyze the equivalent relationship of the trace norm between a low-rank tensor and its core tensor. Then, we cast a non-convex tensor decomposition model into a weighted combination of multiple much smaller-scale matrix trace norm minimization. Finally, we develop two parallel alternating direction methods of multipliers (ADMM) to solve our problems. Experimental results verify that our regularized formulation is effective, and our methods are robust to noise or outliers.
Discovering Better AAAI Keywords via Clustering with Community-Sourced Constraints
Moran, Kelly H. (Google Inc.) | Wallace, Byron C. (Brown University) | Brodley, Carla E. (Tufts University)
Selecting good conference keywords is important because they often determine the composition of review committees and hence which papers are reviewed by whom. But presently conference keywords are generated in an ad-hoc manner by a small set of conference organizers. This approach is plainly not ideal. There is no guarantee, for example, that the generated keyword set aligns with what the community is actually working on and submitting to the conference in a given year. This is especially true in fast moving fields such as AI. The problem is exacerbated by the tendency of organizers to draw heavily on preceding years' keyword lists when generating a new set. Rather than a select few ordaining a keyword set that that represents AI at large, it would be preferable to generate these keywords more directly from the data, with input from research community members. To this end, we solicited feedback from seven AAAI PC members regarding a previously existing keyword set and used these 'community-sourced constraints' to inform a clustering over the abstracts of all submissions to AAAI 2013. We show that the keywords discovered via this data-driven, human-in-the-loop method are at least as preferred (by AAAI PC members) as 2013's manually generated set, and that they include categories previously overlooked by organizers. Many of the discovered terms were used for this year's conference.
Automatic Construction and Natural-Language Description of Nonparametric Regression Models
Lloyd, James Robert (University of Cambridge) | Duvenaud, David (University of Cambridge) | Grosse, Roger (Massachusetts Institute of Technology) | Tenenbaum, Joshua (Massachusetts Institute of Technology) | Ghahramani, Zoubin (University of Cambridge)
This paper presents the beginnings of an automatic statistician, focusing on regression problems. Our system explores an open-ended space of statistical models to discover a good explanation of a data set, and then produces a detailed report with figures and natural-language text. Our approach treats unknown regression functions nonparametrically using Gaussian processes, which has two important consequences. First, Gaussian processes can model functions in terms of high-level properties (e.g. smoothness, trends, periodicity, changepoints). Taken together with the compositional structure of our language of models this allows us to automatically describe functions in simple terms. Second, the use of flexible nonparametric models and a rich language for composing them in an open-ended manner also results in state-of-the-art extrapolation performance evaluated over 13 real time series data sets from various domains.
Ranking Tweets by Labeled and Collaboratively Selected Pairs with Transitive Closure
Liu, Shenghua (Chinese Academy of Sciences) | Cheng, Xueqi (Chinese Academy of Sciences) | Li, Fangtao (Google Inc.)
Tweets ranking is important for information acquisition in Microblog. Due to the content sparsity and lackof labeled data, it is better to employ semi-supervisedlearning methods to utilize the unlabeled data. However,most of previous semi-supervised learning methods donot consider the pair conflict problem, which means thatthe new selected unlabeled data may conflict with the labeled and previously selected data. It will hurt the learning performance a lot, if the training data contains manyconflict pairs. In this paper, we propose a new collaborative semi-supervised SVM ranking model (CSR-TC)with consideration of the order conflict. The unlabeleddata is selected based on a dynamically maintained transitive closure graph to avoid pair conflict. We also investigate the two views of features, intrinsic and contentrelevant features, for the proposed model. Extensive experiments are conducted on TREC Microblogging corpus. The results demonstrate that our proposed methodachieves significant improvement, compared to severalstate-of-the-art models.
Low-Rank Tensor Learning with Discriminant Analysis for Action Classification and Image Recovery
Jia, Chengcheng (Northeastern University) | Zhong, Guoqiang (Ocean University of China) | Fu, Yun (Northeastern University)
Tensor completion is an important topic in the area of image processing and computer vision research, which is generally built on extraction of the intrinsic structure of the tensor data. Drawing on this fact, action classification, relying heavily on the extracted features of high-dimensional tensors, may indeed benefit from tensor completion techniques. In this paper, we propose a low-rank tensor completion method for action classification, as well as image recovery. Since there may exist distortion and corruption in the tensor representations of video sequences, we project the tensors into a subspace, which contains the invariant structure of the tensors. In order to integrate useful supervisory information for classification, we adopt a discriminant analysis criterion to learn the projection matrices. The resulting multi-variate optimization problem can be effectively solved using the augmented Lagrange multiplier (ALM) algorithm. Experiments demonstrate that our method results with better accuracy compared with some other state-of-the-art low-rank tensor representation learning approaches on the MSR Hand Gesture 3D database and the MSR Action 3D database. By denoising the Multi-PIE face database, our experimental setup testifies the proposed method can also be employed to recover images.
On the Challenges of Physical Implementations of RBMs
Dumoulin, Vincent (Universitรฉ de Montrรฉal) | Goodfellow, Ian J (Universitรฉ de Montrรฉal) | Courville, Aaron (Universitรฉ de Montrรฉal) | Bengio, Yoshua (Universitรฉ de Montrรฉal)
Restricted Boltzmann machines (RBMs) are powerful machine learning models, but learning and some kinds of inference in the model require sampling-based approximations, which, in classical digital computers, are implemented using expensive MCMC. Physical computation offers the opportunity to reduce the costof sampling by building physical systems whose natural dynamics correspond to drawing samples from the desired RBM distribution. Such a system avoids the burn-in and mixing cost of a Markov chain. However, hardware implementations of this variety usually entail limitations such as low-precision and limited range of the parameters and restrictions on the size and topology of the RBM. We conduct software simulations to determine how harmful each of these restrictions is. Our simulations are based on the D-Wave Two computer, but the issues we investigate arise in most forms of physical computation.Our findings suggest that designers of new physical computing hardware and algorithms for physical computers should focus their efforts on overcoming the limitations imposed by the topology restrictions of currently existing physical computers.
A Convex Formulation for Semi-Supervised Multi-Label Feature Selection
Chang, Xiaojun (The University of Queensland) | Nie, Feiping (University of Texas at Arlington) | Yang, Yi (The University of Queensland) | Huang, Heng (University of Texas at Arlington)
Explosive growth of multimedia data has brought challenge of how to efficiently browse, retrieve and organize these data. Under this circumstance, different approaches have been proposed to facilitate multimedia analysis. Several semi-supervised feature selection algorithms have been proposed to exploit both labeled and unlabeled data. However, they are implemented based on graphs, such that they cannot handle large-scale datasets. How to conduct semi-supervised feature selection on large-scale datasets has become a challenging research problem. Moreover, existing multi-label feature selection algorithms rely on eigen-decomposition with heavy computational burden, which further prevent current feature selection algorithms from being applied for big data. In this paper, we propose a novel convex semi-supervised multi-label feature selection algorithm, which can be applied to large-scale datasets. We evaluate performance of the proposed algorithm over five benchmark datasets and compare the results with state-of-the-art supervised and semi-supervised feature selection algorithms as well as baseline using all features. The experimental results demonstrate that our proposed algorithm consistently achieve superiors performances.
A Spatially Sensitive Kernel to Predict Cognitive Performance from Short-Term Changes in Neural Structure
Ansari, M. Hidayath (University of Wisconsin-Madison) | Coen, Michael H. (University of Wisconsin-Madison) | Bendlin, Barbara B (University of Wisconsin-Madison) | Sager, Mark A (University of Wisconsin-Madison) | Johnson, Sterling C (University of Wisconsin-Madison)
This paper introduces a novel framework for performing machine learning onlongitudinal neuroimaging datasets. These datasets are characterized by theirsize, particularly their width (millions of features per data input). Specifically, we address the problem of detecting subtle, short-term changes inneural structure that are indicative of cognitive change and correlate withrisk factors for Alzheimer's disease. We introduce a new spatially-sensitivekernel that allows us to reason about individuals, as opposed to populations. In doing so, this paper presents the first evidence demonstrating that verysmall changes in white matter structure over a two year period can predictchange in cognitive function in healthy adults.
Role-Aware Conformity Modeling and Analysis in Social Networks
Zhang, Jing (Tsinghua University) | Tang, Jie (Tsinghua University) | Zhuang, Honglei (Universtiy of Illinois at Urbana-Champaign) | Leung, Cane Wing-Ki (Huawei Noah's Ark Lab) | Li, Juanzi (Tsinghua University)
Conformity is the inclination of a person to be influenced by others. In this paper, we study how the conformity tendency of a person changes with her role, as defined by her structural properties in a social network. We first formalize conformity using a utility function based on the conformity theory from social psychology, and validate the proposed utility function by proving the existence of Nash Equilibria when all users in a network behave according to it. We then extend and incorporate the utility function into a probabilistic topic model, called the Role-Conformity Model (RCM), for modeling user behaviors under the effect of conformity. We apply the proposed RCM to several academic research networks, and discover that people with higher degree and lower clustering coefficient are more likely to conform to others. We also evaluate RCM through the task of word usage prediction in academic publications, and show significant improvements over baseline models.