Wang, Li
A Reinforced Topic-Aware Convolutional Sequence-to-Sequence Model for Abstractive Text Summarization
Wang, Li, Yao, Junlin, Tao, Yunzhe, Zhong, Li, Liu, Wei, Du, Qiang
In this paper, we propose a deep learning approach to automatic summarization that incorporates topic information into the convolutional sequence-to-sequence (ConvS2S) model and uses self-critical sequence training (SCST) for optimization. By jointly attending to topics and word-level alignments, our approach improves the coherence, diversity, and informativeness of generated summaries through a biased probability generation mechanism. Moreover, reinforcement training via SCST directly optimizes the proposed model with respect to the non-differentiable ROUGE metric, which also avoids exposure bias during inference. We carry out experimental evaluations against state-of-the-art methods on the Gigaword, DUC-2004, and LCSTS datasets. The empirical results demonstrate the superiority of our proposed method for abstractive summarization.
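As a rough illustration of the reinforcement-learning component, the sketch below shows how a self-critical sequence training loss can be formed from sampled and greedy ROUGE rewards; the tensor shapes, the reward source, and the `scst_loss` helper are illustrative assumptions, not the paper's implementation.

```python
# Minimal sketch of a self-critical sequence training (SCST) loss: the greedy
# decode serves as the reward baseline, and only the sampled sequence's token
# log-probabilities carry gradients. Rewards (e.g., ROUGE) are computed outside
# the graph because the metric is non-differentiable.
import torch

def scst_loss(log_probs, sampled_reward, baseline_reward):
    """log_probs: (batch, seq_len) log-probabilities of the sampled tokens;
    sampled_reward / baseline_reward: (batch,) ROUGE of sampled vs. greedy outputs."""
    advantage = sampled_reward - baseline_reward            # self-critical baseline
    return -(advantage.unsqueeze(1) * log_probs).sum(dim=1).mean()

# Stand-in tensors: in practice log_probs comes from the ConvS2S decoder and the
# rewards from a ROUGE scorer applied to the decoded summaries.
log_probs = torch.full((8, 30), -2.0, requires_grad=True)
loss = scst_loss(log_probs, torch.rand(8), torch.rand(8))
loss.backward()
```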
An Optimal Rewiring Strategy for Reinforcement Social Learning in Cooperative Multiagent Systems
Tang, Hongyao, Wang, Li, Wang, Zan, Baarslag, Tim, Hao, Jianye
Multiagent coordination in cooperative multiagent systems (MASs) has been widely studied in both the fixed-agent repeated-interaction setting and the static social learning framework. However, two aspects of the dynamics of real-world multiagent scenarios are missing from existing work. First, network topologies can be dynamic: agents may change their connections through rewiring over the course of their interactions. Second, the game matrix between each pair of agents may not be static and is usually not known a priori. Both the network dynamics and the game uncertainty increase the difficulty of coordination among agents. In this paper, we consider a multiagent dynamic social learning environment in which, in each round, each agent can choose to rewire to potential partners and then interacts with randomly chosen neighbors. We propose an optimal rewiring strategy that allows agents to select the most beneficial peers to interact with, so as to maximize their accumulated payoff over repeated interactions. We empirically demonstrate the effectiveness and robustness of our approach by comparing it with benchmark strategies. We also investigate the performance of three representative learning strategies under our social learning framework with optimal rewiring.
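To make the rewiring idea concrete, here is a minimal, hypothetical rewiring step in which an agent swaps its least valuable neighbor for the best candidate whenever the estimated net gain exceeds the rewiring cost; the payoff estimates, the cost model, and the function names are assumptions and do not reproduce the paper's optimal strategy.

```python
# Hypothetical sketch (not the paper's optimal strategy): replace the least
# valuable current neighbor with the candidate partner whose estimated payoff,
# net of a rewiring cost, yields the largest positive gain. `est_payoff` would
# come from the agent's learned estimates of each peer.

def rewire_once(neighbors, candidates, est_payoff, rewiring_cost):
    """neighbors / candidates: lists of peer ids; est_payoff: dict id -> estimated payoff."""
    if not neighbors or not candidates:
        return neighbors
    worst = min(neighbors, key=lambda a: est_payoff[a])
    best = max(candidates, key=lambda a: est_payoff[a])
    gain = est_payoff[best] - est_payoff[worst] - rewiring_cost
    if gain > 0:                      # rewire only when the expected net benefit is positive
        neighbors = [a for a in neighbors if a != worst] + [best]
    return neighbors

# Example: the agent drops neighbor 2 for candidate 5 when the gain exceeds the cost.
print(rewire_once([1, 2], [5, 6], {1: 0.6, 2: 0.2, 5: 0.9, 6: 0.4}, rewiring_cost=0.1))
```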
Latent Smooth Skeleton Embedding
Wang, Li (University of Illinois at Chicago) | Mao, Qi (HERE Company) | Tsang, Ivor W. (University of Technology Sydney)
Learning a smooth skeleton in a low-dimensional space from noisy data has become important in computer vision and computational biology. Existing methods assume that the manifold constructed from the data is smooth, but they lack the ability to model skeleton structures from noisy data. To overcome this issue, we propose a novel probabilistic structured learning model that learns the density of the latent embedding given high-dimensional data and its neighborhood graph. The embedded points that form a smooth skeleton structure are obtained by maximum a posteriori (MAP) estimation. Our analysis shows that the resulting similarity matrix is sparse and unique, and that its associated kernel has eigenvalues following a power-law distribution, which leads to embeddings that form a smooth skeleton. The model is extended to learn a sparse similarity matrix when the graph structure is unknown. Extensive experiments on various datasets demonstrate the effectiveness of the proposed methods in comparison with existing methods.
Probabilistic Dimensionality Reduction via Structure Learning
Wang, Li
We propose a novel probabilistic dimensionality reduction framework that naturally integrates a generative model with the locality information of the data. Based on this framework, we present a new model that learns a smooth skeleton of embedding points in a low-dimensional space from high-dimensional noisy data. The formulation of the new model can be equivalently interpreted as two coupled learning problems, i.e., structure learning and the learning of a projection matrix. This interpretation motivates learning embedding points that directly form an explicit graph structure. We develop a new method to learn embedding points that form a spanning tree, which is further extended to obtain a discriminative and compact feature representation for clustering problems. Unlike traditional clustering methods, we assume that cluster centers should be close to each other if they are connected in the learned graph, and distant otherwise. This greatly facilitates data visualization and scientific discovery in downstream analysis. Extensive experiments demonstrate that the proposed framework obtains discriminative feature representations and correctly recovers the intrinsic structures of various real-world datasets.
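The coupled view of structure learning and projection learning can be illustrated with a simple stand-in: project the data linearly and fit an explicit spanning tree over the embedded points. The sketch below uses plain PCA and a minimum spanning tree as placeholders rather than the probabilistic model proposed here.

```python
# Rough stand-in for the two coupled problems: a linear projection step followed
# by an explicit spanning-tree structure over the embedded points. This is plain
# PCA + MST, not the paper's probabilistic formulation.
import numpy as np
from scipy.spatial.distance import pdist, squareform
from scipy.sparse.csgraph import minimum_spanning_tree

def project_and_tree(X, dim=2):
    Xc = X - X.mean(axis=0)
    # Projection step: top principal directions (stand-in for the learned projection matrix).
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    Z = Xc @ Vt[:dim].T
    # Structure step: spanning tree over the embedded points.
    tree = minimum_spanning_tree(squareform(pdist(Z)))
    return Z, tree                    # tree is a sparse matrix of edge weights

Z, tree = project_and_tree(np.random.rand(100, 20))
print(tree.nnz)                       # a spanning tree over 100 points has 99 edges
```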
Learning Sparse Confidence-Weighted Classifier on Very High Dimensional Data
Tan, Mingkui (University of Adelaide) | Yan, Yan (University of Technology Sydney) | Wang, Li (University of Illinois at Chicago) | Hengel, Anton Van Den (University of Adelaide) | Tsang, Ivor W. (University of Technology Sydney) | Shi, Qinfeng (Javen) (University of Adelaide)
Confidence-weighted (CW) learning is a successful online learning paradigm that maintains a Gaussian distribution over classifier weights and adopts a covariance matrix to represent the uncertainty of the weight vectors. However, existing full CW learning paradigms have two deficiencies: sensitivity to irrelevant features, and poor scalability to high-dimensional data due to the maintenance of the covariance structure. In this paper, we begin by presenting an online-batch CW learning scheme, and then present a novel paradigm to learn sparse CW classifiers. The proposed paradigm essentially identifies feature groups and naturally builds a block-diagonal covariance structure, making it very suitable for CW learning over very high-dimensional data. Extensive experimental results demonstrate the superior performance of the proposed methods over state-of-the-art counterparts on classification and feature selection tasks.
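For intuition, the sketch below shows a simplified confidence-weighted style update with a purely diagonal covariance (an AROW-like rule); the paper's online-batch scheme and block-diagonal, sparsity-inducing covariance are more involved, so treat this only as a baseline illustration.

```python
# Simplified CW-style update with a *diagonal* covariance (AROW-like rule):
# confident (low-variance) features move less, uncertain features move more.
import numpy as np

def cw_diagonal_update(mu, sigma, x, y, r=1.0):
    """mu, sigma: weight mean and per-feature variance; x: features; y: label in {-1, +1}."""
    margin = y * (x @ mu)
    conf = (x * x) @ sigma                  # x^T diag(sigma) x
    beta = 1.0 / (conf + r)
    alpha = max(0.0, 1.0 - margin) * beta   # hinge-style step size
    mu = mu + alpha * y * sigma * x         # scale the update by per-feature confidence
    sigma = sigma - beta * (sigma * x) ** 2
    return mu, sigma

mu, sigma = np.zeros(5), np.ones(5)
mu, sigma = cw_diagonal_update(mu, sigma, np.array([1.0, 0.0, 2.0, 0.0, 1.0]), y=1)
print(mu, sigma)
```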
A Novel Regularized Principal Graph Learning Framework on Explicit Graph Representation
Mao, Qi, Wang, Li, Tsang, Ivor W., Sun, Yijun
Many scientific datasets are of high dimension, and their analysis usually requires retaining and visualizing the most important structures in the data. The principal curve is a widely used approach for this purpose. However, many existing methods work only for data whose structures are not self-intersecting, which is quite restrictive for real applications. A few methods can overcome this problem, but they either require complicated hand-crafted rules for a specific task, lacking convergence guarantees and the flexibility to adapt to different tasks, or cannot obtain explicit structures from the data. To address these issues, we develop a new regularized principal graph learning framework that captures the local information of the underlying graph structure based on reversed graph embedding. As showcases, we propose models that learn a spanning tree or a weighted undirected $\ell_1$ graph, and we develop a new learning algorithm that learns a set of principal points and a graph structure from the data simultaneously. The new algorithm is simple and has guaranteed convergence. We then extend the proposed framework to deal with large-scale data. Experimental results on various synthetic datasets and six real-world datasets show that the proposed method compares favorably with baselines and correctly uncovers the underlying structures.
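A single iteration of a principal-tree style procedure in the spirit of reversed graph embedding might look like the sketch below: assign data to principal points, pull each principal point toward its assigned data and its tree neighbors, then refit the spanning tree. The update rule and the regularization weight `lam` are simplifying assumptions, not the algorithm derived in the paper.

```python
# Hedged sketch of one principal-graph iteration: hard-assign data to principal
# points, blend each principal point toward its assigned data and its tree
# neighbors, and refit the spanning tree. Illustrative only.
import numpy as np
from scipy.spatial.distance import cdist, pdist, squareform
from scipy.sparse.csgraph import minimum_spanning_tree

def principal_tree_step(X, C, lam=0.1):
    """X: (n, d) data; C: (k, d) principal points; lam: graph regularization weight."""
    assign = np.argmin(cdist(X, C), axis=1)                    # nearest principal point
    tree = minimum_spanning_tree(squareform(pdist(C))).toarray()
    adj = (tree + tree.T) > 0                                  # symmetric tree adjacency
    C_new = np.empty_like(C)
    for k in range(C.shape[0]):
        pts = X[assign == k]
        data_mean = pts.mean(axis=0) if len(pts) else C[k]
        nbr_mean = C[adj[k]].mean(axis=0) if adj[k].any() else C[k]
        C_new[k] = (data_mean + lam * nbr_mean) / (1.0 + lam)  # data fit + tree smoothness
    return C_new

X = np.random.rand(200, 2)
C = X[np.random.choice(200, 10, replace=False)]
for _ in range(20):
    C = principal_tree_step(X, C)
```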
Matching Pursuit LASSO Part II: Applications and Sparse Recovery over Batch Signals
Tan, Mingkui, Tsang, Ivor W., Wang, Li
In Part I \cite{TanPMLPart1}, a Matching Pursuit LASSO (MPL) algorithm was presented for solving large-scale sparse recovery (SR) problems. In this paper, we present a subspace search to further improve the performance of MPL, and then address another major challenge of SR: batch SR with many signals, a setting largely absent from previous $\ell_1$-norm methods. As a result, a batch-mode MPL (BMPL) is developed to vastly speed up sparse recovery of many signals simultaneously. Comprehensive numerical experiments on compressive sensing and face recognition tasks demonstrate the superior performance of MPL and BMPL over the other methods considered in this paper, in terms of both sparse recovery ability and efficiency. In particular, BMPL is up to 400 times faster than existing $\ell_1$-norm methods considered to be state-of-the-art.
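To convey the flavor of batch sparse recovery, the sketch below greedily selects atoms by their aggregate correlation with the residuals of all signals and refits the coefficients jointly by least squares; this is a plain simultaneous-OMP style illustration under assumed names, not the MPL/BMPL algorithm itself.

```python
# Greedy batch sparse recovery sketch: one shared support is grown for all
# signals at once, with a joint least-squares refit at each step.
import numpy as np

def batch_greedy_recovery(D, Y, n_atoms):
    """D: (m, p) dictionary; Y: (m, s) batch of signals; returns a (p, s) coefficient matrix."""
    support = []
    residual = Y.copy()
    for _ in range(n_atoms):
        corr = np.abs(D.T @ residual).sum(axis=1)     # aggregate correlation per atom
        if support:
            corr[support] = -np.inf                   # do not reselect chosen atoms
        support.append(int(np.argmax(corr)))
        coef, *_ = np.linalg.lstsq(D[:, support], Y, rcond=None)
        residual = Y - D[:, support] @ coef
    W = np.zeros((D.shape[1], Y.shape[1]))
    W[support] = coef
    return W

# Two synthetic signals that are exactly sparse in D; the greedy loop should recover them.
D = np.random.randn(64, 256)
Y = D[:, [3, 17]] @ np.array([[1.0, 0.0], [0.0, 2.0]])
W = batch_greedy_recovery(D, Y, n_atoms=2)
```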
Convex Matching Pursuit for Large-Scale Sparse Coding and Subset Selection
Tan, Mingkui (Nanyang Technological University) | Tsang, Ivor W. (Nanyang Technological University) | Wang, Li (University of California) | Zhang, Xinming (Nanyang Technological University)
In this paper, a new convex matching pursuit scheme is proposed for tackling large-scale sparse coding and subset selection problems. In contrast with current matching pursuit algorithms such as subspace pursuit (SP), the proposed algorithm has a convex formulation and guarantees a monotonic decrease of the objective value. Moreover, theoretical analysis and experimental results show that the proposed method achieves better scalability while maintaining similar or better decoding ability compared with state-of-the-art methods on large-scale problems.