The Hong Kong Polytechnic University
Deeper Insights Into Graph Convolutional Networks for Semi-Supervised Learning
Li, Qimai (The Hong Kong Polytechnic University) | Han, Zhichao (ETH Zurich, The Hong Kong Polytechnic University) | Wu, Xiao-ming (The Hong Kong Polytechnic University)
Many interesting problems in machine learning are being revisited with new deep learning tools. For graph-based semi-supervised learning, a recent important development is graph convolutional networks (GCNs), which nicely integrate local vertex features and graph topology in the convolutional layers. Although the GCN model compares favorably with other state-of-the-art methods, its mechanisms are not clear and it still requires a considerable amount of labeled data for validation and model selection. In this paper, we develop deeper insights into the GCN model and address its fundamental limits. First, we show that the graph convolution of the GCN model is actually a special form of Laplacian smoothing, which is the key reason why GCNs work, but it also raises the concern of over-smoothing when many convolutional layers are stacked. Second, to overcome the limits of the GCN model with shallow architectures, we propose both co-training and self-training approaches to train GCNs. Our approaches significantly improve GCNs in learning with very few labels, and exempt them from requiring additional labels for validation. Extensive experiments on benchmarks have verified our theory and proposals.
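The Laplacian-smoothing view and the over-smoothing concern can be illustrated with a minimal sketch. It assumes the commonly used GCN propagation matrix D^{-1/2} (A + I) D^{-1/2} (the abstract does not spell out the formula), drops the learned weights and nonlinearities, and uses a hypothetical 4-vertex path graph; the function names are illustrative only.

```python
import numpy as np

def normalized_adjacency(adj):
    """Return D^{-1/2} (A + I) D^{-1/2}, with D the degree matrix of A + I.

    Assumption: this is the standard renormalized propagation matrix used
    in GCN layers; it is not given in the abstract itself.
    """
    a_tilde = adj + np.eye(adj.shape[0])      # add self-loops
    deg = a_tilde.sum(axis=1)                 # degrees including self-loops
    d_inv_sqrt = 1.0 / np.sqrt(deg)
    return a_tilde * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]

def smooth(features, adj, num_layers):
    """Apply the propagation step num_layers times (no weights, no ReLU)."""
    a_hat = normalized_adjacency(adj)
    out = features.astype(float)
    for _ in range(num_layers):
        out = a_hat @ out                     # each vertex mixes with its neighbours
    return out

def spread(x):
    """Gap between the largest and smallest feature value across vertices."""
    return float(x.max() - x.min())

# Hypothetical toy graph: a path 0 - 1 - 2 - 3 with one scalar feature per vertex.
adj = np.array([[0, 1, 0, 0],
                [1, 0, 1, 0],
                [0, 1, 0, 1],
                [0, 0, 1, 0]], dtype=float)
features = np.array([[1.0], [0.0], [0.0], [-1.0]])

for layers in (1, 2, 10, 50):
    print(layers, spread(smooth(features, adj, layers)))
```

With a couple of propagation steps the two ends of the path remain clearly separated, whereas after many steps the vertex features become nearly indistinguishable, which is the behaviour the abstract refers to as over-smoothing.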
Deep Representation-Decoupling Neural Networks for Monaural Music Mixture Separation
Li, Zhuo (The Hong Kong Polytechnic University) | Wang, Hongwei (Shanghai Jiao Tong University) | Zhao, Miao (The Hong Kong Polytechnic University) | Li, Wenjie (The Hong Kong Polytechnic University) | Guo, Minyi (Shanghai Jiao Tong University)
Monaural source separation (MSS) aims to extract and reconstruct different sources from a single-channel mixture, which could facilitate a variety of applications such as chord recognition, pitch estimation and automatic transcription. In this paper, we study the problem of separating vocals and instruments from monaural music mixtures. Existing works on monaural source separation either utilize linear and shallow models (e.g., non-negative matrix factorization), or do not explicitly address the coupling and tangling of multiple sources in the original input signals; hence they do not perform satisfactorily in real-world scenarios. To overcome the above limitations, we propose a novel end-to-end framework for monaural music mixture separation called Deep Representation-Decoupling Neural Networks (DRDNN). DRDNN takes advantage of both traditional signal processing methods and popular deep learning models. For each input music mixture, DRDNN converts it to a two-dimensional time-frequency spectrogram using the short-time Fourier transform (STFT), followed by stacked convolutional neural network (CNN) layers and long short-term memory (LSTM) layers to extract more condensed features. Afterwards, DRDNN utilizes a decoupling component, which consists of a group of multi-layer perceptrons (MLPs), to decouple the features further into different separated sources. The design of the decoupling component in DRDNN produces purified single-source signals for subsequent full-size restoration, and can significantly improve the performance of the final separation. Through extensive experiments on a real-world dataset, we show that DRDNN outperforms state-of-the-art baselines in the task of monaural music mixture separation and reconstruction.
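The first stage described above, turning a single-channel waveform into a two-dimensional time-frequency spectrogram, can be sketched as follows. This covers only the STFT front end, not the CNN, LSTM or MLP components; the frame length, hop size, Hann window, sample rate and the synthetic two-tone "mixture" are all assumptions chosen for illustration, not values from the paper.

```python
import numpy as np

def stft_magnitude(signal, frame_len=256, hop=128):
    """Magnitude spectrogram via a short-time Fourier transform.

    Returns an array of shape (frame_len // 2 + 1, n_frames):
    rows are frequency bins, columns are time frames.
    """
    window = np.hanning(frame_len)                      # taper each frame
    n_frames = 1 + (len(signal) - frame_len) // hop
    frames = np.stack([
        signal[i * hop: i * hop + frame_len] * window
        for i in range(n_frames)
    ])
    return np.abs(np.fft.rfft(frames, axis=1)).T

# Hypothetical mixture: two sinusoids standing in for two sources.
sample_rate = 8000
t = np.arange(sample_rate) / sample_rate                # one second of audio
mixture = np.sin(2 * np.pi * 440 * t) + 0.5 * np.sin(2 * np.pi * 1000 * t)

spec = stft_magnitude(mixture)
mean_spectrum = spec.mean(axis=1)                       # average over time frames
bin_hz = sample_rate / 256                              # width of one frequency bin
print(spec.shape, mean_spectrum.argmax() * bin_hz)
```

The resulting array is the kind of image-like input that convolutional layers can consume; both tones appear as horizontal ridges at their respective frequency bins, which is what makes the sources easier to tell apart than in the raw waveform.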
Modeling Scientific Influence for Research Trending Topic Prediction
Chen, Chengyao (The Hong Kong Polytechnic University) | Wang, Zhitao (The Hong Kong Polytechnic University) | Li, Wenjie (The Hong Kong Polytechnic University) | Sun, Xu (Peking University)
With the growing volume of publications in the Computer Science (CS) discipline, tracking the research evolution and predicting the future research trending topics are of great importance for researchers to keep up with the rapid progress of research. Within a research area, there are many top conferences that publish the latest research results. These conferences mutually influence each other and jointly promote the development of the research area. To predict the trending topics of mutually influenced conferences, we propose a correlated neural influence model, which has the ability to capture the sequential properties of research evolution in each individual conference and discover the dependencies among different conferences simultaneously. The experiments conducted on a scientific dataset including conferences in artificial intelligence and data mining show that our model consistently outperforms the other state-of-the-art methods. We also demonstrate the interpretability and predictability of the proposed model by providing its answers to two questions of concern, i.e., what the next rising trending topics are and for each conference who the most influential peer is.
A Probabilistic Hierarchical Model for Multi-View and Multi-Feature Classification
Li, Jinxing (The Hong Kong Polytechnic University) | Yong, Hongwei (The Hong Kong Polytechnic University) | Zhang, Bob (University of Macau) | Li, Mu (The Hong Kong Polytechnic University) | Zhang, Lei (The Hong Kong Polytechnic University) | Zhang, David (The Hong Kong Polytechnic University)
Some recent works in classification show that data obtained for an object from various views with different sensors contributes to achieving remarkable performance. In many real-world applications, each view often contains multiple features, which means that this type of data has a hierarchical structure, while most existing works do not take these features with multi-layer structure into consideration simultaneously. In this paper, a probabilistic hierarchical model is proposed to address this issue and applied to classification. In our model, a latent variable is first learned to fuse the multiple features obtained from the same view, sensor or modality. In particular, mapping matrices corresponding to a certain view are estimated to project the latent variable from a shared space to the multiple observations. Since this method is designed for the supervised setting, we assume that the latent variables associated with different views are influenced by their ground-truth label. To fit the proposed model effectively, the Expectation-Maximization (EM) algorithm is applied to estimate the parameters and latent variables. Extensive experimental results on synthetic and two real-world datasets substantiate the effectiveness and superiority of our approach as compared with state-of-the-art methods.
Faithful to the Original: Fact Aware Neural Abstractive Summarization
Cao, Ziqiang (The Hong Kong Polytechnic University) | Wei, Furu (Microsoft Research Asia) | Li, Wenjie (The Hong Kong Polytechnic University) | Li, Sujian (Peking University)
Unlike extractive summarization, abstractive summarization has to fuse different parts of the source text, which tends to create fake facts. Our preliminary study reveals that nearly 30% of the outputs from a state-of-the-art neural summarization system suffer from this problem. While previous abstractive summarization approaches usually focus on improving informativeness, we argue that faithfulness is also a vital prerequisite for a practical abstractive summarization system. To avoid generating fake facts in a summary, we leverage open information extraction and dependency parsing technologies to extract actual fact descriptions from the source text. A dual-attention sequence-to-sequence framework is then proposed to force the generation to be conditioned on both the source text and the extracted fact descriptions. Experiments on the Gigaword benchmark dataset demonstrate that our model can reduce fake summaries by 80%. Notably, the fact descriptions also bring significant improvement in informativeness since they often condense the meaning of the source text.
SFCN-OPI: Detection and Fine-Grained Classification of Nuclei Using Sibling FCN With Objectness Prior Interaction
Zhou, Yanning (The Chinese University of Hong Kong) | Dou, Qi (The Chinese University of Hong Kong) | Chen, Hao (The Chinese University of Hong Kong) | Qin, Jing (The Hong Kong Polytechnic University) | Heng, Pheng-Ann (The Chinese University of Hong Kong)
Cell nuclei detection and fine-grained classification have been fundamental yet challenging problems in histopathology image analysis. Due to the tiny size of nuclei, significant inter-/intra-class variances, and inferior image quality, previous automated methods easily suffer from limited accuracy and robustness. Meanwhile, existing approaches usually deal with these two tasks independently, which neglects how closely related they are. In this paper, we present a novel method, a sibling fully convolutional network with objectness prior interaction (called SFCN-OPI), to tackle the two tasks simultaneously and interactively using a unified end-to-end framework. Specifically, the sibling FCN branches share features in earlier layers while holding respective higher layers for specific tasks. More importantly, the detection branch outputs the objectness prior, which dynamically interacts with the fine-grained classification sibling branch during the training and testing processes. With this mechanism, the fine-grained classification successfully focuses on regions with high confidence of nuclei existence and outputs the conditional probability, which in turn benefits the detection through back-propagation. Extensive experiments on colon cancer histology images have validated the effectiveness of our proposed SFCN-OPI, and our method has outperformed the state-of-the-art methods by a large margin.
Adversarial Network Embedding
Dai, Quanyu (The Hong Kong Polytechnic University) | Li, Qiang (The Hong Kong Polytechnic University) | Tang, Jian (HEC Montreal, Montreal Institute for Learning Algorithms) | Wang, Dan (The Hong Kong Polytechnic University)
Learning low-dimensional representations of networks has proved effective in a variety of tasks such as node classification, link prediction and network visualization. Existing methods can effectively encode different structural properties into the representations, such as neighborhood connectivity patterns, global structural role similarities and other high-order proximities. However, apart from objectives that capture network structural properties, most of them lack additional constraints for enhancing the robustness of the representations. In this paper, we aim to exploit the strengths of generative adversarial networks in capturing latent features, and investigate their contribution to learning stable and robust graph representations. Specifically, we propose an Adversarial Network Embedding (ANE) framework, which leverages the adversarial learning principle to regularize the representation learning. It consists of two components, i.e., a structure preserving component and an adversarial learning component. The former component aims to capture network structural properties, while the latter contributes to learning robust representations by matching the posterior distribution of the latent representations to given priors. As shown by the empirical results, our method is competitive with or superior to state-of-the-art approaches on benchmark network embedding tasks.
Dialogue Generation With GAN
Su, Hui (The Hong Kong Polytechnic University) | Shen, Xiaoyu (Max Planck Institute Informatics) | Hu, Pengwei (The Hong Kong Polytechnic University) | Li, Wenjie (The Hong Kong Polytechnic University) | Chen, Yun (The University of Hong Kong)
This paper presents a Generative Adversarial Network (GAN) to model multi-turn dialogue generation, which trains a latent hierarchical recurrent encoder-decoder simultaneously with a discriminative classifier that makes the prior approximate the posterior. Experiments show that our model achieves better results.
GraphGAN: Graph Representation Learning With Generative Adversarial Nets
Wang, Hongwei (Shanghai Jiao Tong University) | Wang, Jia (The Hong Kong Polytechnic University) | Wang, Jialin (Huazhong University of Science and Technology) | Zhao, Miao (The Hong Kong Polytechnic University) | Zhang, Weinan (Shanghai Jiao Tong University) | Zhang, Fuzheng (Microsoft Research Asia) | Xie, Xing (Microsoft Research Asia) | Guo, Minyi (Shanghai Jiao Tong University)
The goal of graph representation learning is to embed each vertex in a graph into a low-dimensional vector space. Existing graph representation learning methods can be classified into two categories: generative models that learn the underlying connectivity distribution in the graph, and discriminative models that predict the probability of edge existence between a pair of vertices. In this paper, we propose GraphGAN, an innovative graph representation learning framework unifying the above two classes of methods, in which the generative model and the discriminative model play a game-theoretical minimax game. Specifically, for a given vertex, the generative model tries to fit its underlying true connectivity distribution over all other vertices and produces "fake" samples to fool the discriminative model, while the discriminative model tries to detect whether the sampled vertex is from the ground truth or generated by the generative model. With the competition between these two models, both of them can alternately and iteratively boost their performance. Moreover, for the implementation of the generative model, we propose a novel graph softmax to overcome the limitations of the traditional softmax function, which can be proven to satisfy the desirable properties of normalization, graph structure awareness, and computational efficiency. Through extensive experiments on real-world datasets, we demonstrate that GraphGAN achieves substantial gains in a variety of applications, including link prediction, node classification, and recommendation, over state-of-the-art baselines.
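The normalization property claimed for graph softmax can be checked with a small sketch. The abstract does not give the formula, so the construction below is an assumption: a breadth-first tree is built from the source vertex, each step along the tree is scored by a softmax over tree neighbours using embedding inner products, and the probability of a target vertex is the product of the step probabilities along its tree path times one final step back to its parent. The toy graph, the random embeddings and all function names are hypothetical.

```python
import numpy as np
from collections import deque

def bfs_tree(adj_list, root):
    """Return {vertex: parent} for a BFS tree rooted at root (root -> None)."""
    parent = {root: None}
    queue = deque([root])
    while queue:
        u = queue.popleft()
        for w in adj_list[u]:
            if w not in parent:
                parent[w] = u
                queue.append(w)
    return parent

def graph_softmax(adj_list, emb, root):
    """Distribution over all vertices other than root (assumed construction)."""
    parent = bfs_tree(adj_list, root)
    # Tree neighbours of a vertex: its parent (if any) plus its children.
    tree_nbrs = {v: [] for v in parent}
    for v, p in parent.items():
        if p is not None:
            tree_nbrs[v].append(p)
            tree_nbrs[p].append(v)

    def step_prob(src, dst):
        """Softmax over tree neighbours of src, evaluated at dst."""
        nbrs = tree_nbrs[src]
        scores = np.array([emb[n] @ emb[src] for n in nbrs])
        weights = np.exp(scores - scores.max())   # numerically stable softmax
        return weights[nbrs.index(dst)] / weights.sum()

    probs = {}
    for v in parent:
        if v == root:
            continue
        path = [v]
        while path[-1] != root:                   # walk up to the root
            path.append(parent[path[-1]])
        path.reverse()                            # now root ... v
        p = 1.0
        for a, b in zip(path[:-1], path[1:]):
            p *= step_prob(a, b)                  # descend along the tree path
        p *= step_prob(path[-1], path[-2])        # final step back to the parent
        probs[v] = p
    return probs

# Hypothetical 5-vertex graph and random 4-dimensional embeddings.
adj_list = {0: [1, 2], 1: [0, 2, 3], 2: [0, 1], 3: [1, 4], 4: [3]}
emb = np.random.default_rng(0).normal(size=(5, 4))
probs = graph_softmax(adj_list, emb, root=0)
print(probs, sum(probs.values()))
```

Each probability only touches the vertices on one tree path and their tree neighbours, rather than every vertex in the graph as a plain softmax would, which is one way to read the "graph structure awareness" and "computational efficiency" properties named above.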
Learning a Wavelet-Like Auto-Encoder to Accelerate Deep Neural Networks
Chen, Tianshui (Sun Yat-sen University) | Lin, Liang (Sun Yat-sen University) | Zuo, Wangmeng (Harbin Institute of Technology) | Luo, Xiaonan (Guilin University of Electronic Technology) | Zhang, Lei (The Hong Kong Polytechnic University)
Accelerating deep neural networks (DNNs) has been attracting increasing attention as it can benefit a wide range of applications, e.g., enabling mobile systems with limited computing resources to have powerful visual recognition ability. A practical strategy toward this goal usually relies on a two-stage process: operating on the trained DNNs (e.g., approximating the convolutional filters with tensor decomposition) and fine-tuning the amended network, which makes it difficult to balance the trade-off between acceleration and maintaining recognition performance. In this work, aiming at a general and comprehensive way for neural network acceleration, we develop a Wavelet-like Auto-Encoder (WAE) that decomposes the original input image into two low-resolution channels (sub-images) and incorporate the WAE into the classification neural networks for joint training. The two decomposed channels, in particular, are encoded to carry the low-frequency information (e.g., image profiles) and high-frequency information (e.g., image details or noise), respectively, and enable reconstructing the original input image through the decoding process. Then, we feed the low-frequency channel into a standard classification network such as VGG or ResNet and employ a very lightweight network to fuse with the high-frequency channel to obtain the classification result. Compared to existing DNN acceleration solutions, our framework has the following advantages: i) it is compatible with any existing convolutional neural network for classification without amending its structure; ii) the WAE provides an interpretable way to preserve the main components of the input image for classification.