Goto

Collaborating Authors

 Huang, Wenbing


The Truly Deep Graph Convolutional Networks for Node Classification

arXiv.org Machine Learning

Existing Graph Convolutional Networks (GCNs) are shallow---the number of the layers is usually not larger than 2. The deeper variants by simply stacking more layers, unfortunately perform worse, even involving well-known tricks like weight penalizing, dropout, and residual connections. This paper reveals that developing deep GCNs mainly encounters two obstacles: \emph{over-fitting} and \emph{over-smoothing}. The over-fitting issue weakens the generalization ability on small graphs, while over-smoothing impedes model training by isolating output representations from the input features with the increase in network depth. Hence, we propose DropEdge, a novel technique to alleviate both issues. At its core, DropEdge randomly removes a certain number of edges from the input graphs, acting like a data augmenter and also a message passing reducer. More importantly, DropEdge enables us to recast a wider range of Convolutional Neural Networks (CNNs) from the image field to the graph domain; in particular, we study DenseNet and InceptionNet in this paper. Extensive experiments on several benchmarks demonstrate that our method allows deep GCNs to achieve promising performance, even when the number of layers exceeds 30---the deepest GCN that has ever been proposed.


Label Aware Graph Convolutional Network -- Not All Edges Deserve Your Attention

arXiv.org Machine Learning

Graph classification is practically important in many domains. To solve this problem, one usually calculates a low-dimensional representation for each node in the graph with supervised or unsupervised approaches. Most existing approaches consider all the edges between nodes while overlooking whether the edge will brings positive or negative influence to the node representation learning. In many real-world applications, however, some connections among the nodes can be noisy for graph convolution, and not all the edges deserve your attention. In this work, we distinguish the positive and negative impacts of the neighbors to the node in graph node classification, and propose to enhance the graph convolutional network by considering the labels between the neighbor edges. We present a novel GCN framework, called Label-aware Graph Convolutional Network (LAGCN), which incorporates the supervised and unsupervised learning by introducing the edge label predictor. As a general model, LAGCN can be easily adapted in various previous GCN and enhance their performance with some theoretical guarantees. Experimental results on multiple real-world datasets show that LAGCN is competitive against various state-of-the-art methods in graph classification.


Adversarial Representation Learning on Large-Scale Bipartite Graphs

arXiv.org Machine Learning

Graph representation on large-scale bipartite graphs is central for a variety of applications, ranging from social network analysis to recommendation system development. Existing methods exhibit two key drawbacks: 1. unable to characterize the inconsistency of the node features within the bipartite-specific structure; 2. unfriendly to support large-scale bipartite graphs. To this end, we propose ABCGraph, a scalable model for unsupervised learning on large-scale bipartite graphs. At its heart, ABCGraph utilizes the proposed Bipartite Graph Convolutional Network (BGCN) as the encoder and adversarial learning as the training loss to learn representations from nodes in two different domains and bipartite structures, in an unsupervised manner. Moreover, we devise a cascaded architecture to capture the multi-hop relationship in bipartite structure and improves the scalability as well. Extensive experiments on multiple datasets of varying scales verify the effectiveness of ABCGraph compared to state-of-the-arts. For the experiment on a real-world large-scale bipartite graph system, fast training speed and low memory cost demonstrate the scalability of ABCGraph model.


Weakly Supervised Dense Event Captioning in Videos

Neural Information Processing Systems

Dense event captioning aims to detect and describe all events of interest contained in a video. Despite the advanced development in this area, existing methods tackle this task by making use of dense temporal annotations, which is dramatically source-consuming. This paper formulates a new problem: weakly supervised dense event captioning, which does not require temporal segment annotations for model training. Our solution is based on the one-to-one correspondence assumption, each caption describes one temporal segment, and each temporal segment has one caption, which holds in current benchmark datasets and most real-world cases. We decompose the problem into a pair of dual problems: event captioning and sentence localization and present a cycle system to train our model. Extensive experimental results are provided to demonstrate the ability of our model on both dense event captioning and sentence localization in videos.


Weakly Supervised Dense Event Captioning in Videos

Neural Information Processing Systems

Dense event captioning aims to detect and describe all events of interest contained in a video. Despite the advanced development in this area, existing methods tackle this task by making use of dense temporal annotations, which is dramatically source-consuming. This paper formulates a new problem: weakly supervised dense event captioning, which does not require temporal segment annotations for model training. Our solution is based on the one-to-one correspondence assumption, each caption describes one temporal segment, and each temporal segment has one caption, which holds in current benchmark datasets and most real-world cases. We decompose the problem into a pair of dual problems: event captioning and sentence localization and present a cycle system to train our model. Extensive experimental results are provided to demonstrate the ability of our model on both dense event captioning and sentence localization in videos.


Adaptive Sampling Towards Fast Graph Representation Learning

Neural Information Processing Systems

Graph Convolutional Networks (GCNs) have become a crucial tool on learning representations of graph vertices. The main challenge of adapting GCNs on large-scale graphs is the scalability issue that it incurs heavy cost both in computation and memory due to the uncontrollable neighborhood expansion across layers. In this paper, we accelerate the training of GCNs through developing an adaptive layer-wise sampling method. By constructing the network layer by layer in a top-down passway, we sample the lower layer conditioned on the top one, where the sampled neighborhoods are shared by different parent nodes and the over expansion is avoided owing to the fixed-size sampling. More importantly, the proposed sampler is adaptive and applicable for explicit variance reduction, which in turn enhances the training of our method. Furthermore, we propose a novel and economical approach to promote the message passing over distant nodes by applying skip connections. Intensive experiments on several benchmarks verify the effectiveness of our method regarding the classification accuracy while enjoying faster convergence speed.


Efficient Optimization for Linear Dynamical Systems with Applications to Clustering and Sparse Coding

Neural Information Processing Systems

Linear Dynamical Systems (LDSs) are fundamental tools for modeling spatio-temporal data in various disciplines. Though rich in modeling, analyzing LDSs is not free of difficulty, mainly because LDSs do not comply with Euclidean geometry and hence conventional learning techniques can not be applied directly. In this paper, we propose an efficient projected gradient descent method to minimize a general form of a loss function and demonstrate how clustering and sparse coding with LDSs can be solved by the proposed method efficiently. To this end, we first derive a novel canonical form for representing the parameters of an LDS, and then show how gradient-descent updates through the projection on the space of LDSs can be achieved dexterously. In contrast to previous studies, our solution avoids any approximation in LDS modeling or during the optimization process. Extensive experiments reveal the superior performance of the proposed method in terms of the convergence and classification accuracy over state-of-the-art techniques.


Efficient Spatio-Temporal Tactile Object Recognition with Randomized Tiling Convolutional Networks in a Hierarchical Fusion Strategy

AAAI Conferences

Robotic tactile recognition aims at identifying target objects or environments from tactile sensory readings. The advancement of unsupervised feature learning and biological tactile sensing inspire us proposing the model of 3T-RTCN that performs spatio-temporal feature representation and fusion for tactile recognition. It decomposes tactile data into spatial and temporal threads, and incorporates the strength of randomized tiling convolutional networks. Experimental evaluations show that it outperforms some state-of-the-art methods with a large margin regarding recognition accuracy, robustness, and fault-tolerance; we also achieve an order-of-magnitude speedup over equivalent networks with pretraining and finetuning. Practical suggestions and hints are summarized in the end for effectively handling the tactile data.