
Masked Label Prediction: Unified Message Passing Model for Semi-Supervised Classification Machine Learning

Graph neural networks (GNNs) and the label propagation algorithm (LPA) are both message passing algorithms that have achieved superior performance in semi-supervised classification. A GNN performs feature propagation through a neural network to make predictions, while LPA propagates labels across the graph adjacency matrix to get results. However, there is still no effective way to combine these two kinds of algorithms. In this paper, we propose a new Unified Message Passing Model (UniMP) that incorporates feature propagation and label propagation in a shared message passing network, providing better performance in semi-supervised classification. First, we adopt a Graph Transformer together with label embedding to propagate both the feature and label information. Second, to train UniMP without overfitting to self-loop label information, we propose a masked label prediction strategy, in which some percentage of training labels are masked at random and then predicted. UniMP conceptually unifies feature propagation and label propagation and is empirically powerful. It obtains new state-of-the-art semi-supervised classification results on the Open Graph Benchmark (OGB). Our implementation is available online.
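
The masking step described in this abstract can be illustrated with a small sketch. It is not the authors' implementation: the function name `mask_training_labels`, the dictionary-based data layout, and the 30% mask rate are assumptions chosen for illustration. The idea shown is only that visible training labels are fed to the model as one-hot inputs, while masked training labels are zeroed out in the input and kept as prediction targets.

```python
import random


def mask_training_labels(train_labels, num_classes, mask_rate, rng):
    """Hide a random fraction of training labels (hypothetical helper).

    train_labels: dict mapping node id -> class index (training nodes only).
    Returns (label_input, masked_nodes):
      label_input  maps each training node to a one-hot list, all zeros if masked;
      masked_nodes is the sorted list of nodes whose labels must be predicted.
    """
    nodes = sorted(train_labels)
    num_masked = int(round(mask_rate * len(nodes)))
    masked = set(rng.sample(nodes, num_masked))
    label_input = {}
    for node in nodes:
        one_hot = [0.0] * num_classes
        if node not in masked:
            # Visible label: available to the network as an input signal.
            one_hot[train_labels[node]] = 1.0
        label_input[node] = one_hot
    return label_input, sorted(masked)


rng = random.Random(0)  # fixed seed so the example is repeatable
train_labels = {0: 1, 1: 0, 2: 2, 3: 1, 4: 0, 5: 2, 6: 1, 7: 0, 8: 2, 9: 1}
label_input, masked_nodes = mask_training_labels(train_labels, 3, 0.3, rng)
```

In a training loop under these assumptions, the loss would be computed only on `masked_nodes`, so the model cannot simply copy a node's own label from its input.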

Non-iterative Label Propagation in Optimal Leading Forest Artificial Intelligence

Graph-based semi-supervised learning (GSSL) has an intuitive representation and can be improved by exploiting matrix calculation. However, it has to perform iterative optimization to achieve a preset objective, which usually leads to low efficiency. Another inconvenience of GSSL is that when new data arrive, the graph construction and the optimization have to be conducted all over again. We propose a sound assumption: neighboring data points are not in a peer-to-peer relation, but in a partially ordered relation induced by the local density and the distance between the data; and the label of a center can be regarded as the contribution of its followers. Starting from this assumption, we develop a highly efficient non-iterative label propagation algorithm based on a novel data structure named optimal leading forest (LaPOLeaF). This study addresses the major weaknesses of traditional GSSL. We further scale LaPOLeaF to accommodate big data by utilizing a block distance matrix technique, parallel computing, and Locality-Sensitive Hashing (LSH). Experiments on large datasets show promising results for the proposed methods.
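
One way to picture a partial order induced by local density and distance is sketched below. This is an assumption-laden illustration, not the LaPOLeaF algorithm: it borrows the density-peaks idea of linking each point to its nearest neighbor of strictly higher density, and the function name `leading_nodes`, the Gaussian kernel, and the `cutoff` parameter are all hypothetical choices.

```python
import math


def leading_nodes(points, cutoff):
    """Return, for each point, the index of its nearest higher-density neighbor.

    A point with no higher-density neighbor is a density peak (a root) and
    gets None. Density is a Gaussian-kernel sum over all other points.
    """
    n = len(points)
    dist = [[math.dist(p, q) for q in points] for p in points]
    density = [
        sum(math.exp(-((dist[i][j] / cutoff) ** 2)) for j in range(n) if j != i)
        for i in range(n)
    ]
    leaders = []
    for i in range(n):
        higher = [j for j in range(n) if density[j] > density[i]]
        # Follow the closest point that is denser than this one, if any.
        leaders.append(min(higher, key=lambda j: dist[i][j]) if higher else None)
    return leaders


points = [(0.0,), (1.0,), (2.0,), (10.0,), (11.0,)]
leaders = leading_nodes(points, cutoff=2.0)
```

The resulting follower-to-leader links form a tree rooted at the densest point. Splitting that tree into a forest and passing labels along it are left out of this sketch.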

Wasserstein Soft Label Propagation on Hypergraphs: Algorithm and Generalization Error Bounds Machine Learning

Inspired by recent interest in developing machine learning and data mining algorithms on hypergraphs, we investigate in this paper the semi-supervised learning algorithm of propagating "soft labels" (e.g., probability distributions, class membership scores) over hypergraphs by means of optimal transportation. Borrowing insights from Wasserstein propagation on graphs [Solomon et al. 2014], we reformulate the label propagation procedure as a message-passing algorithm, which lends itself naturally to a generalization applicable to hypergraphs through Wasserstein barycenters. Furthermore, in a PAC learning framework, we provide generalization error bounds for propagating one-dimensional distributions on graphs and hypergraphs using the 2-Wasserstein distance, by establishing the algorithmic stability of the proposed semi-supervised learning algorithm. These theoretical results also shed new light on Wasserstein propagation on graphs.
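
For one-dimensional distributions, the 2-Wasserstein distance and barycenter have closed forms in terms of quantile functions, which the following sketch illustrates. It is not the paper's propagation algorithm: it assumes empirical distributions with equal sample counts, and the function names are hypothetical.

```python
import math


def w2_distance_1d(xs, ys):
    """2-Wasserstein distance between two equal-size empirical distributions on the line."""
    if len(xs) != len(ys):
        raise ValueError("this sketch assumes equal sample counts")
    # In 1D the optimal transport plan matches sorted samples in order.
    pairs = zip(sorted(xs), sorted(ys))
    return math.sqrt(sum((x - y) ** 2 for x, y in pairs) / len(xs))


def w2_barycenter_1d(sample_sets, weights):
    """Weighted 2-Wasserstein barycenter: the weighted average of sorted samples."""
    sorted_sets = [sorted(s) for s in sample_sets]
    n = len(sorted_sets[0])
    if any(len(s) != n for s in sorted_sets):
        raise ValueError("this sketch assumes equal sample counts")
    total = sum(weights)
    return [
        sum(w * s[k] for w, s in zip(weights, sorted_sets)) / total
        for k in range(n)
    ]


a = [0.0, 1.0, 2.0]
b = [4.0, 2.0, 3.0]
distance = w2_distance_1d(a, b)                     # 2.0
barycenter = w2_barycenter_1d([a, b], [1.0, 1.0])   # [1.0, 2.0, 3.0]
```

A message-passing update in this spirit would replace a node's soft label with the barycenter of its neighbors' soft labels, with the weights taken from the edges or hyperedges.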

Unifying Graph Convolutional Neural Networks and Label Propagation Machine Learning

Label Propagation (LPA) and Graph Convolutional Neural Networks (GCN) are both message passing algorithms on graphs. Both solve the task of node classification, but LPA propagates node label information across the edges of the graph, while GCN propagates and transforms node feature information. However, while the two are conceptually similar, the theoretical relationship between LPA and GCN has not yet been investigated. Here we study the relationship between LPA and GCN in terms of two aspects: (1) feature/label smoothing, where we analyze how the feature/label of one node is spread over its neighbors; and (2) feature/label influence, that is, how much the initial feature/label of one node influences the final feature/label of another node. Based on our theoretical analysis, we propose an end-to-end model that unifies GCN and LPA for node classification. In our unified model, edge weights are learnable, and the LPA serves as regularization to assist the GCN in learning proper edge weights that lead to improved classification performance. Our model can also be seen as learning attention weights based on node labels, which is more task-oriented than existing feature-based attention models. In a number of experiments on real-world graphs, our model shows superiority over state-of-the-art GCN-based methods in terms of node classification accuracy.
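
As a reference point for the LPA side of this comparison, here is a minimal sketch of classic label propagation on an unweighted, undirected graph. It does not implement the unified model or its learnable edge weights. The function name, the adjacency-list layout, and the fixed iteration count are assumptions for illustration.

```python
def label_propagation(adjacency, seed_labels, num_classes, num_iters=50):
    """Propagate class scores from labeled seeds to the rest of the graph.

    adjacency:   dict node -> list of neighbor nodes.
    seed_labels: dict node -> class index for the labeled nodes.
    Returns a dict node -> predicted class index.
    """
    def one_hot(c):
        return [1.0 if k == c else 0.0 for k in range(num_classes)]

    scores = {
        v: one_hot(seed_labels[v]) if v in seed_labels else [0.0] * num_classes
        for v in adjacency
    }
    for _ in range(num_iters):
        new_scores = {}
        for v, nbrs in adjacency.items():
            if v in seed_labels:
                # Clamp labeled nodes to their known label at every step.
                new_scores[v] = one_hot(seed_labels[v])
            elif nbrs:
                # Unlabeled nodes take the mean of their neighbors' scores.
                new_scores[v] = [
                    sum(scores[u][k] for u in nbrs) / len(nbrs)
                    for k in range(num_classes)
                ]
            else:
                new_scores[v] = scores[v]
        scores = new_scores
    return {v: max(range(num_classes), key=lambda k: s[k]) for v, s in scores.items()}


# Path graph 0 - 1 - 2 - 3 with the two end nodes labeled.
adjacency = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
predictions = label_propagation(adjacency, {0: 0, 3: 1}, num_classes=2)
```

Each unlabeled node ends up with the label of the closer seed. Replacing the uniform mean with a weighted mean is where learnable edge weights would enter.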

Iterative label cleaning for transductive and semi-supervised few-shot learning Artificial Intelligence

Few-shot learning amounts to learning representations and acquiring knowledge such that novel tasks may be solved with both supervision and data being limited. Improved performance is possible by transductive inference, where the entire test set is available concurrently, and semi-supervised learning, where more unlabeled data is available. These problems are closely related because there is little or no adaptation of the representation in novel tasks. Focusing on these two settings, we introduce a new algorithm that leverages the manifold structure of the labeled and unlabeled data distribution to predict pseudo-labels, while balancing over classes and using the loss value distribution of a limited-capacity classifier to select the cleanest labels, iteratively improving the quality of pseudo-labels. Our solution sets a new state of the art on four benchmark datasets, namely miniImageNet, tieredImageNet, CUB and CIFAR-FS, while being robust over feature space pre-processing and the quantity of available data.