 structural node embedding


Structural Node Embeddings with Homomorphism Counts

arXiv.org Artificial Intelligence

Graph homomorphism counts, first explored by Lovász in 1967, have recently garnered interest as a powerful tool in graph-based machine learning. Grohe (PODS 2020) proposed the theoretical foundations for using homomorphism counts in machine learning for graph-level as well as node-level tasks. By their very nature, homomorphism counts capture local structural information, which enables the creation of robust structural embeddings. While a first approach for graph-level tasks has been made by Nguyen and Maehara (ICML 2020), we experimentally show the effectiveness of homomorphism-count-based node embeddings. Enriched with node labels, node weights, and edge weights, these offer an interpretable representation of graph data, allowing for enhanced explainability of machine learning models. We propose a theoretical framework for isomorphism-invariant homomorphism-count-based embeddings which lend themselves to a wide variety of downstream tasks. Our approach capitalises on the efficient computability of graph homomorphism counts for bounded treewidth graph classes, rendering it a practical solution for real-world applications. We demonstrate the expressivity of these embeddings through experiments on benchmark datasets. Although our results do not match the accuracy of state-of-the-art neural architectures, they are comparable to other advanced graph learning models. Remarkably, our approach distinguishes itself by ensuring explainability for each individual feature. By integrating interpretable machine learning algorithms like SVMs or Random Forests, we establish a seamless, end-to-end explainable pipeline. Our study contributes to the advancement of graph-based techniques that offer both performance and interpretability.
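To make the idea of a homomorphism-count node embedding concrete, here is a minimal sketch restricted to rooted tree patterns (treewidth 1), where the counts can be computed by a simple bottom-up recursion. It is an illustration under stated assumptions, not the authors' implementation: the function names `rooted_tree_hom_counts` and `embed`, the nested-tuple encoding of trees, and the choice of four patterns are all hypothetical, and node labels and weights are ignored.

```python
def rooted_tree_hom_counts(tree, adjacency):
    """Count homomorphisms from a rooted tree into a graph, per target node.

    tree: nested tuple of child subtrees; () is a single node (assumed encoding).
    adjacency: dict mapping each node to a list of its neighbours.
    Returns a dict: node v -> number of homomorphisms sending the root to v.
    """
    # Recursively solve each child subtree first.
    child_counts = [rooted_tree_hom_counts(child, adjacency) for child in tree]
    counts = {}
    for v, neighbours in adjacency.items():
        total = 1
        # Each child of the root must map to a neighbour of v; the choices
        # for different children are independent, so the counts multiply.
        for child in child_counts:
            total *= sum(child[u] for u in neighbours)
        counts[v] = total
    return counts


def embed(adjacency, patterns):
    """Stack the counts for several patterns into one vector per node."""
    per_pattern = [rooted_tree_hom_counts(t, adjacency) for t in patterns]
    return {v: [c[v] for c in per_pattern] for v in adjacency}


# Target graph: the path a - b - c.
graph = {"a": ["b"], "b": ["a", "c"], "c": ["b"]}

# Patterns: single node, single edge, root with two leaves, path of length 2.
patterns = [(), ((),), ((), ()), (((),),)]

embedding = embed(graph, patterns)
print(embedding)
# {'a': [1, 1, 1, 2], 'b': [1, 2, 4, 2], 'c': [1, 1, 1, 2]}
```

Each coordinate is tied to one named pattern, so it can be read off directly: the second entry is the degree, the third is the squared degree, and the fourth is the number of walks of length 2 starting at the node. The two end nodes, which are structurally equivalent, receive identical vectors.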


Digraphwave: Scalable Extraction of Structural Node Embeddings via Diffusion on Directed Graphs

arXiv.org Artificial Intelligence

Structural node embeddings, vectors capturing local connectivity information for each node in a graph, have many applications in data mining and machine learning, e.g., network alignment, node classification, clustering, and anomaly detection. For the analysis of directed graphs, e.g., transaction graphs, communication networks and social networks, the capability to capture directional information in the structural node embeddings is highly desirable, as is scalability of the embedding extraction method. Most existing methods are nevertheless designed only for undirected graphs. Therefore, we present Digraphwave, a scalable algorithm for extracting structural node embeddings on directed graphs. The Digraphwave embeddings consist of compressed diffusion pattern signatures, which are twice enhanced to increase their discriminative capacity. By proving a lower bound on the heat contained in the local vicinity of a diffusion initialization node, theoretically justified diffusion timescale values are established, and Digraphwave is left with only two easy-to-interpret hyperparameters: the embedding dimension and a neighbourhood resolution specifier. In our experiments, the two embedding enhancements, named transposition and aggregation, are shown to lead to a significant increase in macro F1 score for classifying automorphic identities, with Digraphwave outperforming all other structural embedding baselines. Moreover, Digraphwave either outperforms or matches the performance of all baselines on real graph datasets, displaying a particularly large performance gain in a network alignment task, while also being scalable to graphs with millions of nodes and edges, running up to 30x faster than a previous diffusion-pattern-based method and with a fraction of the memory consumption.
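The notion of a diffusion pattern on a directed graph can be illustrated with a small heat-kernel computation. The sketch below is not the Digraphwave algorithm: it assumes a random-walk Laplacian L = I - D_out^{-1} A, evaluates exp(-tL) with a truncated Taylor series, and omits the compression, transposition, and aggregation steps mentioned in the abstract. The names `mat_mul` and `heat_kernel` and the `terms` parameter are hypothetical.

```python
def mat_mul(a, b):
    """Multiply two dense matrices given as lists of rows."""
    inner, cols = len(b), len(b[0])
    return [[sum(row[k] * b[k][j] for k in range(inner)) for j in range(cols)]
            for row in a]


def heat_kernel(adjacency, t, terms=30):
    """Approximate exp(-t L) for L = I - D_out^{-1} A (assumed Laplacian).

    adjacency[i][j] is 1 if there is an edge i -> j, else 0.
    Row i of the result is the heat distribution at time t after starting
    with all heat on node i and letting it flow along out-edges.
    """
    n = len(adjacency)
    laplacian = [[0.0] * n for _ in range(n)]
    for i in range(n):
        out_degree = sum(adjacency[i])
        if out_degree == 0:
            continue  # sink node: heat that arrives here stays here
        for j in range(n):
            laplacian[i][j] = (1.0 if i == j else 0.0) - adjacency[i][j] / out_degree

    identity = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    kernel = [row[:] for row in identity]
    term = identity
    # Taylor series: exp(-tL) = sum over k of (-tL)^k / k!
    for k in range(1, terms):
        product = mat_mul(term, laplacian)
        term = [[-t / k * x for x in row] for row in product]
        kernel = [[x + y for x, y in zip(r1, r2)] for r1, r2 in zip(kernel, term)]
    return kernel


# Directed path 0 -> 1 -> 2.
adjacency = [[0, 1, 0],
             [0, 0, 1],
             [0, 0, 0]]
H = heat_kernel(adjacency, t=1.0)
for row in H:
    print([round(x, 4) for x in row])
```

Because the edges are directed, heat started at node 0 spreads forward to nodes 1 and 2, while heat started at the sink node 2 never leaves it. The rows therefore differ between nodes that would look alike if edge directions were dropped, which is the kind of directional information the abstract refers to.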