Goto

Collaborating Authors

 weisfeiler-lehman



Learning subtree pattern importance for Weisfeiler-Lehmanbased graph kernels

arXiv.org Artificial Intelligence

Graph is an usual representation of relational data, which are ubiquitous in manydomains such as molecules, biological and social networks. A popular approach to learningwith graph structured data is to make use of graph kernels, which measure the similaritybetween graphs and are plugged into a kernel machine such as a support vector machine.Weisfeiler-Lehman (WL) based graph kernels, which employ WL labeling scheme to extract subtree patterns and perform node embedding, are demonstrated to achieve great performance while being efficiently computable. However, one of the main drawbacks of ageneral kernel is the decoupling of kernel construction and learning process. For moleculargraphs, usual kernels such as WL subtree, based on substructures of the molecules, consider all available substructures having the same importance, which might not be suitable inpractice. In this paper, we propose a method to learn the weights of subtree patterns in the framework of WWL kernels, the state of the art method for graph classification task [14]. To overcome the computational issue on large scale data sets, we present an efficient learning algorithm and also derive a generalization gap bound to show its convergence. Finally, through experiments on synthetic and real-world data sets, we demonstrate the effectiveness of our proposed method for learning the weights of subtree patterns.


Graph Kernels: State-of-the-Art and Future Challenges

arXiv.org Machine Learning

Among the data structures commonly used in machine learning, graphs are arguably one of the most general. Graphs allow modelling complex objects as a collection of entities (nodes) and of relationships between such entities (edges), each of which can be annotated by metadata such as categorical or vectorial node and edge features. Many ubiquitous data types can be understood as particular cases of graphs, including unstructured vectorial data as well as structured data types such as time series, images, volumetric data, point clouds or bags of entities, to name a few. Most importantly, numerous applications benefit from the extra flexibility that graph-based representations provide. In chemoinformatics, graphs have been used extensively to represent molecular compounds (Trinajstic, 2018), with nodes corresponding to atoms, edges to chemical bonds, and node and edge features encoding known chemical properties of each atom and bond in the molecule. Machine learning approaches operating on such graph-based representations of molecules are becoming increasingly successful in learning to predict complex molecular properties from large annotated data sets (Duvenaud et al., 2015; Gilmer et al., 2017; Wu et al., 2018), offering a promising set of tools for drug discovery (Vamathevan et al., 2019). In computational biology, graphs have likewise risen to prominence due to their ability to describe multifaceted interactions between (biological) entities.


Wasserstein Weisfeiler-Lehman Graph Kernels

arXiv.org Machine Learning

Graph kernels are an instance of the class of $\mathcal{R}$-Convolution kernels, which measure the similarity of objects by comparing their substructures. Despite their empirical success, most graph kernels use a naive aggregation of the final set of substructures, usually a sum or average, thereby potentially discarding valuable information about the distribution of individual components. Furthermore, only a limited instance of these approaches can be extended to continuously attributed graphs. We propose a novel method that relies on the Wasserstein distance between the node feature vector distributions of two graphs, which allows to find subtler differences in data sets by considering graphs as high-dimensional objects, rather than simple means. We further propose a Weisfeiler-Lehman inspired embedding scheme for graphs with continuous node attributes and weighted edges, enhance it with the computed Wasserstein distance, and thus improve the state-of-the-art prediction performance on several graph classification tasks.


Dynamics Based Features For Graph Classification

arXiv.org Machine Learning

Numerous social, medical, engineering and biological challenges can be framed as graph-based learning tasks. Here, we propose a new feature based approach to network classification. We show how dynamics on a network can be useful to reveal patterns about the organization of the components of the underlying graph where the process takes place. We define generalized assortativities on networks and use them as generalized features across multiple time scales. These features turn out to be suitable signatures for discriminating between different classes of networks. Our method is evaluated empirically on established network benchmarks. We also introduce a new dataset of human brain networks (connectomes) and use it to evaluate our method. Results reveal that our dynamics based features are competitive and often outperform state of the art accuracies.