homophily
Non-Euclidean Mixture Model for Social Network Embedding
It is largely agreed that social network links are formed due to either homophily or social influence. Inspired by this, we aim to understand how links are generated by providing a novel embedding-based graph formation model. Unlike existing graph representation learning, where link generation probabilities are defined as a simple function of the corresponding node embeddings, we model link generation as a mixture model of the two factors. In addition, we model the homophily factor in spherical space and the influence factor in hyperbolic space to accommodate the fact that (1) homophily results in cycles and (2) influence results in hierarchies in networks. We also design a special projection to align these two spaces. We call this model the Non-Euclidean Mixture Model (NMM).
2 Preliminary. We use A E to denote the existence of an edge between node u and v, otherwise A
Graph homophily refers to the phenomenon that connected nodes tend to share similar characteristics. Understanding this concept and its related metrics is crucial for designing effective Graph Neural Networks (GNNs). The most widely used homophily metrics, such as edge or node homophily, quantify such "similarity" as label consistency across the graph topology. These metrics are believed to be able to reflect the performance of GNNs, especially on node-level tasks. However, many recent studies have empirically demonstrated that the performance of GNNs does not always align with homophily metrics, and how homophily influences GNNs still remains unclear and controversial.
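As a concrete reading of "label consistency across the graph topology", the following minimal sketch computes edge homophily as the fraction of edges whose endpoints share a label. The function name `edge_homophily` and the toy graph are illustrative assumptions, not taken from the paper.

```python
def edge_homophily(edges, labels):
    """Fraction of edges whose two endpoints carry the same label.

    edges  : list of (u, v) pairs, each undirected edge listed once
    labels : dict mapping node id -> class label
    """
    if not edges:
        raise ValueError("edge homophily is undefined for a graph with no edges")
    same = sum(1 for u, v in edges if labels[u] == labels[v])
    return same / len(edges)


# Hypothetical toy graph: 4 nodes, 5 edges, 2 classes.
toy_edges = [(0, 1), (1, 2), (2, 3), (3, 0), (0, 2)]
toy_labels = {0: "a", 1: "a", 2: "b", 3: "b"}

# Only (0, 1) and (2, 3) connect same-label nodes, so the value is 2/5.
print(edge_homophily(toy_edges, toy_labels))
```

A value near 1 indicates that most edges connect same-label nodes; a value near 0 indicates the opposite.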
Appendix for "What makes graph neural networks miscalibrated?" We report the homophily index proposed by Pei et al. [16], which provides a global view of the neighborhood similarity for a graph. Given a graph G = (V, E), the homophily is defined as $H(G) = \frac{1}{|V|} \sum_{i \in V} \frac{\text{number of node } i\text{'s neighbors who have the same label as } i}{\text{number of node } i\text{'s neighbors}}$. A.2 Details of model training setup. We follow the setting of Shchur et al. [20] to define GCN [7] and GAT [22] models. Both models consist of 2 layers and the hidden dimension is fixed to 64.
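The homophily index above can be sketched in a few lines. This assumes the usual statement of the index: for each node, take the fraction of its neighbors that share its label, then average over nodes. The function name `node_homophily`, the handling of isolated nodes (skipped here), and the toy graph are assumptions made for illustration.

```python
from collections import defaultdict


def node_homophily(edges, labels):
    """Average, over nodes, of the fraction of same-label neighbors.

    edges  : list of (u, v) pairs for an undirected graph without self-loops
    labels : dict mapping node id -> class label
    Nodes with no neighbors are skipped (an assumption of this sketch).
    """
    neighbors = defaultdict(set)
    for u, v in edges:
        neighbors[u].add(v)
        neighbors[v].add(u)
    if not neighbors:
        raise ValueError("no node has a neighbor")

    ratios = []
    for node, nbrs in neighbors.items():
        same = sum(1 for n in nbrs if labels[n] == labels[node])
        ratios.append(same / len(nbrs))
    return sum(ratios) / len(ratios)


# Hypothetical toy graph: 4 nodes, 5 edges, 2 classes.
g_edges = [(0, 1), (1, 2), (2, 3), (3, 0), (0, 2)]
g_labels = {0: "a", 1: "a", 2: "b", 3: "b"}

# Per-node ratios are 1/3, 1/2, 1/3, 1/2, so the mean is 5/12.
print(node_homophily(g_edges, g_labels))
```

Because the average runs over nodes rather than edges, low-degree nodes weigh as much as hubs, so this value can differ from an edge-level fraction on the same graph.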
R3, R4] find the paper clear; most reviewers [R2, R3, R4] find the problem of overcoming the implicit homophily
We thank the reviewers for their thoughtful and constructive feedback. While we only address major discussion points here, we will incorporate all feedback in the final version. We'll revise our paper to clarify the scope of our contributions. While we agree that the designs are not new, our analysis for the heterophily setting is novel. We'll make this more clear.
Contrastive Laplacian Eigenmaps (Supplementary Material) Ke Sun
In this section, we perform the transition from Eq. (4) to Eq. (5). We note that Eq. (4) relies on two terms: E … Combining Eq. (16) and (17) gives Eq. (5). Based on Eq. (15), we can extend Eq. (11) into two different terms: … For brevity, we omit b in the above result, whose role in Eq. (11) is to normalize by the block size, e.g., the number of links between v and u (and some b … Based on the above derivations, Eq. (11) can be reformulated as: … Looking at Eq. (8), we notice that for SampledNCE with sigmoid, one can think of D(x) and 1 − D(x … To this end, we make a simple assumption. To validate our intuition, Figure 2 shows Acc. between COLES-GCN and GCN+SampledNCE as a function of H = H(G … We use the same experimental setting as the one used for the results reported in Table 2. After sorting the results by homophily in ascending order, we note that the overall trend agrees with our expectation: for small H, both methods struggle more, as it is harder for the contrastive setting to find distinctive embeddings.
Characterizing Graph Datasets for Node Classification: Homophily-Heterophily Dichotomy and Beyond Denis Kuznedelev HSE University
Homophily is a graph property describing the tendency of edges to connect similar nodes; the opposite is called heterophily. It is often believed that heterophilous graphs are challenging for standard message-passing graph neural networks (GNNs), and much effort has been put into developing efficient methods for this setting. However, there is no universally agreed-upon measure of homophily in the literature. In this work, we show that commonly used homophily measures have critical drawbacks preventing the comparison of homophily levels across different datasets. For this, we formalize desirable properties for a proper homophily measure and verify which measures satisfy which properties.
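One way to see why raw homophily values can be hard to compare across datasets is to look at what a label-blind graph would score. This is an illustration only, not the paper's own argument. It assumes both endpoints of an edge receive labels independently from the class distribution, in which case the expected fraction of same-label edges is the sum of squared class proportions. The function name `chance_edge_homophily` is hypothetical.

```python
def chance_edge_homophily(class_proportions):
    """Expected same-label edge fraction when edges ignore labels.

    Assumes each endpoint's label is drawn independently from
    class_proportions, so P(same label) = sum_k p_k ** 2.
    """
    if abs(sum(class_proportions) - 1.0) > 1e-9:
        raise ValueError("class proportions must sum to 1")
    return sum(p * p for p in class_proportions)


# Three hypothetical label distributions.
balanced_two = [0.5, 0.5]        # baseline 0.5
balanced_ten = [0.1] * 10        # baseline about 0.1
imbalanced_two = [0.9, 0.1]      # baseline about 0.82

for name, dist in [("2 balanced", balanced_two),
                   ("10 balanced", balanced_ten),
                   ("2 imbalanced", imbalanced_two)]:
    print(name, round(chance_edge_homophily(dist), 4))
```

Under this assumption, an observed same-label edge fraction of 0.5 would be at chance level for two balanced classes, well above it for ten balanced classes, and below it for a 90/10 split, so the same number does not mean the same thing on different datasets.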