graphon
Nonparametric estimation of time-varying network connections by multi-stage smoothing
Lee, Jeonghwan, Li, Tianxi, Rothman, Adam J.
Time-varying networks arise in a variety of ubiquitous applications, such as functional brain connectivity [Thompson et al., 2017, Zhang et al., 2020], gene and genomic regulatory processes [Zhang and Cao, 2017, Bartlett et al., 2021], and social or economic environments [Snijders et al., 2010, Kolar et al., 2010]. In these contexts, measurements collected at different time points record how observed connections fluctuate, forming a sequence of network snapshots that reflect the temporal evolution of the underlying system. For example, fMRI studies yield time-indexed measurements of activity across brain regions, from which researchers construct connectivity networks that change over the scanning period [Bassett et al., 2011, Rubinov and Sporns, 2010]. Similarly, in political systems such as the U.S. Senate, legislative cosponsorship records give rise to network snapshots that naturally vary across sessions [Fowler, 2006, Kirkland and Gross, 2014]. General reviews of time-varying network analysis, including methodological developments and representative applications, are provided in Holme and Saram aki [2012] and Kim et al. [2018].
A graphon-signal analysis of graph neural networks
We present an approach for analyzing message passing graph neural networks (MPNNs) based on an extension of graphon analysis to a so called graphon-signal analysis. AMPNN is a function that takes a graph and a signal on the graph (a graph-signal) and returns some value. Since the input space of MPNNs is non-Euclidean, i.e., graphs can be of any size and topology, properties such as generalization are less well understood for MPNNs than for Euclidean neural networks. We claim that one important missing ingredient in past work is a meaningful notion of graph-signal similarity measure, that endows the space of inputs to MPNNs with a regular structure. We present such a similarity measure, called the graphon-signal cut distance, which makes the space of all graph-signals a dense subset of a compact metric space - the graphon-signal space.
Graphons, mergeons, and so on!
In this work we develop a theory of hierarchical clustering for graphs. Our modelling assumption is that graphs are sampled from a graphon, which is a powerful and general model for generating graphs and analyzing large networks. Graphons are a far richer class of graph models than stochastic blockmodels, the primary setting for recent progress in the statistical theory of graph clustering. We define what it means for an algorithm to produce the ``correct clustering, give sufficient conditions in which a method is statistically consistent, and provide an explicit algorithm satisfying these properties.
Low-Complexity and Consistent Graphon Estimation from Multiple Networks
Sogan, Roland Boniface, Rebafka, Tabea
Recovering the random graph model from an observed collection of networks is known to present significant challenges in the setting, where the networks do not share a common node set and have different sizes. More specifically, the goal is the estimation of the graphon function that parametrizes the nonparametric exchangeable random graph model. Existing methods typically suffer from either limited accuracy or high computational complexity. We introduce a new histogram-based estimator with low algorithmic complexity that achieves high accuracy by jointly aligning the nodes of all graphs, in contrast to most conventional methods that order nodes graph by graph. Consistency results of the proposed graphon estimator are established. A numerical study shows that the proposed estimator outperforms existing methods in terms of accuracy, especially when the dataset comprises only small and variable-size networks. Moreover, the computing time of the new method is considerably shorter than that of other consistent methodologies. Additionally, when applied to a graph neural network classification task, the proposed estimator enables more effective data augmentation, yielding improved performance across diverse real-world datasets.