latent community
Hierarchical-Graph-Structured Edge Partition Models for Learning Evolving Community Structure
We propose a novel dynamic network model to capture evolving latent communities within temporal networks. To achieve this, we decompose each observed dynamic edge between vertices using a Poisson-gamma edge partition model, assigning each vertex to one or more latent communities through nonnegative vertex-community memberships. Specifically, hierarchical transition kernels are employed to model the interactions between these latent communities in the observed temporal network. A hierarchical graph prior is placed on the transition structure of the latent communities, allowing us to model how they evolve and interact over time. Consequently, our dynamic network model enables the inferred communities to merge, split, and interact with one another, providing a comprehensive understanding of complex network dynamics. Experiments on various real-world network datasets demonstrate that the proposed model not only effectively uncovers interpretable latent structures but also surpasses other state-of-the-art dynamic network models in the tasks of link prediction and community detection.
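The Poisson-gamma edge partition idea mentioned in this abstract can be illustrated with a minimal static sketch. This is not the paper's dynamic model: the hierarchical transition kernels and graph prior are omitted, and the function name `sample_edge_partition`, the gamma hyperparameters, and the community weights `r` are assumptions chosen for illustration. Each vertex pair gets a latent Poisson count whose rate sums over communities, and an edge is observed when that count is positive.

```python
import numpy as np

def sample_edge_partition(n_vertices, n_communities, seed=0):
    """Sample one undirected graph from a static edge partition sketch."""
    rng = np.random.default_rng(seed)
    # Nonnegative vertex-community memberships (hypothetical gamma prior).
    phi = rng.gamma(shape=0.5, scale=1.0, size=(n_vertices, n_communities))
    # Nonnegative community activity weights (hypothetical gamma prior).
    r = rng.gamma(shape=1.0, scale=1.0, size=n_communities)
    # Poisson rate for each pair (i, j): sum_k r_k * phi_ik * phi_jk.
    rate = (phi * r) @ phi.T
    counts = rng.poisson(rate)
    # Keep the strict upper triangle and mirror it: undirected, no self-loops.
    counts = np.triu(counts, k=1)
    counts = counts + counts.T
    # An edge is observed whenever the latent count is at least one.
    adjacency = (counts > 0).astype(int)
    return phi, r, counts, adjacency

phi, r, counts, adjacency = sample_edge_partition(n_vertices=30, n_communities=4)
```

Because memberships are nonnegative rather than one-hot, a vertex with large values in several columns of `phi` participates in several communities at once.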
- Asia > China > Guangdong Province (0.04)
- North America > United States > New York > New York County > New York City (0.04)
Refined Graph Encoder Embedding via Self-Training and Latent Community Recovery
Shen, Cencheng, Larson, Jonathan, Trinh, Ha, Priebe, Carey E.
This paper introduces a refined graph encoder embedding method, enhancing the original graph encoder embedding using linear transformation, self-training, and hidden community recovery within observed communities. We provide the theoretical rationale for the refinement procedure, demonstrating how and why our proposed method can effectively identify useful hidden communities via stochastic block models, and how the refinement method leads to improved vertex embedding and better decision boundaries for subsequent vertex classification. The efficacy of our approach is validated through a collection of simulated and real-world graph data.
- North America > United States > Delaware > New Castle County > Newark (0.14)
- North America > United States > Washington > King County > Redmond (0.04)
- North America > United States > California > Santa Clara County > Palo Alto (0.04)
Cascade-based Echo Chamber Detection
Minici, Marco, Cinus, Federico, Monti, Corrado, Bonchi, Francesco, Manco, Giuseppe
Although echo chambers in social media have been under considerable scrutiny, general models for their detection and analysis are missing. In this work, we aim to fill this gap by proposing a probabilistic generative model that explains social media footprints -- i.e., social network structure and propagations of information -- through a set of latent communities, characterized by a degree of echo-chamber behavior and by an opinion polarity. Specifically, echo chambers are modeled as communities that are permeable to pieces of information with similar ideological polarity, and impermeable to information of opposed leaning: this allows discriminating echo chambers from communities that lack a clear ideological alignment. To learn the model parameters we propose a scalable, stochastic adaptation of the Generalized Expectation Maximization algorithm that optimizes the joint likelihood of observing social connections and information propagation. Experiments on synthetic data show that our algorithm is able to correctly reconstruct ground-truth latent communities with their degree of echo-chamber behavior and opinion polarity. Experiments on real-world data about polarized social and political debates, such as the Brexit referendum or the COVID-19 vaccine campaign, confirm the effectiveness of our proposal in detecting echo chambers. Finally, we show how our model can improve accuracy in auxiliary predictive tasks, such as stance detection and prediction of future propagations.
- Europe > Italy > Piedmont > Turin Province > Turin (0.04)
- North America > United States > Georgia > Fulton County > Atlanta (0.04)
- North America > United States > Virginia (0.04)
- Information Technology > Data Science (1.00)
- Information Technology > Communications > Social Media (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.68)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.46)
Perfect Spectral Clustering with Discrete Covariates
Hehir, Jonathan, Niu, Xiaoyue, Slavkovic, Aleksandra
A structural pattern commonly observed in social networks is homophily, the tendency for two nodes sharing a certain trait to be more (or sometimes less) likely to form a connection [27]. Homophily may occur on any number of traits, observed or latent, and is known to confound problems of causal inference in the social sciences [38; 36; 11; 23]. Homophily, meanwhile, lies at the heart of such issues as segregation [37; 14], job access [21], and political partisanship [20], where homophily on observed traits may be the subject of estimation in its own right. In order to fully understand the effects of network patterns like observed homophily, we first need to separate them from further latent network structure. In the literature on community detection, latent structure is frequently recovered through a clustering process involving only the network edges, reserving node covariates to validate the clustering results in an approach that conflates latent structure with observed structure [32]. What we wish to do instead is to separate the latent from the observed structural patterns. To this end, we consider an extension of the stochastic block model (SBM) [16] that incorporates homophily on observed, discrete node covariates into a generalized linear model (GLM). We define this model, which we call the additive-covariate SBM (ACSBM), in Section 2. The model was previously studied by Mele et al. [29] and allows for flexible modeling choices in which latent communities take a block model structure, covariates may or may not depend on community membership, and the effects of homophily may be modeled through a range of link functions. We give an explicit representation of this model as an SBM (Proposition 1), which motivates the use of spectral clustering to estimate the latent structure.
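The claim that a covariate-augmented block model can be rewritten as an ordinary SBM can be checked numerically with a small sketch. The exact ACSBM parameterization is not given in this excerpt, so the form used here is an assumption: a logit link, a block matrix `B` on the link scale, and a single homophily coefficient `beta` added when two nodes share a covariate value. All names (`acsbm_edge_probs`, `expanded_sbm`) are hypothetical.

```python
import numpy as np

def sigmoid(t):
    return 1.0 / (1.0 + np.exp(-t))

def acsbm_edge_probs(z, x, B, beta):
    """Edge probabilities: block effect plus additive covariate homophily."""
    same_cov = (x[:, None] == x[None, :]).astype(float)
    return sigmoid(B[z][:, z] + beta * same_cov)

def expanded_sbm(B, beta, n_cov_levels):
    """Block matrix of the equivalent SBM over (community, covariate) pairs."""
    K = B.shape[0]
    M = n_cov_levels
    # Expanded block c = k * M + m stands for community k, covariate level m.
    comm = np.repeat(np.arange(K), M)
    cov = np.tile(np.arange(M), K)
    same_cov = (cov[:, None] == cov[None, :]).astype(float)
    return sigmoid(B[comm][:, comm] + beta * same_cov)

rng = np.random.default_rng(0)
n, K, M = 60, 2, 3
z = rng.integers(0, K, size=n)   # latent community labels
x = rng.integers(0, M, size=n)   # observed discrete covariate
B = np.array([[0.5, -2.0], [-2.0, 0.0]])  # illustrative link-scale block effects
beta = 1.0                                 # illustrative homophily effect

P = acsbm_edge_probs(z, x, B, beta)
B_tilde = expanded_sbm(B, beta, M)
c = z * M + x                     # expanded block label per node
P_sbm = B_tilde[c][:, c]
```

Under these assumptions `P` and `P_sbm` agree entry by entry, so the graph is an SBM with K * M blocks. This is the kind of structure that makes spectral clustering a natural estimator, although the clustering step itself is not shown here.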
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- North America > United States > Pennsylvania > Centre County > University Park (0.04)
Efficient Online Inference for Bayesian Nonparametric Relational Models
Kim, Dae Il, Gopalan, Prem K., Blei, David, Sudderth, Erik
Stochastic block models characterize observed network relationships via latent community memberships. In large social networks, we expect entities to participate in multiple communities, and the number of communities to grow with the network size. We introduce a new model for these phenomena, the hierarchical Dirichlet process relational model, which allows nodes to have mixed membership in an unbounded set of communities. To allow scalable learning, we derive an online stochastic variational inference algorithm. Focusing on assortative models of undirected networks, we also propose an efficient structured mean field variational bound, and online methods for automatically pruning unused communities. Compared to state-of-the-art online learning methods for parametric relational models, we show significantly improved perplexity and link prediction accuracy for sparse networks with tens of thousands of nodes. We also showcase an analysis of LittleSis, a large network of who-knows-who at the heights of business and government.
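To make the notion of mixed membership in an assortative model concrete, here is a minimal generative sketch. It swaps the hierarchical Dirichlet process for a finite symmetric Dirichlet prior over a fixed number of communities, so it is a truncated stand-in rather than the model described above. The function name, the hyperparameters `alpha` and `eps`, and the beta prior on within-community strengths are all assumptions.

```python
import numpy as np

def sample_assortative_mmsb(n_nodes, n_communities, alpha=0.5, eps=0.01, seed=0):
    """Sample an undirected graph from a finite assortative mixed-membership SBM."""
    rng = np.random.default_rng(seed)
    # Each node has a distribution over communities (mixed membership).
    pi = rng.dirichlet(np.full(n_communities, alpha), size=n_nodes)
    # Within-community connection strengths (hypothetical beta prior).
    w = rng.beta(5.0, 1.0, size=n_communities)
    adjacency = np.zeros((n_nodes, n_nodes), dtype=int)
    for i in range(n_nodes):
        for j in range(i + 1, n_nodes):
            # Each endpoint picks the community it acts in for this pair.
            s = rng.choice(n_communities, p=pi[i])
            r = rng.choice(n_communities, p=pi[j])
            # Assortative: strong link only if both pick the same community.
            prob = w[s] if s == r else eps
            adjacency[i, j] = adjacency[j, i] = int(rng.random() < prob)
    return pi, w, adjacency

pi, w, adjacency = sample_assortative_mmsb(n_nodes=40, n_communities=3)
```

A nonparametric version would let the number of communities grow with the data instead of fixing `n_communities` in advance.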
The Nonparametric Metadata Dependent Relational Model
Kim, Dae Il, Hughes, Michael, Sudderth, Erik
We introduce the nonparametric metadata dependent relational (NMDR) model, a Bayesian nonparametric stochastic block model for network data. The NMDR allows the entities associated with each node to have mixed membership in an unbounded collection of latent communities. Learned regression models allow these memberships to depend on, and be predicted from, arbitrary node metadata. We develop efficient MCMC algorithms for learning NMDR models from partially observed node relationships. Retrospective MCMC methods allow our sampler to work directly with the infinite stick-breaking representation of the NMDR, avoiding the need for finite truncations. Our results demonstrate recovery of useful latent communities from real-world social and ecological networks, and the usefulness of metadata in link prediction tasks.
- Asia > Middle East > Jordan (0.05)
- Oceania > New Zealand (0.04)
- North America > United States > Rhode Island > Providence County > Providence (0.04)
- Health & Medicine (0.46)
- Law (0.46)
- Information Technology (0.34)