block pair
Supplemental Material: CHIP: AHawkes Process Model for Continuous-time Networkswith Scalable and Consistent Estimation
A.1 CommunityDetection The spectral clustering algorithm for directed networks that we consider in this paper is shown in Algorithm A.1. It can be applied either to the weighted adjacency (count) matrixN or the unweighted adjacency matrixA, where Aij =1{Nij >0} and 1{ } denotes the indicator function of the argument. This algorithm is used for the community detection step in our proposed CHIP estimationprocedure. For undirectednetworks, which we use for the theoreticalanalysisin Section 4, spectral clustering is performed by running k-means clustering on the rows of theeigenvector matrix of N or A, not the rows of the concatenated singular vector matrix. A.2 Estimation of Hawkes process parameters Ozaki (1979) derived the log-likelihood function for Hawkes processes with exponential kernels, which takes the form: logL= µT+ The threeparameters µ,α,β can be estimatedby maximizing (A.1) using standard numerical methods for non-linear optimization (Nocedal & Wright, 2006). We provide closed-form equations for estimating mab =αab/βab and µab in (2).
- North America > United States > Ohio (0.04)
- North America > United States > Massachusetts > Suffolk County > Boston (0.04)
- North America > United States > Ohio (0.04)
- North America > Canada (0.04)
- Research Report > Strength High (0.46)
- Research Report > Experimental Study (0.46)
Supplemental Material: CHIP: A Hawkes Process Model for Continuous-time Networks with Scalable and Consistent Estimation
The spectral clustering algorithm for directed networks that we consider in this paper is shown in Algorithm A.1. Theorem B.1 provides an upper bound to the error rate of spectral clustering on the weighted The effect of this term is negligible as T, so we ignore it. We now present an upper bound on the error rate for communities (analogous to Theorem B.1) estimated from the unweighted adjacency matrix The upper bounds on the error rates in Theorems B.1 and B.2 are not very informative in terms of In Section 4.1, we considered a simplified special case Similarly, we have the following result for spectral clustering using the unweighted adjacency matrix A . 3 Theorem B.3. Hence the unweighted adjacency matrix has a 1 in almost all entries, and the community structure cannot be detected from this matrix. The density of the aggregate adjacency matrix is governed by the parameters of the CHIP model.
- North America > United States > Ohio (0.04)
- North America > United States > Massachusetts > Suffolk County > Boston (0.04)
- North America > United States > Ohio (0.04)
- North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.04)
- Research Report > Strength High (0.46)
- Research Report > Experimental Study (0.46)
Spectral clustering for dependent community Hawkes process models of temporal networks
Zhao, Lingfei, Soliman, Hadeel, Xu, Kevin S., Paul, Subhadeep
Temporal networks observed continuously over time through timestamped relational events data are commonly encountered in application settings including online social media communications, financial transactions, and international relations. Temporal networks often exhibit community structure and strong dependence patterns among node pairs. This dependence can be modeled through mutual excitations, where an interaction event from a sender to a receiver node increases the possibility of future events among other node pairs. We provide statistical results for a class of models that we call dependent community Hawkes (DCH) models, which combine the stochastic block model with mutually exciting Hawkes processes for modeling both community structure and dependence among node pairs, respectively. We derive a non-asymptotic upper bound on the misclustering error of spectral clustering on the event count matrix as a function of the number of nodes and communities, time duration, and the amount of dependence in the model. Our result leverages recent results on bounding an appropriate distance between a multivariate Hawkes process count vector and a Gaussian vector, along with results from random matrix theory. We also propose a DCH model that incorporates only self and reciprocal excitation along with highly scalable parameter estimation using a Generalized Method of Moments (GMM) estimator that we demonstrate to be consistent for growing network size and time duration.
- North America > United States > Ohio > Franklin County > Columbus (0.04)
- North America > United States > Ohio > Lucas County > Toledo (0.04)
- North America > United States > Ohio > Cuyahoga County > Cleveland (0.04)
- (2 more...)
- Government (0.48)
- Banking & Finance (0.34)
- Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
- Information Technology > Data Science > Data Mining (0.94)
- Information Technology > Communications > Social Media (0.89)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.46)
The Multivariate Community Hawkes Model for Dependent Relational Events in Continuous-time Networks
Soliman, Hadeel, Zhao, Lingfei, Huang, Zhipeng, Paul, Subhadeep, Xu, Kevin S.
The stochastic block model (SBM) is one of the most widely used generative models for network data. Many continuous-time dynamic network models are built upon the same assumption as the SBM: edges or events between all pairs of nodes are conditionally independent given the block or community memberships, which prevents them from reproducing higher-order motifs such as triangles that are commonly observed in real networks. We propose the multivariate community Hawkes (MULCH) model, an extremely flexible community-based model for continuous-time networks that introduces dependence between node pairs using structured multivariate Hawkes processes. We fit the model using a spectral clustering and likelihood-based local refinement procedure. We find that our proposed MULCH model is far more accurate than existing models both for predictive and generative tasks.
- Africa > Middle East > Libya (0.04)
- Asia > Middle East > UAE (0.04)
- North America > United States > Ohio > Lucas County > Toledo (0.04)
- North America > United States > Ohio > Franklin County > Columbus (0.04)
- Information Technology (0.66)
- Government (0.46)
Nondiagonal Mixture of Dirichlet Network Distributions for Analyzing a Stock Ownership Network
Zhang, Wenning, Hisano, Ryohei, Ohnishi, Takaaki, Mizuno, Takayuki
Block modeling is widely used in studies on complex networks. The cornerstone model is the stochastic block model (SBM), widely used over the past decades. However, the SBM is limited in analyzing complex networks as the model is, in essence, a random graph model that cannot reproduce the basic properties of many complex networks, such as sparsity and heavy-tailed degree distribution. In this paper, we provide an edge exchangeable block model that incorporates such basic features and simultaneously infers the latent block structure of a given complex network. Our model is a Bayesian nonparametric model that flexibly estimates the number of blocks and takes into account the possibility of unseen nodes. Using one synthetic dataset and one real-world stock ownership dataset, we show that our model outperforms state-of-the-art SBMs for held-out link prediction tasks.
- North America > Mexico > Gulf of Mexico (0.29)
- North America > United States > Texas (0.04)
- North America > United States > New Mexico (0.04)
- (7 more...)
Consistent Community Detection in Continuous-Time Networks of Relational Events
Arastuie, Makan, Paul, Subhadeep, Xu, Kevin S.
In many application settings involving networks, such as messages between users of an on-line social network or transactions between traders in financial markets, the observed data are in the form of relational events with timestamps, which form a continuous-time network. We propose the Community Hawkes Independent Pairs (CHIP) model for community detection on such timestamped relational event data. We demonstrate that applying spectral clustering to adjacency matrices constructed from relational events generated by the CHIP model provides consistent community detection for a growing number of nodes. In particular, we obtain explicit non-asymptotic upper bounds on the misclustering rates based on the separation conditions required on the parameters of the model for consistent community detection. We also develop consistent and computationally efficient estimators for the parameters of the model. We demonstrate that our proposed CHIP model and estimation procedure scales to large networks with tens of thousands of nodes and provides superior fits compared to existing continuous-time network models on several real networks.
- North America > United States > Ohio > Lucas County > Toledo (0.04)
- Asia > Middle East > Jordan (0.04)
- North America > United States > Ohio > Franklin County > Columbus (0.04)
- North America > United States > Massachusetts > Suffolk County > Boston (0.04)
- Banking & Finance (0.86)
- Information Technology > Services (0.34)
- Information Technology > Data Science > Data Mining (1.00)
- Information Technology > Communications > Social Media (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.46)