AITopics | dcsbm

Collaborating Authors

dcsbm

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

ca9541826e97c4530b07dda2eba0e013-Paper.pdf

Neural Information Processing SystemsFeb-11-2026, 04:31:11 GMT

graph, node, proceedings, (14 more...)

Neural Information Processing Systems

Country:

North America > United States > New York > New York County > New York City (0.14)
North America > United States > Indiana (0.04)
Europe > Italy > Tuscany > Florence (0.04)
(7 more...)

Genre: Research Report > New Finding (0.46)

Industry: Information Technology (0.68)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Communications (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
(3 more...)

Add feedback

ca9541826e97c4530b07dda2eba0e013-Paper.pdf

Neural Information Processing SystemsAug-17-2025, 09:17:38 GMT

data mining, machine learning, natural language, (20 more...)

Neural Information Processing Systems

Country:

North America > United States > New York > New York County > New York City (0.14)
North America > United States > Indiana (0.04)
Europe > Italy > Tuscany > Florence (0.04)
(7 more...)

Genre: Research Report > New Finding (0.46)

Industry: Information Technology (0.68)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Communications (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
(3 more...)

Add feedback

A Kernelised Stein Discrepancy for Assessing the Fit of Inhomogeneous Random Graph Models

Fatima, Anum, Reinert, Gesine

arXiv.org Machine LearningMay-29-2025

Complex data are often represented as a graph, which in turn can often be viewed as a realisation of a random graph, such as of an inhomogeneous random graph model (IRG). For general fast goodness-of-fit tests in high dimensions, kernelised Stein discrepancy (KSD) tests are a powerful tool. Here, we develop, test, and analyse a KSD-type goodness-of-fit test for IRG models that can be carried out with a single observation of the network. The test is applicable to a network of any size and does not depend on the asymptotic distribution of the test statistic. We also provide theoretical guarantees.

artificial intelligence, kernel, machine learning, (16 more...)

arXiv.org Machine Learning

2505.2158

Country:

Europe > United Kingdom > England > Oxfordshire > Oxford (0.28)
Asia > Pakistan > Punjab > Lahore Division > Lahore (0.04)
Asia > Japan > Honshū > Kantō > Kanagawa Prefecture (0.04)
(6 more...)

Genre:

Research Report > Experimental Study (0.68)
Research Report > New Finding (0.46)

Industry:

Law (1.00)
Health & Medicine (1.00)
Education (0.70)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Mathematical & Statistical Methods (0.81)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models (0.68)

Add feedback

Selecting the Number of Communities for Weighted Degree-Corrected Stochastic Block Models

Liu, Yucheng, Li, Xiaodong

arXiv.org Machine LearningJun-7-2024

We investigate how to select the number of communities for weighted networks without a full likelihood modeling. First, we propose a novel weighted degree-corrected stochastic block model (DCSBM), in which the mean adjacency matrix is modeled as the same as in standard DCSBM, while the variance profile matrix is assumed to be related to the mean adjacency matrix through a given variance function. Our method of selection the number of communities is based on a sequential testing framework, in each step the weighed DCSBM is fitted via some spectral clustering method. A key step is to carry out matrix scaling on the estimated variance profile matrix. The resulting scaling factors can be used to normalize the adjacency matrix, from which the testing statistic is obtained. Under mild conditions on the weighted DCSBM, our proposed procedure is shown to be consistent in estimating the true number of communities. Numerical experiments on both simulated and real network data also demonstrate the desirable empirical properties of our method.

adjacency matrix, assumption 1, matrix, (14 more...)

arXiv.org Machine Learning

2406.0534

Country:

North America > United States > New York > New York County > New York City (0.14)
North America > United States > California > Yolo County > Davis (0.14)
Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre: Research Report (1.00)

Industry: Information Technology (0.34)

Technology:

Information Technology > Communications (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.34)

Add feedback

Systematic assessment of the quality of fit of the stochastic block model for empirical networks

Vaca-Ramírez, Felipe, Peixoto, Tiago P.

arXiv.org Machine LearningJan-5-2022

We perform a systematic analysis of the quality of fit of the stochastic block model (SBM) for 275 empirical networks spanning a wide range of domains and orders of size magnitude. We employ posterior predictive model checking as a criterion to assess the quality of fit, which involves comparing networks generated by the inferred model with the empirical network, according to a set of network descriptors. We observe that the SBM is capable of providing an accurate description for the majority of networks considered, but falls short of saturating all modeling requirements. In particular, networks possessing a large diameter and slow-mixing random walks tend to be badly described by the SBM. However, contrary to what is often assumed, networks with a high abundance of triangles can be well described by the SBM in many cases. We demonstrate that simple network descriptors can be used to evaluate whether or not the SBM can provide a sufficiently accurate representation, potentially pointing to possible model extensions that can systematically improve the expressiveness of this class of models.

bipartite network, dcsbm, descriptor, (14 more...)

arXiv.org Machine Learning

2201.01658

Country:

Europe > Austria > Vienna (0.14)
North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
Oceania > New Zealand (0.04)
(26 more...)

Genre: Research Report > New Finding (0.46)

Industry:

Leisure & Entertainment (1.00)
Law (1.00)
Information Technology (1.00)
(6 more...)

Technology:

Information Technology > Information Management (1.00)
Information Technology > Data Science > Data Mining (1.00)
Information Technology > Communications > Social Media (1.00)
(5 more...)

Add feedback

Adjusted chi-square test for degree-corrected block models

Zhang, Linfan, Amini, Arash A.

arXiv.org Machine LearningDec-30-2020

We propose a goodness-of-fit test for degree-corrected stochastic block models (DCSBM). The test is based on an adjusted chi-square statistic for measuring equality of means among groups of $n$ multinomial distributions with $d_1,\dots,d_n$ observations. In the context of network models, the number of multinomials, $n$, grows much faster than the number of observations, $d_i$, hence the setting deviates from classical asymptotics. We show that a simple adjustment allows the statistic to converge in distribution, under null, as long as the harmonic mean of $\{d_i\}$ grows to infinity. This result applies to large sparse networks where the role of $d_i$ is played by the degree of node $i$. Our distributional results are nonasymptotic, with explicit constants, providing finite-sample bounds on the Kolmogorov-Smirnov distance to the target distribution. When applied sequentially, the test can also be used to determine the number of communities. The test operates on a (row) compressed version of the adjacency matrix, conditional on the degrees, and as a result is highly scalable to large sparse networks. We incorporate a novel idea of compressing the columns based on a $(K+1)$-community assignment when testing for $K$ communities. This approach increases the power in sequential applications without sacrificing computational efficiency, and we prove its consistency in recovering the number of communities. Since the test statistic does not rely on a specific alternative, its utility goes beyond sequential testing and can be used to simultaneously test against a wide range of alternatives outside the DCSBM family. We show the effectiveness of the approach by extensive numerical experiments with simulated and real data. In particular, applying the test to the Facebook-100 dataset, we find that a DCSBM with a small number of communities is far from a good fit in almost all cases.

dcsbm, matrix, statistic, (13 more...)

arXiv.org Machine Learning

2012.15047

Country:

North America > United States > Maryland (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
North America > United States > Illinois > Cook County > Chicago (0.04)
(2 more...)

Genre:

Research Report > New Finding (0.45)
Research Report > Promising Solution (0.34)

Industry: Government > Regional Government (0.45)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.94)
Information Technology > Communications > Networks (0.88)
(4 more...)

Add feedback

An improved spectral clustering method for community detection under the degree-corrected stochastic blockmodel

Qing, Huan, Wang, Jingli

arXiv.org Machine LearningNov-12-2020

To solve the community detection problem, substantial approaches, such as Snijders and Nowicki (1997); Nowicki and Snijders (2001); Daudin et al. (2008); Bickel and Chen (2009); Rohe et al. (2011); Amini et al. (2013), are designed based on the standard framework, the stochastic block model (SBM) (Holland et al. (1983)), since it is mathematically simple and relatively easy to analyze (Bickel and Chen (2009)). However, the assumptions of SBM are too restrictive to implement in real networks. It is assumed that the distribution of degrees within the community is Poisson, that is, the nodes within each community have the same expected degrees. Unfortunately, in many natural networks, the degrees follow approximately a power-law distribution (Kolaczyk (2009); Goldenberg et al. (2010); Jin(2015)). The corrected-degree stochastic block model (DCSBM) (Karrer and Newman (2011)) is developed based on the power-law distribution which allows the degree of nodes varies among different communities.

eigenvalue, experiment 2, weak signal network, (14 more...)

arXiv.org Machine Learning

2011.06374

Country:

North America > United States (0.28)
Asia > China (0.04)

Genre: Research Report (0.50)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.83)

Add feedback

Dual regularized Laplacian spectral clustering methods on community detection

Qing, Huan, Wang, Jingli

arXiv.org Machine LearningNov-9-2020

Spectral clustering methods are widely used for detecting clusters in networks for community detection, while a small change on the graph Laplacian matrix could bring a dramatic improvement. In this paper, we propose a dual regularized graph Laplacian matrix and then employ it to three classical spectral clustering approaches under the degree-corrected stochastic block model. If the number of communities is known as $K$, we consider more than $K$ leading eigenvectors and weight them by their corresponding eigenvalues in the spectral clustering procedure to improve the performance. Three improved spectral clustering methods are dual regularized spectral clustering (DRSC) method, dual regularized spectral clustering on Ratios-of-eigenvectors (DRSCORE) method, and dual regularized symmetrized Laplacian inverse matrix (DRSLIM) method. Theoretical analysis of DRSC and DRSLIM show that under mild conditions DRSC and DRSLIM yield stable consistent community detection, moreover, DRSCORE returns perfect clustering under the ideal case. We compare the performances of DRSC, DRSCORE and DRSLIM with several spectral methods by substantial simulated networks and eight real-world networks.

drslim, matrix, spectral, (16 more...)

arXiv.org Machine Learning

2011.04392

Country:

North America > United States (0.67)
Asia > Middle East > Jordan (0.04)
Asia > China (0.04)

Genre: Research Report (0.64)

Industry:

Government > Regional Government > North America Government > United States Government (0.46)
Leisure & Entertainment > Sports > Football (0.45)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (1.00)

Add feedback

Community Detection by Principal Components Clustering Methods

Qing, Huan, Wang, Jingli

arXiv.org Machine LearningNov-9-2020

Based on the classical Degree Corrected Stochastic Blockmodel (DCSBM) model for network community detection problem, we propose two novel approaches: principal component clustering (PCC) and normalized principal component clustering (NPCC). Without any parameters to be estimated, the PCC method is simple to be implemented. Under mild conditions, we show that PCC yields consistent community detection. NPCC is designed based on the combination of the PCC and the RSC method (Qin & Rohe 2013). Population analysis for NPCC shows that NPCC returns perfect clustering for the ideal case under DCSBM. PCC and NPCC is illustrated through synthetic and real-world datasets. Numerical results show that NPCC provides a significant improvement compare with PCC and RSC. Moreover, NPCC inherits nice properties of PCC and RSC such that NPCC is insensitive to the number of eigenvectors to be clustered and the choosing of the tuning parameter. When dealing with two weak signal networks Simmons and Caltech, by considering one more eigenvectors for clustering, we provide two refinements PCC+ and NPCC+ of PCC and NPCC, respectively. Both two refinements algorithms provide improvement performances compared with their original algorithms. Especially, NPCC+ provides satisfactory performances on Simmons and Caltech, with error rates of 121/1137 and 96/590, respectively.

eigenvector, matrix, npcc, (15 more...)

arXiv.org Machine Learning

2011.04377

Country:

North America > United States (0.46)
Asia > China (0.04)
Asia > Singapore (0.04)
Asia > Middle East > Jordan (0.04)

Genre: Research Report > New Finding (0.34)

Industry: Government (0.46)

Technology:

Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.65)

Add feedback

Edge Correlations in Multilayer Networks

Pamfil, A. Roxana, Howison, Sam D., Porter, Mason A.

arXiv.org Machine LearningAug-11-2019

Many recent developments in network analysis have focused on multilayer networks, which one can use to encode time-dependent interactions, multiple types of interactions, and other complications that arise in complex systems. Like their monolayer counterparts, multilayer networks in applications often have mesoscale features, such as community structure. A prominent type of method for inferring such structures is the employment of multilayer stochastic block models (SBMs). A common (but inadequate) assumption of these models is the sampling of edges in different layers independently, conditioned on community labels of the nodes. In this paper, we relax this assumption of independence by incorporating edge correlations into an SBM-like model. We derive maximum-likelihood estimates of the key parameters of our model, and we propose a measure of layer correlation that reflects the similarity between connectivity patterns in different layers. Finally, we explain how to use correlated models for edge prediction in multilayer networks. By taking into account edge correlations, prediction accuracy improves both in synthetic networks and in a temporal network of shoppers who are connected to previously-purchased grocery products.

artificial intelligence, correlation, machine learning, (20 more...)

arXiv.org Machine Learning

1908.03875

Country:

Europe > United Kingdom > England > Oxfordshire > Oxford (0.14)
North America > United States > California > Los Angeles County > Los Angeles (0.14)
Europe > Middle East > Cyprus > Nicosia > Nicosia (0.04)

Genre: Research Report (0.63)

Industry:

Consumer Products & Services (0.93)
Health & Medicine (0.68)
Transportation > Air (0.68)

Technology:

Information Technology > Data Science (1.00)
Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)

Add feedback