Goto

Collaborating Authors

 bethe hessian


Robust Spectral Detection of Global Structures in the Data by Learning a Regularization

Neural Information Processing Systems

Spectral methods are popular in detecting global structures in the given data that can be represented as a matrix. However when the data matrix is sparse or noisy, classic spectral methods usually fail to work, due to localization of eigenvectors (or singular vectors) induced by the sparsity or noise. In this work, we propose a general method to solve the localization problem by learning a regularization matrix from the localized eigenvectors. Using matrix perturbation analysis, we demonstrate that the learned regularizations suppress down the eigenvalues associated with localized eigenvectors and enable us to recover the informative eigenvectors representing the global structure. We show applications of our method in several inference problems: community detection in networks, clustering from pairwise similarities, rank estimation and matrix completion problems. Using extensive experiments, we illustrate that our method solves the localization problem and works down to the theoretical detectability limits in different kinds of synthetic data. This is in contrast with existing spectral algorithms based on data matrix, non-backtracking matrix, Laplacians and those with rank-one regularizations, which perform poorly in the sparse case with noise.


Export Reviews, Discussions, Author Feedback and Meta-Reviews

Neural Information Processing Systems

First provide a summary of the paper, and then address the following criteria: Quality, clarity, originality and significance. This paper presents a spectral clustering algorithm for the sparse degree regime (degree=\theta(1)) of the stochastic blockmodel. The authors propose an alternative to the non-backtracking operator and point out that this has similar properties as the non-backtracking operator, which has been shown to be useful in the sparse regime. But the proposed data matrix is smaller than the non backtracking operator, and symmetric, therefore making eigenvalue computation easier and more accurate. The sparse regime is indeed the hardest in terms of showing concentration of empirical eigenvalues, or performance of a clustering algorithm in the stochastic blockmodel.


Matrix Completion from Fewer Entries: Spectral Detectability and Rank Estimation

Neural Information Processing Systems

The completion of low rank matrices from few entries is a task with many practical applications. We consider here two aspects of this problem: detectability, i.e. the ability to estimate the rank r reliably from the fewest possible random entries, and performance in achieving small reconstruction error. We propose a spectral algorithm for these two tasks called MaCBetH (for Matrix Completion with the Bethe Hessian). The rank is estimated as the number of negative eigenvalues of the Bethe Hessian matrix, and the corresponding eigenvectors are used as initial condition for the minimization of the discrepancy between the estimated matrix and the revealed entries. We analyze the performance in a random matrix setting using results from the statistical mechanics of the Hopfield neural network, and show in particular that MaCBetH efficiently detects the rank r of a large n m matrix from C (r)r nm entries, where C ( r) is a constant close to 1. We also evaluate the corresponding root-mean-square error empirically and show that MaCBetH compares favorably to other existing approaches. Matrix completion is the task of inferring the missing entries of a matrix given a subset of known entries. Typically, this is possible because the matrix to be completed has (at least approximately) low rank r . This problem has witnessed a burst of activity, see e.g.


Spectral Clustering of graphs with the Bethe Hessian

Neural Information Processing Systems

Spectral clustering is a standard approach to label nodes on a graph by studying the (largest or lowest) eigenvalues of a symmetric real matrix such as e.g. the adjacency or the Laplacian. Recently, it has been argued that using instead a more complicated, non-symmetric and higher dimensional operator, related to the non-backtracking walk on the graph, leads to improved performance in detecting clusters, and even to optimal performance for the stochastic block model. Here, we propose to use instead a simpler object, a symmetric real matrix known as the Bethe Hessian operator, or deformed Laplacian. We show that this approach combines the performances of the non-backtracking operator, thus detecting clusters all the way down to the theoretical limit in the stochastic block model, with the computational, theoretical and memory advantages of real symmetric matrices. Clustering a graph into groups or functional modules (sometimes called communities) is a central task in many fields ranging from machine learning to biology. A common benchmark for this problem is to consider graphs generated by the stochastic block model (SBM) [7, 22].


Learning unbelievable probabilities Yashar Ahmadian Department of Brain and Cognitive Science Center for Theoretical Neuroscience University of Rochester Columbia University Rochester, NY14607

Neural Information Processing Systems

Loopy belief propagation performs approximate inference on graphical models with loops. One might hope to compensate for the approximation by adjusting model parameters. Learning algorithms for this purpose have been explored previously, and the claim has been made that every set of locally consistent marginals can arise from belief propagation run on a graphical model. On the contrary, here we show that many probability distributions have marginals that cannot be reached by belief propagation using any set of model parameters or any learning algorithm.


Spectral Clustering of Graphs with the Bethe Hessian

Neural Information Processing Systems

Spectral clustering is a standard approach to label nodes on a graph by studying the (largest or lowest) eigenvalues of a symmetric real matrix such as e.g. the adjacency or the Laplacian. Recently, it has been argued that using instead a more complicated, non-symmetric and higher dimensional operator, related to the non-backtracking walk on the graph, leads to improved performance in detecting clusters, and even to optimal performance for the stochastic block model. Here, we propose to use instead a simpler object, a symmetric real matrix known as the Bethe Hessian operator, or deformed Laplacian. We show that this approach combines the performances of the non-backtracking operator, thus detecting clusters all the way down to the theoretical limit in the stochastic block model, with the computational, theoretical and memory advantages of real symmetric matrices. Clustering a graph into groups or functional modules (sometimes called communities) is a central task in many fields ranging from machine learning to biology. A common benchmark for this problem is to consider graphs generated by the stochastic block model (SBM) [7, 22].


Matrix Completion from Fewer Entries: Spectral Detectability and Rank Estimation Sorbonne Universités, Université Pierre et Marie Curie Paris 06, F-75005, Paris, France

Neural Information Processing Systems

The completion of low rank matrices from few entries is a task with many practical applications. We consider here two aspects of this problem: detectability, i.e. the ability to estimate the rank r reliably from the fewest possible random entries, and performance in achieving small reconstruction error. We propose a spectral algorithm for these two tasks called MaCBetH (for Matrix Completion with the Bethe Hessian). The rank is estimated as the number of negative eigenvalues of the Bethe Hessian matrix, and the corresponding eigenvectors are used as initial condition for the minimization of the discrepancy between the estimated matrix and the revealed entries. We analyze the performance in a random matrix setting using results from the statistical mechanics of the Hopfield neural network, and show in particular that MaCBetH efficiently detects the rank r of a large n m matrix from C(r)r nm entries, where C(r) is a constant close to 1. We also evaluate the corresponding root-mean-square error empirically and show that MaCBetH compares favorably to other existing approaches. Matrix completion is the task of inferring the missing entries of a matrix given a subset of known entries. Typically, this is possible because the matrix to be completed has (at least approximately) low rank r. This problem has witnessed a burst of activity, see e.g.


Robust Spectral Detection of Global Structures in the Data by Learning a Regularization

Neural Information Processing Systems

Spectral methods are popular in detecting global structures in the given data that can be represented as a matrix. However when the data matrix is sparse or noisy, classic spectral methods usually fail to work, due to localization of eigenvectors (or singular vectors) induced by the sparsity or noise. In this work, we propose a general method to solve the localization problem by learning a regularization matrix from the localized eigenvectors. Using matrix perturbation analysis, we demonstrate that the learned regularizations suppress down the eigenvalues associated with localized eigenvectors and enable us to recover the informative eigenvectors representing the global structure. We show applications of our method in several inference problems: community detection in networks, clustering from pairwise similarities, rank estimation and matrix completion problems. Using extensive experiments, we illustrate that our method solves the localization problem and works down to the theoretical detectability limits in different kinds of synthetic data. This is in contrast with existing spectral algorithms based on data matrix, non-backtracking matrix, Laplacians and those with rank-one regularizations, which perform poorly in the sparse case with noise.


Consistency between ordering and clustering methods for graphs

arXiv.org Artificial Intelligence

A relational dataset is often analyzed by optimally assigning a label to each element through clustering or ordering. While similar characterizations of a dataset would be achieved by both clustering and ordering methods, the former has been studied much more actively than the latter, particularly for the data represented as graphs. This study fills this gap by investigating methodological relationships between several clustering and ordering methods, focusing on spectral techniques. Furthermore, we evaluate the resulting performance of the clustering and ordering methods. To this end, we propose a measure called the label continuity error, which generically quantifies the degree of consistency between a sequence and partition for a set of elements. Based on synthetic and real-world datasets, we evaluate the extents to which an ordering method identifies a module structure and a clustering method identifies a banded structure.


Optimized Deformed Laplacian for Spectrum-based Community Detection in Sparse Heterogeneous Graphs

arXiv.org Machine Learning

Spectral clustering is one of the most popular, yet still incompletely understood, methods for community detection on graphs. In this article we study spectral clustering based on the deformed Laplacian matrix $D-rA$, for sparse heterogeneous graphs (following a two-class degree-corrected stochastic block model). For a specific value $r = \zeta$, we show that, unlike competing methods such as the Bethe Hessian or non-backtracking operator approaches, clustering is insensitive to the graph heterogeneity. Based on heuristic arguments, we study the behavior of the informative eigenvector of $D-\zeta A$ and, as a result, we accurately predict the clustering accuracy. Via extensive simulations and application to real networks, the resulting clustering algorithm is validated and observed to systematically outperform state-of-the-art competing methods.