dense subgraph
The Lovász ϑ function, SVMs and finding large dense subgraphs
The Lovász ϑ function of a graph, a fundamental tool in combinatorial optimization and approximation algorithms, is computed by solving a SDP. In this paper we establish that the Lovász ϑ function is equivalent to a kernel learning problem related to one class SVM. This interesting connection opens up many opportunities bridging graph theoretic algorithms and machine learning. We show that there exist graphs, which we call SVM ϑ graphs, on which the Lovász ϑ function can be approximated well by a one-class SVM. This leads to novel use of SVM techniques for solving algorithmic problems in large graphs e.g.
The Lovász ϑ function, SVMs and finding large dense subgraphs
The Lovasz \theta function of a graph, is a fundamental tool in combinatorial optimization and approximation algorithms. Computing \theta involves solving a SDP and is extremely expensive even for moderately sized graphs. In this paper we establish that the Lovasz \theta function is equivalent to a kernel learning problem related to one class SVM. This interesting connection opens up many opportunities bridging graph theoretic algorithms and machine learning. We show that there exist graphs, which we call SVM-\theta graphs, on which the Lovasz \theta function can be approximated well by a one-class SVM.
Faster Algorithms for Generalized Mean Densest Subgraph Problem
Fan, Chenglin, Li, Ping, Peng, Hanyu
The densest subgraph of a large graph usually refers to some subgraph with the highest average degree, which has been extended to the family of $p$-means dense subgraph objectives by~\citet{veldt2021generalized}. The $p$-mean densest subgraph problem seeks a subgraph with the highest average $p$-th-power degree, whereas the standard densest subgraph problem seeks a subgraph with a simple highest average degree. It was shown that the standard peeling algorithm can perform arbitrarily poorly on generalized objective when $p>1$ but uncertain when $0
Detection of Dense Subhypergraphs by Low-Degree Polynomials
Dhawan, Abhishek, Mao, Cheng, Wein, Alexander S.
Detection of a planted dense subgraph in a random graph is a fundamental statistical and computational problem that has been extensively studied in recent years. We study a hypergraph version of the problem. Let $G^r(n,p)$ denote the $r$-uniform Erd\H{o}s-R\'enyi hypergraph model with $n$ vertices and edge density $p$. We consider detecting the presence of a planted $G^r(n^\gamma, n^{-\alpha})$ subhypergraph in a $G^r(n, n^{-\beta})$ hypergraph, where $0< \alpha < \beta < r-1$ and $0 < \gamma < 1$. Focusing on tests that are degree-$n^{o(1)}$ polynomials of the entries of the adjacency tensor, we determine the threshold between the easy and hard regimes for the detection problem. More precisely, for $0 < \gamma < 1/2$, the threshold is given by $\alpha = \beta \gamma$, and for $1/2 \le \gamma < 1$, the threshold is given by $\alpha = \beta/2 + r(\gamma - 1/2)$. Our results are already new in the graph case $r=2$, as we consider the subtle log-density regime where hardness based on average-case reductions is not known. Our proof of low-degree hardness is based on a conditional variant of the standard low-degree likelihood calculation.
A Survey on the Densest Subgraph Problem and its Variants
Lanciano, Tommaso, Miyauchi, Atsushi, Fazzone, Adriano, Bonchi, Francesco
The Densest Subgraph Problem requires to find, in a given graph, a subset of vertices whose induced subgraph maximizes a measure of density. The problem has received a great deal of attention in the algorithmic literature over the last five decades, with many variants proposed and many applications built on top of this basic definition. Recent years have witnessed a revival of research interest on this problem with several interesting contributions, including some groundbreaking results, published in 2022 and 2023. This survey provides a deep overview of the fundamental results and an exhaustive coverage of the many variants proposed in the literature, with a special attention on the most recent results. The survey also presents a comprehensive overview of applications and discusses some interesting open problems for this evergreen research topic.
Spectral Algorithms Optimally Recover Planted Sub-structures
Dhara, Souvik, Gaudio, Julia, Mossel, Elchanan, Sandon, Colin
Spectral algorithms are an important building block in machine learning and graph algorithms. We are interested in studying when such algorithms can be applied directly to provide optimal solutions to inference tasks. Previous works by Abbe, Fan, Wang and Zhong (2020) and by Dhara, Gaudio, Mossel and Sandon (2022) showed the optimality for community detection in the Stochastic Block Model (SBM), as well as in a censored variant of the SBM. Here we show that this optimality is somewhat universal as it carries over to other planted substructures such as the planted dense subgraph problem and submatrix localization problem, as well as to a censored version of the planted dense subgraph problem.
A practical test for a planted community in heterogeneous networks
One of the fundamental task in graph data mining is to find a planted community(dense subgraph), which has wide application in biology, finance, spam detection and so on. For a real network data, the existence of a dense subgraph is generally unknown. Statistical tests have been devised to testing the existence of dense subgraph in a homogeneous random graph. However, many networks present extreme heterogeneity, that is, the degrees of nodes or vertexes don't concentrate on a typical value. The existing tests designed for homogeneous random graph are not straightforwardly applicable to the heterogeneous case. Recently, scan test was proposed for detecting a dense subgraph in heterogeneous(inhomogeneous) graph(\cite{BCHV19}). However, the computational complexity of the scan test is generally not polynomial in the graph size, which makes the test impractical for large or moderate networks. In this paper, we propose a polynomial-time test that has the standard normal distribution as the null limiting distribution. The power of the test is theoretically investigated and we evaluate the performance of the test by simulation and real data example.
Heuristic Semi-Supervised Learning for Graph Generation Inspired by Electoral College
Li, Chen, Peng, Xutan, Peng, Hao, Li, Jianxin, Wang, Lihong, Yu, Philip S., He, Lifang
Recently, graph-based algorithms have drawn much attention because of their impressive success in semi-supervised setups. For better model performance, previous studies learn to transform the topology of the input graph. However, these works only focus on optimizing the original nodes and edges, leaving the direction of augmenting existing data unexplored. In this paper, by simulating the generation process of graph signals, we propose a novel heuristic pre-processing technique, namely ELectoral COllege (ELCO), which automatically expands new nodes and edges to refine the label similarity within a dense subgraph. Substantially enlarging the original training set with high-quality generated labeled data, our framework can effectively benefit downstream models. To justify the generality and practicality of ELCO, we couple it with the popular Graph Convolution Network and Graph Attention Network to perform extensive evaluations on three standard datasets. In all setups tested, our method boosts the average score of base models by a large margin of 4.7 points, as well as consistently outperforms the state-of-the-art. We release our code and data on https://github.com/RingBDStack/ELCO to guarantee reproducibility.
The Lovász ϑ function, SVMs and finding large dense subgraphs
Jethava, Vinay, Martinsson, Anders, Bhattacharyya, Chiranjib, Dubhashi, Devdatt
The Lovasz $\theta$ function of a graph, is a fundamental tool in combinatorial optimization and approximation algorithms. Computing $\theta$ involves solving a SDP and is extremely expensive even for moderately sized graphs. In this paper we establish that the Lovasz $\theta$ function is equivalent to a kernel learning problem related to one class SVM. This interesting connection opens up many opportunities bridging graph theoretic algorithms and machine learning. We show that there exist graphs, which we call $SVM-\theta$ graphs, on which the Lovasz $\theta$ function can be approximated well by a one-class SVM.
Computational Lower Bounds for Community Detection on Random Graphs
Hajek, Bruce, Wu, Yihong, Xu, Jiaming
This paper studies the problem of detecting the presence of a small dense community planted in a large Erd\H{o}s-R\'enyi random graph $\mathcal{G}(N,q)$, where the edge probability within the community exceeds $q$ by a constant factor. Assuming the hardness of the planted clique detection problem, we show that the computational complexity of detecting the community exhibits the following phase transition phenomenon: As the graph size $N$ grows and the graph becomes sparser according to $q=N^{-\alpha}$, there exists a critical value of $\alpha = \frac{2}{3}$, below which there exists a computationally intensive procedure that can detect far smaller communities than any computationally efficient procedure, and above which a linear-time procedure is statistically optimal. The results also lead to the average-case hardness results for recovering the dense community and approximating the densest $K$-subgraph.