cheeger
Beyond Spectral Clustering - Tight Relaxations of Balanced Graph Cuts
Spectral clustering is based on the spectral relaxation of the normalized/ratio graph cut criterion. While the spectral relaxation is known to be loose, it has been shown recently that a non-linear eigenproblem yields a tight relaxation of the Cheeger cut. In this paper, we extend this result considerably by providing a characterization of all balanced graph cuts which allow for a tight relaxation. Although the resulting optimization problems are non-convex and non-smooth, we provide an efficient first-order scheme which scales to large graphs. Moreover, our approach comes with the quality guarantee that given any partition as initialization the algorithm either outputs a better partition or it stops immediately.
- Europe > Germany > Saarland > Saarbrücken (0.04)
- North America > United States > Rhode Island > Providence County > Providence (0.04)
- Europe > Netherlands > North Holland > Amsterdam (0.04)
- Asia > Japan > Honshū > Chūbu > Nagano Prefecture > Nagano (0.04)
A Simple Proof of the Mixing of Metropolis-Adjusted Langevin Algorithm under Smoothness and Isoperimetry
Chen, Yuansi, Gatmiry, Khashayar
We study the mixing time of Metropolis-Adjusted Langevin algorithm (MALA) for sampling a target density on $\mathbb{R}^d$. We assume that the target density satisfies $\psi_\mu$-isoperimetry and that the operator norm and trace of its Hessian are bounded by $L$ and $\Upsilon$ respectively. Our main result establishes that, from a warm start, to achieve $\epsilon$-total variation distance to the target density, MALA mixes in $O\left(\frac{(L\Upsilon)^{\frac12}}{\psi_\mu^2} \log\left(\frac{1}{\epsilon}\right)\right)$ iterations. Notably, this result holds beyond the log-concave sampling setting and the mixing time depends on only $\Upsilon$ rather than its upper bound $L d$. In the $m$-strongly logconcave and $L$-log-smooth sampling setting, our bound recovers the previous minimax mixing bound of MALA~\cite{wu2021minimax}.
Hypergraphs with Edge-Dependent Vertex Weights: Spectral Clustering based on the 1-Laplacian
Zhu, Yu, Li, Boning, Segarra, Santiago
We propose a flexible framework for defining the 1-Laplacian of a hypergraph that incorporates edge-dependent vertex weights. These weights are able to reflect varying importance of vertices within a hyperedge, thus conferring the hypergraph model higher expressivity than homogeneous hypergraphs. We then utilize the eigenvector associated with the second smallest eigenvalue of the hypergraph 1-Laplacian to cluster the vertices. From a theoretical standpoint based on an adequately defined normalized Cheeger cut, this procedure is expected to achieve higher clustering accuracy than that based on the traditional Laplacian. Indeed, we confirm that this is the case using real-world datasets to demonstrate the effectiveness of the proposed spectral clustering approach. Moreover, we show that for a special case within our framework, the corresponding hypergraph 1-Laplacian is equivalent to the 1-Laplacian of a related graph, whose eigenvectors can be computed more efficiently, facilitating the adoption on larger datasets.
- South America > Chile > Santiago Metropolitan Region > Santiago Province > Santiago (0.05)
- North America > United States (0.04)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
From graph cuts to isoperimetric inequalities: Convergence rates of Cheeger cuts on data clouds
Trillos, Nicolas Garcia, Murray, Ryan, Thorpe, Matthew
In this work we study statistical properties of graph-based clustering algorithms that rely on the optimization of balanced graph cuts, the main example being the optimization of Cheeger cuts. We consider proximity graphs built from data sampled from an underlying distribution supported on a generic smooth compact manifold $M$. In this setting, we obtain high probability convergence rates for both the Cheeger constant and the associated Cheeger cuts towards their continuum counterparts. The key technical tools are careful estimates of interpolation operators which lift empirical Cheeger cuts to the continuum, as well as continuum stability estimates for isoperimetric problems. To our knowledge the quantitative estimates obtained here are the first of their kind.
- North America > United States > Wisconsin > Dane County > Madison (0.14)
- North America > United States > Massachusetts > Suffolk County > Boston (0.04)
- North America > United States > Indiana (0.04)
- (9 more...)
- Instructional Material > Course Syllabus & Notes (0.67)
- Research Report (0.63)
A Notion of Harmonic Clustering in Simplicial Complexes
Ebli, Stefania, Spreemann, Gard
We outline a novel clustering scheme for simplicial complexes that produces clusters of simplices in a way that is sensitive to the homology of the complex. The method is inspired by, and can be seen as a higher-dimensional version of, graph spectral clustering. The algorithm involves only sparse eigenproblems, and is therefore computationally efficient. We believe that it has broad application as a way to extract features from simplicial complexes that often arise in topological data analysis.
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- Europe > Switzerland > Vaud > Lausanne (0.04)
- North America > United States > California (0.04)
- (2 more...)
State Compression of Markov Processes via Empirical Low-Rank Estimation
Dimension reduction is a central problem in system engineering and data science. In scientific studies or engineering applications, one often needs to interact with unknown complex systems about which many noisy observations of system characteristics and system trajectories are available. The exact structures and dynamics of the system are typically masked by massive observations of noisy variables, many of which might not be relevant to the physical state of the system. It is often unclear how to describe the "state" of a system, when one can only access noisy observations. One may view each unique observation as a single state, however, this would generate a huge-or even infinite-dimensional process which is difficult to model or analyze. Although there exists a vast body of literatures on time series analysis [18], they typically require knowledge of specific models and might perform poorly when the models are misspecified. Anru Zhang is Assistant Professor, Department of Statistics, University of Wisconsin-Madison, Madison, WI 53706, Email: anruzhang@stat.wisc.edu; Mengdi Wang is Assistant Professor, Department of Operations Research and Financial Engineering, Princeton University, Princeton, NJ 08544, Email: mengdiw@princeton.edu.
- North America > United States > Wisconsin > Dane County > Madison (0.54)
- North America > United States > New Jersey > Mercer County > Princeton (0.24)
- Asia > Middle East > Jordan (0.04)
- North America > United States > Massachusetts > Middlesex County > Belmont (0.04)
Improved Cheeger's Inequality: Analysis of Spectral Partitioning Algorithms through Higher Order Spectral Gap
Kwok, Tsz Chiu, Lau, Lap Chi, Lee, Yin Tat, Gharan, Shayan Oveis, Trevisan, Luca
Let \phi(G) be the minimum conductance of an undirected graph G, and let 0=\lambda_1 <= \lambda_2 <=... <= \lambda_n <= 2 be the eigenvalues of the normalized Laplacian matrix of G. We prove that for any graph G and any k >= 2, \phi(G) = O(k) \lambda_2 / \sqrt{\lambda_k}, and this performance guarantee is achieved by the spectral partitioning algorithm. This improves Cheeger's inequality, and the bound is optimal up to a constant factor for any k. Our result shows that the spectral partitioning algorithm is a constant factor approximation algorithm for finding a sparse cut if \lambda_k$ is a constant for some constant k. This provides some theoretical justification to its empirical performance in image segmentation and clustering problems. We extend the analysis to other graph partitioning problems, including multi-way partition, balanced separator, and maximum cut.
- Asia > China > Hong Kong (0.04)
- North America > United States > California > Santa Clara County > Palo Alto (0.04)
- North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
- (3 more...)
Beyond Spectral Clustering - Tight Relaxations of Balanced Graph Cuts
Spectral clustering is based on the spectral relaxation of the normalized/ratio graph cut criterion. While the spectral relaxation is known to be loose, it has been shown recently that a non-linear eigenproblem yields a tight relaxation of the Cheeger cut. In this paper, we extend this result considerably by providing a characterization of all balanced graph cuts which allow for a tight relaxation. Although the resulting optimization problems are non-convex and non-smooth, we provide an efficient first-order scheme which scales to large graphs. Moreover, our approach comes with the quality guarantee that given any partition as initialization the algorithm either outputs a better partition or it stops immediately.
- Europe > Germany > Saarland > Saarbrücken (0.04)
- North America > United States > Rhode Island > Providence County > Providence (0.04)
- Europe > Netherlands > North Holland > Amsterdam (0.04)
- Asia > Japan > Honshū > Chūbu > Nagano Prefecture > Nagano (0.04)
An Inverse Power Method for Nonlinear Eigenproblems with Applications in 1-Spectral Clustering and Sparse PCA
Hein, Matthias, Bühler, Thomas
Many problems in machine learning and statistics can be formulated as (generalized) eigenproblems. In terms of the associated optimization problem, computing linear eigenvectors amounts to finding critical points of a quadratic function subject to quadratic constraints. In this paper we show that a certain class of constrained optimization problems with nonquadratic objective and constraints can be understood as nonlinear eigenproblems. We derive a generalization of the inverse power method which is guaranteed to converge to a nonlinear eigenvector. We apply the inverse power method to 1-spectral clustering and sparse PCA which can naturally be formulated as nonlinear eigenproblems. In both applications we achieve state-of-the-art results in terms of solution quality and runtime. Moving beyond the standard eigenproblem should be useful also in many other applications and our inverse power method can be easily adapted to new problems.
- North America > United States > New York (0.04)
- Europe > Germany > Saarland > Saarbrücken (0.04)
An Inverse Power Method for Nonlinear Eigenproblems with Applications in 1-Spectral Clustering and Sparse PCA
Hein, Matthias, Bühler, Thomas
Many problems in machine learning and statistics can be formulated as (generalized) eigenproblems. In terms of the associated optimization problem, computing linear eigenvectors amounts to finding critical points of a quadratic function subject to quadratic constraints. In this paper we show that a certain class of constrained optimization problems with nonquadratic objective and constraints can be understood as nonlinear eigenproblems. We derive a generalization of the inverse power method which is guaranteed to converge to a nonlinear eigenvector. We apply the inverse power method to 1-spectral clustering and sparse PCA which can naturally be formulated as nonlinear eigenproblems. In both applications we achieve state-of-the-art results in terms of solution quality and runtime. Moving beyond the standard eigenproblem should be useful also in many other applications and our inverse power method can be easily adapted to new problems.
- North America > United States > New York (0.04)
- Europe > Germany > Saarland > Saarbrücken (0.04)