AITopics | Hein, Matthias

Constrained fractional set programs and their application in local clustering and community detection

Bühler, Thomas, Rangapuram, Syama Sundar, Setzer, Simon, Hein, Matthias

arXiv.org Machine LearningJun-14-2013

The (constrained) minimization of a ratio of set functions is a problem frequently occurring in clustering and community detection. As these optimization problems are typically NP-hard, one uses convex or spectral relaxations in practice. While these relaxations can be solved globally optimally, they are often too loose and thus lead to results far away from the optimum. In this paper we show that every constrained minimization problem of a ratio of non-negative set functions allows a tight relaxation into an unconstrained continuous optimization problem. This result leads to a flexible framework for solving constrained problems in network analysis. While a globally optimal solution for the resulting non-convex problem cannot be guaranteed, we outperform the loose convex or spectral relaxations by a large margin on constrained local clustering problems.

artificial intelligence, constraint, optimization problem, (19 more...)

arXiv.org Machine Learning

1306.3409

Country:

North America > United States (0.28)
Europe > Germany > Saarland (0.14)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.66)

Add feedback

Beyond Spectral Clustering - Tight Relaxations of Balanced Graph Cuts

Hein, Matthias, Setzer, Simon

Neural Information Processing SystemsDec-31-2011

Spectral clustering is based on the spectral relaxation of the normalized/ratio graph cut criterion. While the spectral relaxation is known to be loose, it has been shown recently that a non-linear eigenproblem yields a tight relaxation of the Cheeger cut. In this paper, we extend this result considerably by providing a characterization of all balanced graph cuts which allow for a tight relaxation. Although the resulting optimization problems are non-convex and non-smooth, we provide an efficient first-order scheme which scales to large graphs. Moreover, our approach comes with the quality guarantee that given any partition as initialization the algorithm either outputs a better partition or it stops immediately.

artificial intelligence, lovasz extension, machine learning, (16 more...)

Neural Information Processing Systems

Country: Europe > Germany > Saarland (0.14)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.46)

Add feedback

Sparse recovery by thresholded non-negative least squares

Slawski, Martin, Hein, Matthias

Neural Information Processing SystemsDec-31-2011

Non-negative data are commonly encountered in numerous fields, making non-negative least squares regression (NNLS) a frequently used tool. At least relative to its simplicity, it often performs rather well in practice. Serious doubts about its usefulness arise for modern high-dimensional linear models. Even in this setting - unlike first intuition may suggest - we show that for a broad class of designs, NNLS is resistant to overfitting and works excellently for sparse recovery when combined with thresholding, experimentally even outperforming L1-regularization. Since NNLS also circumvents the delicate choice of a regularization parameter, our findings suggest that NNLS may be the method of choice.

artificial intelligence, machine learning, recovery, (16 more...)

Neural Information Processing Systems

Country:

North America > United States (0.28)
Europe > Germany > Saarland (0.14)

Genre: Research Report > New Finding (0.68)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.46)

Add feedback

An Inverse Power Method for Nonlinear Eigenproblems with Applications in 1-Spectral Clustering and Sparse PCA

Hein, Matthias, Bühler, Thomas

Neural Information Processing SystemsDec-31-2010

Many problems in machine learning and statistics can be formulated as (generalized) eigenproblems. In terms of the associated optimization problem, computing linear eigenvectors amounts to finding critical points of a quadratic function subject to quadratic constraints. In this paper we show that a certain class of constrained optimization problems with nonquadratic objective and constraints can be understood as nonlinear eigenproblems. We derive a generalization of the inverse power method which is guaranteed to converge to a nonlinear eigenvector. We apply the inverse power method to 1-spectral clustering and sparse PCA which can naturally be formulated as nonlinear eigenproblems. In both applications we achieve state-of-the-art results in terms of solution quality and runtime. Moving beyond the standard eigenproblem should be useful also in many other applications and our inverse power method can be easily adapted to new problems.

artificial intelligence, eigenvector, optimization problem, (16 more...)

Neural Information Processing Systems

Country: Europe > Germany > Saarland (0.14)

Industry: Health & Medicine (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Add feedback

Getting lost in space: Large sample analysis of the resistance distance

Luxburg, Ulrike V., Radl, Agnes, Hein, Matthias

Neural Information Processing SystemsDec-31-2010

The commute distance between two vertices in a graph is the expected time it takes a random walk to travel from the first to the second vertex and back. We study the behavior of the commute distance as the size of the underlying graph increases. We prove that the commute distance converges to an expression that does not take into account the structure of the graph at all and that is completely meaningless as a distance function on the graph. Consequently, the use of the raw commute distance for machine learning purposes is strongly discouraged for large graphs and in high dimensions. As an alternative we introduce the amplified commute distance that corrects for the undesired large sample effects.

artificial intelligence, commute distance, machine learning, (17 more...)

Neural Information Processing Systems

Country:

Europe > Germany > Saarland (0.14)
Europe > Germany > Baden-Württemberg > Tübingen Region > Tübingen (0.14)

Genre: Research Report > New Finding (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.47)

Add feedback

An Inverse Power Method for Nonlinear Eigenproblems with Applications in 1-Spectral Clustering and Sparse PCA

Hein, Matthias, Bühler, Thomas

arXiv.org Machine LearningDec-3-2010

Many problems in machine learning and statistics can be formulated as (generalized) eigenproblems. In terms of the associated optimization problem, computing linear eigenvectors amounts to finding critical points of a quadratic function subject to quadratic constraints. In this paper we show that a certain class of constrained optimization problems with nonquadratic objective and constraints can be understood as nonlinear eigenproblems. We derive a generalization of the inverse power method which is guaranteed to converge to a nonlinear eigenvector. We apply the inverse power method to 1-spectral clustering and sparse PCA which can naturally be formulated as nonlinear eigenproblems. In both applications we achieve state-of-the-art results in terms of solution quality and runtime. Moving beyond the standard eigenproblem should be useful also in many other applications and our inverse power method can be easily adapted to new problems.

artificial intelligence, eigenvector, optimization problem, (19 more...)

arXiv.org Machine Learning

1012.0774

Country: Europe > Germany > Saarland (0.14)

Genre: Research Report (0.50)

Industry: Health & Medicine (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Add feedback

Influence of graph construction on graph-based clustering measures

Maier, Markus, Luxburg, Ulrike V., Hein, Matthias

Neural Information Processing SystemsDec-31-2009

Graph clustering methods such as spectral clustering are defined for general weighted graphs. In machine learning, however, data often is not given in form of a graph, but in terms of similarity (or distance) values between points. In this case, first a neighborhood graph is constructed using the similarities between the points and then a graph clustering algorithm is applied to this graph. In this paper we investigate the influence of the construction of the similarity graph on the clustering results. We first study the convergence of graph clustering criteria such as the normalized cut (Ncut) as the sample size tends to infinity. We find that the limit expressions are different for different types of graph, for example the r-neighborhood graph or the k-nearest neighbor graph. In plain words: Ncut on a kNN graph does something systematically different than Ncut on an r-neighborhood graph! This finding shows that graph clustering criteria cannot be studied independently of the kind of graph they are applied to. We also provide examples which show that these differences can be observed for toy and real data already for rather small sample sizes.

artificial intelligence, graph, machine learning, (17 more...)

Neural Information Processing Systems

Country: Europe > Germany > Saarland (0.14)

Genre: Research Report > New Finding (0.66)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (1.00)

Add feedback

Semi-supervised Regression using Hessian energy with an application to semi-supervised dimensionality reduction

Kim, Kwang I., Steinke, Florian, Hein, Matthias

Neural Information Processing SystemsDec-31-2009

Semi-supervised regression based on the graph Laplacian suffers from the fact that the solution is biased towards a constant and the lack of extrapolating power. Outgoing from these observations we propose to use the second-order Hessian energy for semi-supervised regression which overcomes both of these problems, in particular, if the data lies on or close to a low-dimensional submanifold in the feature space, the Hessian energy prefers functions which vary ``linearly with respect to the natural parameters in the data. This property makes it also particularly suited for the task of semi-supervised dimensionality reduction where the goal is to find the natural parameters in the data based on a few labeled points. The experimental result suggest that our method is superior to semi-supervised regression using Laplacian regularization and standard supervised methods and is particularly suited for semi-supervised dimensionality reduction.

artificial intelligence, machine learning, regression, (16 more...)

Neural Information Processing Systems

Country:

Europe > Germany (0.28)
North America > United States > New York (0.14)

Genre: Research Report > New Finding (0.34)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Dimensionality Reduction (0.81)

Add feedback

Influence of graph construction on graph-based clustering measures

Maier, Markus, Luxburg, Ulrike V., Hein, Matthias

Neural Information Processing SystemsDec-31-2009

Graph clustering methods such as spectral clustering are defined for general weighted graphs. In machine learning, however, data often is not given in form of a graph, but in terms of similarity (or distance) values between points. In this case, first a neighborhood graph is constructed using the similarities between the points and then a graph clustering algorithm is applied to this graph.

artificial intelligence, graph, machine learning, (17 more...)

Neural Information Processing Systems

Country: Europe > Germany > Saarland (0.14)

Genre: Research Report > New Finding (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (1.00)

Add feedback

Robust Nonparametric Regression with Metric-Space Valued Output

Hein, Matthias

Neural Information Processing SystemsDec-31-2009

Motivated by recent developments in manifold-valued regression we propose a family of nonparametric kernel-smoothing estimators with metric-space valued output including a robust median type estimator and the classical Frechet mean. Depending on the choice of the output space and the chosen metric the estimator reduces to partially well-known procedures for multi-class classification, multivariate regression in Euclidean space, regression with manifold-valued output and even some cases of structured output learning. In this paper we focus on the case of regression with manifold-valued input and output. We show pointwise and Bayes consistency for all estimators in the family for the case of manifold-valued output and illustrate the robustness properties of the estimator with experiments.

artificial intelligence, estimator, survey article, (20 more...)

Neural Information Processing Systems

Country:

North America > United States (0.29)
Europe > Germany > Saarland (0.14)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Supervised Learning > Representation Of Examples (0.64)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.48)

Add feedback

Filters

Collaborating Authors

Hein, Matthias

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

Constrained fractional set programs and their application in local clustering and community detection

Beyond Spectral Clustering - Tight Relaxations of Balanced Graph Cuts

Sparse recovery by thresholded non-negative least squares

An Inverse Power Method for Nonlinear Eigenproblems with Applications in 1-Spectral Clustering and Sparse PCA

Getting lost in space: Large sample analysis of the resistance distance

An Inverse Power Method for Nonlinear Eigenproblems with Applications in 1-Spectral Clustering and Sparse PCA

Influence of graph construction on graph-based clustering measures

Semi-supervised Regression using Hessian energy with an application to semi-supervised dimensionality reduction

Influence of graph construction on graph-based clustering measures

Robust Nonparametric Regression with Metric-Space Valued Output