AITopics | Statistical Learning

Splitting Methods for Convex Clustering

arXiv.org Machine Learning, Mar-18-2014

Clustering is a fundamental problem in many scientific applications. Standard methods such as $k$-means, Gaussian mixture models, and hierarchical clustering, however, are beset by local minima, which are sometimes drastically suboptimal. Recently introduced convex relaxations of $k$-means and hierarchical clustering shrink cluster centroids toward one another and ensure a unique global minimizer. In this work we present two splitting methods for solving the convex clustering problem. The first is an instance of the alternating direction method of multipliers (ADMM); the second is an instance of the alternating minimization algorithm (AMA). In contrast to previously considered algorithms, our ADMM and AMA formulations provide simple and unified frameworks for solving the convex clustering problem under the previously studied norms and open the door to potentially novel norms. We demonstrate the performance of our algorithm on both simulated and real data examples. While the differences between the two algorithms appear to be minor on the surface, complexity analysis and numerical experiments show AMA to be significantly more efficient.
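For orientation, the convex clustering problem referred to in the abstract is commonly written in the following form. The notation is an assumption made here and is not taken from this listing: x_i are the data points, u_i the corresponding cluster centroids, gamma a nonnegative tuning parameter, w_ij nonnegative pairwise weights, and the norm in the penalty term is left unspecified because the abstract allows several choices.

```latex
\min_{u_1,\dots,u_n} \;
  \frac{1}{2} \sum_{i=1}^{n} \lVert x_i - u_i \rVert_2^2
  \;+\; \gamma \sum_{i<j} w_{ij} \, \lVert u_i - u_j \rVert
```

As gamma grows, the penalty pulls centroids together; points whose centroids coincide are read as belonging to the same cluster.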

algorithm, artificial intelligence, machine learning, (19 more...)

doi: 10.1080/10618600.2014.948181

1304.0499

Country:

Europe (0.67)
North America > United States > California > Los Angeles County > Los Angeles (0.28)

Genre: Research Report (0.50)

Industry: Health & Medicine (0.93)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (1.00)

Structured Sparse Method for Hyperspectral Unmixing

Zhu, Feiyun, Wang, Ying, Xiang, Shiming, Fan, Bin, Pan, Chunhong

arXiv.org Artificial Intelligence, Mar-18-2014

Hyperspectral Unmixing (HU) has received increasing attention in the past decades due to its ability to unveil information latent in hyperspectral data. Unfortunately, most existing methods fail to take advantage of the spatial information in data. To overcome this limitation, we propose a Structured Sparse regularized Nonnegative Matrix Factorization (SS-NMF) method from the following two aspects. First, we incorporate a graph Laplacian to encode the manifold structures embedded in the hyperspectral data space. In this way, the highly similar neighboring pixels can be grouped together. Second, the lasso penalty is employed in SS-NMF because pixels in the same manifold structure are sparsely mixed by a common set of relevant bases. These two factors act as a new structured sparse constraint. With this constraint, our method can learn a compact space, where highly similar pixels are grouped to share correlated sparse representations. Experiments on real hyperspectral data sets with different noise levels demonstrate that our method outperforms the state-of-the-art methods significantly.

artificial intelligence, machine learning, pixel, (15 more...)

1403.4682

Country: North America > United States (0.67)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.93)

Proximal Newton-type methods for minimizing composite functions

Lee, Jason D., Sun, Yuekai, Saunders, Michael A.

arXiv.org Machine Learning, Mar-17-2014

We generalize Newton-type methods for minimizing smooth functions to handle a sum of two convex functions: a smooth function and a nonsmooth function with a simple proximal mapping. We show that the resulting proximal Newton-type methods inherit the desirable convergence behavior of Newton-type methods for minimizing smooth functions, even when search directions are computed inexactly. Many popular methods tailored to problems arising in bioinformatics, signal processing, and statistical learning are special cases of proximal Newton-type methods, and our analysis yields new convergence results for some of these methods.
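A minimal sketch of the idea, not the authors' implementation: it assumes the nonsmooth term is lam * ||x||_1 (whose proximal mapping is soft-thresholding) and that the Hessian of the smooth term is approximated by a positive diagonal. The function names and the toy quadratic are hypothetical.

```python
def soft_threshold(z, t):
    """Proximal mapping of t * |.| for a scalar: shrink z toward zero by t."""
    if z > t:
        return z - t
    if z < -t:
        return z + t
    return 0.0


def diag_prox_newton_step(x, grad, hess_diag, lam):
    """One scaled proximal step for g(x) + lam * ||x||_1.

    Assumes a positive diagonal Hessian approximation, so each coordinate
    solves min_y lam*|y| + (h/2) * (y - (x - g/h))**2 in closed form.
    """
    return [
        soft_threshold(xi - gi / hi, lam / hi)
        for xi, gi, hi in zip(x, grad, hess_diag)
    ]


# Toy smooth term (an assumption): g(x) = 0.5 * sum_i h_i * (x_i - b_i)**2
b = [3.0, 0.2]
h = [2.0, 4.0]
x0 = [0.0, 0.0]
grad0 = [hi * (xi - bi) for hi, xi, bi in zip(h, x0, b)]

x1 = diag_prox_newton_step(x0, grad0, h, lam=1.0)
print(x1)
```

Because the toy smooth term is exactly quadratic with a diagonal Hessian, a single step lands on the minimizer; for general smooth functions the step would be combined with a line search and repeated.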

artificial intelligence, machine learning, newton-type method, (16 more...)

1206.1623

Country: North America > United States (0.28)

Genre: Research Report (1.00)

Industry: Health & Medicine (0.68)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.66)

Active Discovery of Network Roles for Predicting the Classes of Network Nodes

Peel, Leto

arXiv.org Machine Learning, Mar-17-2014

Nodes in real-world networks often have class labels, or underlying attributes, that are related to the way in which they connect to other nodes. Sometimes this relationship is simple; for instance, nodes of the same class may be more likely to be connected. In other cases, however, this is not true, and the way that nodes link in a network exhibits a different, more complex relationship to their attributes. Here, we consider networks in which we know how the nodes are connected, but we do not know the class labels of the nodes or how class labels relate to the network links. We wish to identify the best subset of nodes to label in order to learn this relationship between node attributes and network links. We can then use this discovered relationship to accurately predict the class labels of the rest of the network nodes. We present a model that identifies groups of nodes with similar link patterns, which we call network roles, using a generative blockmodel. The model then predicts labels by learning the mapping from network roles to class labels using a maximum margin classifier. We choose a subset of nodes to label according to an iterative margin-based active learning strategy. By integrating the discovery of network roles with the classifier optimisation, the active learning process can adapt the network roles to better represent the network for node classification. We demonstrate the model by exploring a selection of real-world networks, including a marine food web and a network of English words. We show that, in contrast to other network classifiers, this model achieves good classification accuracy for a range of networks with different relationships between class labels and network links.

artificial intelligence, machine learning, node, (18 more...)

1312.7258

Country:

Europe (0.67)
North America > United States > Colorado (0.28)

Genre: Research Report (0.50)

Technology:

Information Technology > Communications (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

A Geometric Algorithm for Scalable Multiple Kernel Learning

Moeller, John, Raman, Parasaran, Saha, Avishek, Venkatasubramanian, Suresh

arXiv.org Machine Learning, Mar-15-2014

We present a geometric formulation of the Multiple Kernel Learning (MKL) problem. To do so, we reinterpret the problem of learning kernel weights as searching for a kernel that maximizes the minimum (kernel) distance between two convex polytopes. This interpretation, combined with novel structural insights from our geometric formulation, allows us to reduce the MKL problem to a simple optimization routine that yields provable convergence as well as quality guarantees. As a result, our method scales efficiently to much larger data sets than most prior methods can handle. Empirical evaluation on eleven datasets shows that our method is significantly faster and even compares favorably with a uniform unweighted combination of kernels.

artificial intelligence, kernel, machine learning, (16 more...)

1206.558

Country:

North America > United States (1.00)
Asia > Middle East > Israel (0.14)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.93)

The Potential Benefits of Filtering Versus Hyper-Parameter Optimization

Smith, Michael R., Martinez, Tony, Giraud-Carrier, Christophe

arXiv.org Machine Learning, Mar-13-2014

The quality of a model induced by a learning algorithm depends on the quality of the training data and on the hyper-parameters supplied to the learning algorithm. Prior work has shown that improving the quality of the training data (i.e., by removing low-quality instances) or tuning the learning algorithm hyper-parameters can significantly improve the quality of an induced model. A comparison of the two methods is lacking, though. In this paper, we estimate and compare the potential benefits of filtering and hyper-parameter optimization. Estimating the potential benefit gives an overly optimistic estimate but also empirically demonstrates an approximation of the maximum potential benefit of each method. We find that, while both significantly improve the induced model, improving the quality of the training set has a greater potential effect than hyper-parameter optimization.

accuracy, algorithm, hyper-parameter optimization, (16 more...)

1403.3342

Country: North America > United States > Utah > Utah County > Provo (0.04)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.68)

Neighborhood Selection for Thresholding-based Subspace Clustering

Heckel, Reinhard, Agustsson, Eirikur, Bölcskei, Helmut

arXiv.org Machine Learning, Mar-13-2014

Subspace clustering refers to the problem of clustering high-dimensional data points into a union of low-dimensional linear subspaces, where the number of subspaces, their dimensions and orientations are all unknown. In this paper, we propose a variation of the recently introduced thresholding-based subspace clustering (TSC) algorithm, which applies spectral clustering to an adjacency matrix constructed from the nearest neighbors of each data point with respect to the spherical distance measure. The new element resides in an individual and data-driven choice of the number of nearest neighbors. Previous performance results for TSC, as well as for other subspace clustering algorithms based on spectral clustering, come in terms of an intermediate performance measure, which does not address the clustering error directly. Our main analytical contribution is a performance analysis of the modified TSC algorithm (as well as the original TSC algorithm) in terms of the clustering error directly.
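The adjacency-matrix construction described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the authors' algorithm: it takes the spherical distance to be the arccosine of the absolute inner product of unit-normalized points, uses a fixed neighbor count q (the paper's contribution is a data-driven choice of that number), and builds a binary adjacency matrix. All function names are hypothetical, and the spectral clustering step that would follow is omitted.

```python
import math


def normalize(x):
    """Scale a vector to unit Euclidean norm."""
    n = math.sqrt(sum(v * v for v in x))
    return [v / n for v in x]


def spherical_distance(x, y):
    """arccos of the absolute inner product of two unit vectors (assumed measure)."""
    ip = abs(sum(a * b for a, b in zip(x, y)))
    return math.acos(min(1.0, ip))  # clamp to guard against rounding above 1


def knn_adjacency(points, q):
    """Symmetric 0/1 adjacency linking each point to its q nearest neighbors."""
    unit = [normalize(p) for p in points]
    n = len(unit)
    adj = [[0] * n for _ in range(n)]
    for i in range(n):
        others = sorted(
            (j for j in range(n) if j != i),
            key=lambda j: spherical_distance(unit[i], unit[j]),
        )
        for j in others[:q]:
            adj[i][j] = 1
            adj[j][i] = 1
    return adj


# Two points near the x-axis and two near the y-axis (toy data).
points = [[1.0, 0.0], [2.0, 0.1], [0.0, 1.0], [0.1, -3.0]]
adj = knn_adjacency(points, q=1)
for row in adj:
    print(row)
```

Taking the absolute value of the inner product makes a point and its negation count as close, which suits subspaces because they contain both directions; in the toy data this is why the point along the negative y-axis is linked to the one along the positive y-axis.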

artificial intelligence, machine learning, subspace, (15 more...)

1403.3438

Genre: Research Report (0.50)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.34)

A survey of dimensionality reduction techniques

Sorzano, C. O. S., Vargas, J., Montano, A. Pascual

arXiv.org Machine Learning, Mar-12-2014

Experimental life sciences like biology or chemistry have seen an explosion in recent decades of the data available from experiments. Laboratory instruments are becoming more and more complex and report hundreds or thousands of measurements for a single experiment, so statistical methods face challenging tasks when dealing with such high-dimensional data. However, much of the data is highly redundant and can be efficiently brought down to a much smaller number of variables without a significant loss of information. The mathematical procedures making this reduction possible are called dimensionality reduction techniques; they have been widely developed by fields like Statistics or Machine Learning, and are currently a hot research topic. In this review we categorize the plethora of dimension reduction techniques available and give the mathematical insight behind them.

artificial intelligence, data mining, machine learning, (17 more...)

1403.2877

Country: North America > United States (0.67)

Genre: Overview (0.84)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (1.00)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Dimensionality Reduction (0.63)

The Gaussian Radon Transform and Machine Learning

Holmes, Irina, Sengupta, Ambar

arXiv.org Machine Learning, Mar-12-2014

There has been growing recent interest in probabilistic interpretations of kernel-based methods as well as learning in Banach spaces. The absence of a useful Lebesgue measure on an infinite-dimensional reproducing kernel Hilbert space is a serious obstacle for such stochastic models. We propose an estimation model for the ridge regression problem within the framework of abstract Wiener spaces and show how the support vector machine solution to such problems can be interpreted in terms of the Gaussian Radon transform.

artificial intelligence, hilbert space, machine learning, (12 more...)

1310.4794

Country: North America > United States (1.00)

Genre: Research Report (0.50)

Industry: Education > Educational Setting (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Support Vector Machines (0.54)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.35)

Learning Transformations for Clustering and Classification

Qiu, Qiang, Sapiro, Guillermo

arXiv.org Machine Learning, Mar-9-2014

A low-rank transformation learning framework for subspace clustering and classification is proposed here. Many high-dimensional data, such as face images and motion sequences, approximately lie in a union of low-dimensional subspaces. The corresponding subspace clustering problem has been extensively studied in the literature to partition such high-dimensional data into clusters corresponding to their underlying low-dimensional subspaces. However, low-dimensional intrinsic structures are often violated for real-world observations, as they can be corrupted by errors or deviate from ideal models. We propose to address this by learning a linear transformation on subspaces using matrix rank, via its convex surrogate nuclear norm, as the optimization criterion. The learned linear transformation restores a low-rank structure for data from the same subspace and, at the same time, forces a maximally separated structure for data from different subspaces. In this way, we reduce variations within subspaces and increase separation between subspaces for a more robust subspace clustering. This proposed learned robust subspace clustering framework significantly enhances the performance of existing subspace clustering methods. Basic theoretical results presented here help to further support the underlying framework. To exploit the low-rank structures of the transformed subspaces, we further introduce a fast subspace clustering technique, which efficiently combines robust PCA with sparse modeling. When class labels are present at the training stage, we show this low-rank transformation framework also significantly enhances classification performance. Extensive experiments using public datasets are presented, showing that the proposed approach significantly outperforms state-of-the-art methods for subspace clustering and classification.

artificial intelligence, machine learning, subspace, (17 more...)

1309.2074

Country: North America > United States > Minnesota (0.28)

Genre: Research Report (0.71)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (1.00)
