AITopics | Statistical Learning

Collaborating Authors

Statistical Learning

News Overviews Instructional Materials AI-Alerts Classics

Learning Stable Multilevel Dictionaries for Sparse Representations

Thiagarajan, Jayaraman J., Ramamurthy, Karthikeyan Natesan, Spanias, Andreas

arXiv.org Machine LearningSep-25-2013

Sparse representations using learned dictionaries are being increasingly used with success in several data processing and machine learning applications. The availability of abundant training data necessitates the development of efficient, robust and provably good dictionary learning algorithms. Algorithmic stability and generalization are desirable characteristics for dictionary learning algorithms that aim to build global dictionaries which can efficiently model any test data similar to the training samples. In this paper, we propose an algorithm to learn dictionaries for sparse representations from large scale data, and prove that the proposed learning algorithm is stable and generalizable asymptotically. The algorithm employs a 1-D subspace clustering procedure, the K-hyperline clustering, in order to learn a hierarchical dictionary with multiple levels. We also propose an information-theoretic scheme to estimate the number of atoms needed in each level of learning and develop an ensemble approach to learn robust dictionaries. Using the proposed dictionaries, the sparse code for novel test data can be computed using a low-complexity pursuit procedure. We demonstrate the stability and generalization characteristics of the proposed algorithm using simulations. We also evaluate the utility of the multilevel dictionaries in compressed recovery and subspace learning applications.

algorithm, artificial intelligence, machine learning, (14 more...)

arXiv.org Machine Learning

1303.0448

Country: North America > United States (0.28)

Genre: Research Report (0.50)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.69)

Add feedback

Random Forests on Distance Matrices for Imaging Genetics Studies

Sim, Aaron, Tsagkrasoulis, Dimosthenis, Montana, Giovanni

arXiv.org Machine LearningSep-24-2013

The clinical pathology of neurological diseases and the imaging of the human brain are two areas of research that have largely developed along independent lines. It is only in the past few years that the usefulness of noninvasive imaging measurements of the human brain to the diagnosis and early prediction of neurological diseases been widely recognised (Albert et al., 2011; Sperling et al., 2011; Gray et al., 2013). In Alzheimer's Disease (AD), for instance, clinical guidance on the diagnosis of this most common of neurological degenerative disorders has recently been updated to incorporate neuroimaging markers alongside standard cognitive and behavioural tests (Albert et al., 2011; Sperling et al., 2011). The key to the improved characterisation of AD lies in the quantitative nature of the imaging measurements compared to the relatively subjective and imprecise nature of traditional clinical assessments. Imaging biomarkers of cerebral atrophy and of loss of connectivity between key regions in the brain are believed to be reliable indicators of AD and are particularly useful at early disease stages when standard cognitive assessments can be inconclusive. The utility of imaging phenotypes extends beyond diagnosis and prediction to the search for the underlying genetic factors behind neurological disorders (Stein et al., 2010). This comparatively more recent use of neuroimaging measurements in place of case-control labels in genetic association studies defines the emerging field of imaging genetics. The central premise here is that, should they exist, genetic associations to intermediate brain structure and brain function phenotypes are stronger than those with the categorical clinical disease statuses further down the etiological chain (Glahn et al., 2007). Again, the example of AD serves as a good illustration.

artificial intelligence, machine learning, matrix, (20 more...)

arXiv.org Machine Learning

1309.6158

Country: North America > United States > New York (0.28)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (0.88)

Industry: Health & Medicine > Therapeutic Area > Neurology > Alzheimer's Disease (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (0.83)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.67)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.67)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.46)

Add feedback

Smooth minimization of nonsmooth functions with parallel coordinate descent methods

Fercoq, Olivier, Richtárik, Peter

arXiv.org Machine LearningSep-23-2013

We study the performance of a family of randomized parallel coordinate descent methods for minimizing the sum of a nonsmooth and separable convex functions. The problem class includes as a special case L1-regularized L1 regression and the minimization of the exponential loss ("AdaBoost problem"). We assume the input data defining the loss function is contained in a sparse $m\times n$ matrix $A$ with at most $\omega$ nonzeros in each row. Our methods need $O(n \beta/\tau)$ iterations to find an approximate solution with high probability, where $\tau$ is the number of processors and $\beta = 1 + (\omega-1)(\tau-1)/(n-1)$ for the fastest variant. The notation hides dependence on quantities such as the required accuracy and confidence levels and the distance of the starting iterate from an optimal point. Since $\beta/\tau$ is a decreasing function of $\tau$, the method needs fewer iterations when more processors are used. Certain variants of our algorithms perform on average only $O(\nnz(A)/n)$ arithmetic operations during a single iteration per processor and, because $\beta$ decreases when $\omega$ does, fewer iterations are needed for sparser problems.

artificial intelligence, descent method, machine learning, (19 more...)

arXiv.org Machine Learning

1309.5885

Country: Europe > United Kingdom (0.28)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)

Add feedback

Scalable Spectral Algorithms for Community Detection in Directed Networks

Kim, Sungmin, Shi, Tao

arXiv.org Machine LearningSep-23-2013

Many real world problems can be effectively modeled as pairwise relationship in networks where nodes represent entities of interest and links mimic the interactions or relationships between them. The study of networks, recently referred to as network science, can provide insight into their structures and properties. One particularly interesting problem in network studies is searching for important sub-networks which are called communities, modules or groups. A community in a network is typically characterized by a group of nodes that have more links connected within the community than connected to other nodes (Fortunato, 2010). In many practical applications, the networks in study are directed in nature, such as the World Wide Web, tweeter's follower-followee network, and citation networks. Compared with in-depth studies of community structures in undirected networks (Danon et al., 2005; Fortunato, 2010; Coscia, Giannotti and Pedreschi, 2011), community detection in directed networks has not been as fruitful.

artificial intelligence, data mining, machine learning, (18 more...)

arXiv.org Machine Learning

1211.6807

Country: North America > United States (0.46)

Genre: Research Report (0.81)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Communications (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.67)

Add feedback

Efficient Sampling from Time-Varying Log-Concave Distributions

Narayanan, Hariharan, Rakhlin, Alexander

arXiv.org Machine LearningSep-23-2013

We propose a computationally efficient random walk on a convex body which rapidly mixes and closely tracks a time-varying log-concave distribution. We develop general theoretical guarantees on the required number of steps; this number can be calculated on the fly according to the distance from and the shape of the next distribution. We then illustrate the technique on several examples. Within the context of exponential families, the proposed method produces samples from a posterior distribution which is updated as data arrive in a streaming fashion. The sampling technique can be used to track time-varying truncated distributions, as well as to obtain samples from a changing mixture model, fitted in a streaming fashion to data. In the setting of linear optimization, the proposed method has oracle complexity with best known dependence on the dimension for certain geometries. In the context of online learning and repeated games, the algorithm is an efficient method for implementing no-regret mixture forecasting strategies. Remarkably, in some of these examples, only one step of the random walk is needed to track the next distribution.

artificial intelligence, machine learning, markov chain, (19 more...)

arXiv.org Machine Learning

1309.5977

Country: North America > United States (0.46)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)

Add feedback

Stochastic First- and Zeroth-order Methods for Nonconvex Stochastic Programming

Ghadimi, Saeed, Lan, Guanghui

arXiv.org Machine LearningSep-21-2013

In this paper, we introduce a new stochastic approximation (SA) type algorithm, namely the randomized stochastic gradient (RSG) method, for solving an important class of nonlinear (possibly nonconvex) stochastic programming (SP) problems. We establish the complexity of this method for computing an approximate stationary point of a nonlinear programming problem. We also show that this method possesses a nearly optimal rate of convergence if the problem is convex. We discuss a variant of the algorithm which consists of applying a post-optimization phase to evaluate a short list of solutions generated by several independent runs of the RSG method, and show that such modification allows to improve significantly the large-deviation properties of the algorithm. These methods are then specialized for solving a class of simulation-based optimization problems in which only stochastic zeroth-order information is available.

artificial intelligence, machine learning, optimization problem, (17 more...)

arXiv.org Machine Learning

1309.5549

Country: North America > United States (1.00)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.35)

Add feedback

Latent Fisher Discriminant Analysis

Chen, Gang

arXiv.org Machine LearningSep-20-2013

Linear Discriminant Analysis (LDA) is a well-known method for dimensionality reduction and classification. Previous studies have also extended the binary-class case into multi-classes. However, many applications, such as object detection and keyframe extraction cannot provide consistent instance-label pairs, while LDA requires labels on instance level for training. Thus it cannot be directly applied for semi-supervised classification problem. In this paper, we overcome this limitation and propose a latent variable Fisher discriminant analysis model. We relax the instance-level labeling into bag-level, is a kind of semi-supervised (video-level labels of event type are required for semantic frame extraction) and incorporates a data-driven prior over the latent variables. Hence, our method combines the latent variable inference and dimension reduction in an unified bayesian framework. We test our method on MUSK and Corel data sets and yield competitive results compared to the baseline approach. We also demonstrate its capacity on the challenging TRECVID MED11 dataset for semantic keyframe extraction and conduct a human-factors ranking-based experimental evaluation, which clearly demonstrates our proposed method consistently extracts more semantically meaningful keyframes than challenging baselines.

artificial intelligence, bayesian inference, machine learning, (16 more...)

arXiv.org Machine Learning

1309.5427

Country: North America > United States (0.47)

Genre: Research Report (1.00)

Industry:

Health & Medicine (0.68)
Leisure & Entertainment (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.66)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.66)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Support Vector Machines (0.46)

Add feedback

mTim: Rapid and accurate transcript reconstruction from RNA-Seq data

Zeller, Georg, Goernitz, Nico, Kahles, Andre, Behr, Jonas, Mudrakarta, Pramod, Sonnenburg, Soeren, Raetsch, Gunnar

arXiv.org Machine LearningSep-20-2013

High-throughput sequencing technology applied to cellular mRNA (RNA-Seq) has revolutionized transcriptome studies [19, 17, 35, among many others]. In contrast to microarray platforms, which it has replaced in many applications, RNA-Seq can not only be used to accurately quantify known transcripts, but also to reveal the precise structure of transcripts at single-nucleotide resolution. RNA-Seq based transcript reconstruction has therefore become a valuable tool for the completion of genome annotations [22, 11, for instance] and further enabled subsequent analyses of differentially expressed genes [2], transcript isoforms [6, 4] and exons [3], all of which generally rely on correctly inferred transcript inventories. De novo transcript reconstruction is thus a pivotal step in the analysis of RNA-Seq data. There are two conceptually different strategies to approach this problem: one can either assemble transcripts directly from RNA-Seq reads using methodology that originated from genome assembly approaches [13, 23, 25].

artificial intelligence, bioinformatics, machine learning, (18 more...)

arXiv.org Machine Learning

1309.5211

Country: Europe > Germany (0.47)

Genre: Research Report (0.50)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (1.00)

Technology:

Information Technology > Biomedical Informatics > Translational Bioinformatics (0.90)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Support Vector Machines (0.30)

Add feedback

A Comparative Analysis of Ensemble Classifiers: Case Studies in Genomics

Whalen, Sean, Pandey, Gaurav

arXiv.org Machine LearningSep-19-2013

The combination of multiple classifiers using ensemble methods is increasingly important for making progress in a variety of difficult prediction problems. We present a comparative analysis of several ensemble methods through two case studies in genomics, namely the prediction of genetic interactions and protein functions, to demonstrate their efficacy on real-world datasets and draw useful conclusions about their behavior. These methods include simple aggregation, meta-learning, cluster-based meta-learning, and ensemble selection using heterogeneous classifiers trained on resampled data to improve the diversity of their predictions. We present a detailed analysis of these methods across 4 genomics datasets and find the best of these methods offer statistically significant improvements over the state of the art in their respective domains. In addition, we establish a novel connection between ensemble selection and meta-learning, demonstrating how both of these disparate methods establish a balance between ensemble diversity and performance.

artificial intelligence, classifier, machine learning, (17 more...)

arXiv.org Machine Learning

1309.5047

Country: North America > United States (0.67)

Genre: Research Report (0.65)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.47)

Add feedback

HOL(y)Hammer: Online ATP Service for HOL Light

Kaliszyk, Cezary, Urban, Josef

arXiv.org Artificial IntelligenceSep-19-2013

HOL(y)Hammer is an online AI/ATP service for formal (computer-understandable) mathematics encoded in the HOL Light system. The service allows its users to upload and automatically process an arbitrary formal development (project) based on HOL Light, and to attack arbitrary conjectures that use the concepts defined in some of the uploaded projects. For that, the service uses several automated reasoning systems combined with several premise selection methods trained on all the project proofs. The projects that are readily available on the server for such query answering include the recent versions of the Flyspeck, Multivariate Analysis and Complex Analysis libraries. The service runs on a 48-CPU server, currently employing in parallel for each task 7 AI/ATP combinations and 4 decision procedures that contribute to its overall performance. The system is also available for local installation by interested users, who can customize it for their own proof development. An Emacs interface allowing parallel asynchronous queries to the service is also provided. The overall structure of the service is outlined, problems that arise and their solutions are discussed, and an initial account of using the system is given.

artificial intelligence, logic & formal reasoning, machine learning, (17 more...)

arXiv.org Artificial Intelligence

1309.4962

Country: Europe (0.67)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Logic & Formal Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.88)

Add feedback