AITopics | Statistical Learning

Statistical Learning

Near-optimal Anomaly Detection in Graphs using Lovasz Extended Scan Statistic

Sharpnack, James, Krishnamurthy, Akshay, Singh, Aarti

arXiv.org Machine Learning, Dec-11-2013

The detection of anomalous activity in graphs is a statistical problem that arises in many applications, such as network surveillance, disease outbreak detection, and activity monitoring in social networks. Beyond its wide applicability, graph-structured anomaly detection serves as a case study in the difficulty of balancing computational complexity with statistical power. In this work, we develop from first principles the generalized likelihood ratio test for determining whether there is a well-connected region of activation over the vertices of the graph in Gaussian noise. Because this test is computationally infeasible, we provide a relaxation, called the Lovasz extended scan statistic (LESS), that uses submodularity to approximate the intractable generalized likelihood ratio. We demonstrate a connection between LESS and maximum a-posteriori inference in Markov random fields, which provides us with a poly-time algorithm for LESS. Using electrical network theory, we are able to control type 1 error for LESS and prove conditions under which LESS is risk consistent. Finally, we consider specific graph models: the torus, k-nearest neighbor graphs, and epsilon-random graphs. We show that on these graphs our results provide near-optimal performance by matching our results to known lower bounds.

artificial intelligence, data mining, machine learning, (18 more...)

arXiv:1312.3291

Country: North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.14)

Genre: Research Report (1.00)

Industry:

Health & Medicine > Epidemiology (0.54)
Energy > Power Industry (0.34)

Technology:

Information Technology > Data Science > Data Mining > Anomaly Detection (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Nearest Neighbor Methods (0.54)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.46)

Nonparametric Estimation of Multi-View Latent Variable Models

Song, Le, Anandkumar, Animashree, Dai, Bo, Xie, Bo

arXiv.org Machine Learning, Dec-7-2013

Spectral methods have greatly advanced the estimation of latent variable models, generating a sequence of novel and efficient algorithms with strong theoretical guarantees. However, current spectral algorithms are largely restricted to mixtures of discrete or Gaussian distributions. In this paper, we propose a kernel method for learning multi-view latent variable models, allowing each mixture component to be nonparametric. The key idea of the method is to embed the joint distribution of a multi-view latent variable into a reproducing kernel Hilbert space, and then the latent parameters are recovered using a robust tensor power method. We establish that the sample complexity for the proposed method is quadratic in the number of latent components and is a low order polynomial in the other relevant parameters. Thus, our non-parametric tensor approach to learning latent variable models enjoys good sample and computational efficiencies. Moreover, the non-parametric tensor power method compares favorably to the EM algorithm and other existing spectral algorithms in our experiments.
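
As a rough illustration of the tensor power update the abstract refers to, the sketch below runs plain power iteration on a small symmetric third-order tensor. It is not the robust or kernelized variant from the paper; the toy tensor, the starting vector, and the function names are assumptions made for this example.

```python
import math

def contract(T, v):
    """Return the vector T(I, v, v) for a third-order tensor T given as nested lists."""
    n = len(v)
    return [sum(T[i][j][k] * v[j] * v[k] for j in range(n) for k in range(n))
            for i in range(n)]

def tensor_power_iteration(T, v0, iters=20):
    """Plain tensor power iteration: v <- T(I, v, v) / ||T(I, v, v)||."""
    norm0 = math.sqrt(sum(x * x for x in v0))
    v = [x / norm0 for x in v0]
    for _ in range(iters):
        w = contract(T, v)
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    # Eigenvalue estimate: the scalar T(v, v, v).
    lam = sum(vi * wi for vi, wi in zip(v, contract(T, v)))
    return lam, v

# Toy tensor (assumed): weight 3 on e1 (x) e1 (x) e1 and weight 1 on e2 (x) e2 (x) e2.
T = [[[0.0, 0.0], [0.0, 0.0]], [[0.0, 0.0], [0.0, 0.0]]]
T[0][0][0] = 3.0
T[1][1][1] = 1.0

lam, v = tensor_power_iteration(T, [0.6, 0.8])
```

From this starting point the iteration settles on the first component, so `lam` approaches 3 and `v` approaches the first basis vector. Which component is recovered depends on the initialization, which is one reason practical versions use restarts.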

algorithm, artificial intelligence, machine learning, (18 more...)

arXiv:1311.3287

Country: North America > United States (0.67)

Genre: Research Report > New Finding (0.34)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (0.68)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.93)

An Algorithmic Theory of Dependent Regularizers, Part 1: Submodular Structure

Koepke, Hoyt, Meila, Marina

arXiv.org Machine Learning, Dec-6-2013

We present an exploration of the rich theoretical connections between several classes of regularized models, network flows, and recent results in submodular function theory. This work unifies key aspects of these problems under a common theory, leading to novel methods for working with several important models of interest in statistics, machine learning and computer vision. In Part 1, we review the concepts of network flows and submodular function optimization theory foundational to our results. We then examine the connections between network flows and the minimum-norm algorithm from submodular optimization, extending and improving several current results. This leads to a concise representation of the structure of a large class of pairwise regularized models important in machine learning, statistics and computer vision. In Part 2, we describe the full regularization path of a class of penalized regression problems with dependent variables that includes the graph-guided LASSO and total variation constrained models. This description also motivates a practical algorithm. This allows us to efficiently find the regularization path of the discretized version of TV penalized models. Ultimately, our new algorithms scale up to high-dimensional problems with millions of variables.

artificial intelligence, bayesian inference, machine learning, (16 more...)

arXiv:1312.1970

Country: North America > United States (1.00)

Genre: Research Report > New Finding (0.48)

Industry: Health & Medicine (1.00)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models (0.92)
(3 more...)

A Component Lasso

Hussami, Nadine, Tibshirani, Robert

arXiv.org Machine Learning, Dec-6-2013

We propose a new sparse regression method called the component lasso, based on a simple idea. The method uses the connected-components structure of the sample covariance matrix to split the problem into smaller ones. It then solves the subproblems separately, obtaining a coefficient vector for each one. Then, it uses non-negative least squares to recombine the different vectors into a single solution. This step is useful in selecting and reweighting components that are correlated with the response. Simulated and real data examples show that the component lasso can outperform standard regression methods such as the lasso and elastic net, achieving a lower mean squared error as well as better support recovery.
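
The first step described above, splitting the variables by the connected components of the sample covariance matrix, can be sketched as follows. This covers only the splitting step, not the per-block lasso fits or the non-negative least squares recombination. Linking two variables when their absolute covariance exceeds a fixed threshold is an assumption made for illustration, as are the toy data and function names.

```python
def covariance(X):
    """Sample covariance matrix of the columns of X (X is a list of rows)."""
    n, p = len(X), len(X[0])
    means = [sum(row[j] for row in X) / n for j in range(p)]
    return [[sum((row[a] - means[a]) * (row[b] - means[b]) for row in X) / (n - 1)
             for b in range(p)] for a in range(p)]

def connected_components(S, threshold):
    """Group variable indices; a and b are linked when |S[a][b]| > threshold."""
    p = len(S)
    parent = list(range(p))

    def find(a):
        while parent[a] != a:
            parent[a] = parent[parent[a]]  # path halving
            a = parent[a]
        return a

    for a in range(p):
        for b in range(a + 1, p):
            if abs(S[a][b]) > threshold:
                parent[find(a)] = find(b)

    groups = {}
    for a in range(p):
        groups.setdefault(find(a), []).append(a)
    return sorted(groups.values())

# Toy data (made up): columns 0 and 1 move together, column 2 is uncorrelated with both.
X = [
    [1.0, 2.0, 0.0],
    [2.0, 4.0, 1.0],
    [3.0, 6.0, 1.0],
    [4.0, 8.0, 0.0],
]
blocks = connected_components(covariance(X), threshold=0.5)
```

Here `blocks` groups variables 0 and 1 together and leaves variable 2 on its own, so a regression on this design could be fit on two smaller subproblems.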

artificial intelligence, component lasso, machine learning, (15 more...)

arXiv:1311.4472

Country: North America > United States > California (0.46)

Genre: Research Report (1.00)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)

Partitioning into Expanders

Gharan, Shayan Oveis, Trevisan, Luca

arXiv.org Machine Learning, Dec-6-2013

Let $G=(V,E)$ be an undirected graph, and let $\lambda_k$ be the $k$-th smallest eigenvalue of the normalized Laplacian matrix of $G$. There is a basic fact in algebraic graph theory that $\lambda_k > 0$ if and only if $G$ has at most $k-1$ connected components. We prove a robust version of this fact. If $\lambda_k > 0$, then for some $1 \leq \ell \leq k-1$, $V$ can be partitioned into $\ell$ sets $P_1, \ldots, P_\ell$ such that each $P_i$ is a low-conductance set in $G$ and induces a high-conductance subgraph. In particular, $\phi(P_i) = O(\ell^3 \sqrt{\lambda_\ell})$ and $\phi(G[P_i]) \geq \lambda_k/k^2$. We make our results algorithmic by designing a simple polynomial time spectral algorithm to find such a partitioning of $G$ with a quadratic loss in the inside conductance of the $P_i$'s. Unlike the recent results on higher order Cheeger's inequality [LOT12,LRTV12], our algorithmic results do not use higher order eigenfunctions of $G$. If there is a sufficiently large gap between $\lambda_k$ and $\lambda_{k+1}$, more precisely, if $\lambda_{k+1} \geq \mathrm{poly}(k)\,\lambda_k^{1/4}$, then our algorithm finds a $k$-partitioning of $V$ into sets $P_1, \ldots, P_k$ such that the induced subgraph $G[P_i]$ has a significantly larger conductance than the conductance of $P_i$ in $G$. Such a partitioning may represent the best $k$-clustering of $G$. Our algorithm is a simple local search that only uses the Spectral Partitioning algorithm as a subroutine. We expect to see further applications of this simple algorithm in clustering applications.
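
To make the two conductance quantities concrete, the sketch below computes the conductance of a set inside a graph and the conductance of the subgraph it induces. It assumes the common definition for unweighted graphs (edges leaving the set divided by the sum of degrees in the set); the toy graph and helper names are made up for this example and are not the paper's algorithm.

```python
def edges_to_adj(edges):
    """Build an adjacency map {vertex: set of neighbors} from an edge list."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj

def conductance(adj, S):
    """phi(S) = (number of edges leaving S) / (sum of degrees of vertices in S)."""
    S = set(S)
    cut = sum(1 for u in S for v in adj[u] if v not in S)
    vol = sum(len(adj[u]) for u in S)
    return cut / vol

# Toy graph (made up): two triangles joined by the single edge (2, 3).
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
adj = edges_to_adj(edges)
P1 = {0, 1, 2}

# Outside conductance: how weakly P1 is attached to the rest of G.
outer = conductance(adj, P1)

# Inside conductance: conductance of the induced subgraph G[P1]. For a triangle,
# the only sets with at most half the volume are single vertices, so the minimum
# over singletons is the conductance of the whole induced subgraph.
sub = edges_to_adj([(u, v) for u, v in edges if u in P1 and v in P1])
inner = min(conductance(sub, {u}) for u in P1)
```

For this graph `outer` is 1/7 while `inner` is 1, which is the pattern the abstract describes: a part that is loosely connected to the rest of the graph but well connected internally.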

artificial intelligence, conductance, machine learning, (16 more...)

arXiv:1309.3223

Genre: Research Report (0.84)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.46)
Information Technology > Artificial Intelligence > Representation & Reasoning > Search (0.34)

Sparse Linear Dynamical System with Its Application in Multivariate Clinical Time Series

Liu, Zitao, Hauskrecht, Milos

arXiv.org Machine Learning, Dec-3-2013

The Linear Dynamical System (LDS) is an elegant mathematical framework for modeling and learning multivariate time series. However, in general, it is difficult to set the dimension of its hidden state space. A small number of hidden states may not be able to model the complexities of a time series, while a large number of hidden states can lead to overfitting. In this paper, we study methods that impose an $\ell_1$ regularization on the transition matrix of an LDS model to alleviate the problem of choosing the optimal number of hidden states. We incorporate a generalized gradient descent method into the Maximum a Posteriori (MAP) framework and use Expectation Maximization (EM) to iteratively achieve sparsity on the transition matrix of an LDS model. We show that our Sparse Linear Dynamical System (SLDS) improves the predictive performance when compared to ordinary LDS on a multivariate clinical time series dataset.
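
Generalized gradient descent with an $\ell_1$ penalty is commonly implemented as a gradient step followed by entrywise soft-thresholding. The sketch below shows one such update on a small matrix. It is a generic illustration of that technique, not the authors' EM procedure; the matrix, gradient, step size, and penalty weight are hypothetical values.

```python
def soft_threshold(x, t):
    """Proximal operator of t * |x|: shrink x toward zero by t."""
    if x > t:
        return x - t
    if x < -t:
        return x + t
    return 0.0

def proximal_step(A, grad, step, lam):
    """One generalized-gradient update: gradient step, then entrywise shrinkage."""
    return [[soft_threshold(a - step * g, step * lam) for a, g in zip(row_a, row_g)]
            for row_a, row_g in zip(A, grad)]

# Hypothetical 2x2 transition matrix and gradient of the smooth part of the loss.
A = [[0.9, 0.05], [-0.02, 0.8]]
grad = [[0.1, 0.0], [0.0, -0.1]]
A_new = proximal_step(A, grad, step=0.5, lam=0.2)
```

The small off-diagonal entries fall below the threshold `step * lam` and are set exactly to zero, while the larger diagonal entries are only shrunk. This is the mechanism by which an $\ell_1$ penalty produces a sparse transition matrix.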

artificial intelligence, linear dynamical system, machine learning, (12 more...)

arXiv:1311.7071

Country: North America > United States (0.14)

Genre: Research Report (0.64)

Industry: Health & Medicine > Therapeutic Area (0.47)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.35)

Families of Parsimonious Finite Mixtures of Regression Models

Dang, Utkarsh J., McNicholas, Paul D.

arXiv.org Machine Learning, Dec-2-2013

Model-based clustering has become increasingly popular during the last decade. Parametric mixture models are used in model-based clustering; however, such models generally do not exploit covariates. Incorporating a regression structure can yield important insight when there is a regression relationship between some variables. Methodologies that deal with such data include finite mixtures of regressions (FMR; [7, 13]) and finite mixtures of regressions with concomitant variables (FMRC; [22]), supported by the popular flexmix package [13]. Multivariate correlated responses can be naturally integrated into such models. However, flexmix currently does not account for correlated response variables for both FMR and FMRC. FMR models that deal with correlated response variables have recently been proposed [19, 9].

algorithm, artificial intelligence, machine learning, (17 more...)

arXiv:1312.0518

Country:

North America > Canada > Ontario (0.14)
Europe > Austria > Vienna (0.14)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.88)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.66)

A Junction Tree Framework for Undirected Graphical Model Selection

Vats, Divyanshu, Nowak, Robert

arXiv.org Machine Learning, Dec-2-2013

An undirected graphical model is a joint probability distribution defined on an undirected graph G*, where the vertices in the graph index a collection of random variables and the edges encode conditional independence relationships among random variables. The undirected graphical model selection (UGMS) problem is to estimate the graph G* given observations drawn from the undirected graphical model. This paper proposes a framework for decomposing the UGMS problem into multiple subproblems over clusters and subsets of the separators in a junction tree. The junction tree is constructed using a graph that contains a superset of the edges in G*. We highlight three main properties of using junction trees for UGMS. First, different regularization parameters or different UGMS algorithms can be used to learn different parts of the graph. This is possible since the subproblems we identify can be solved independently of each other. Second, under certain conditions, a junction tree based UGMS algorithm can produce consistent results with fewer observations than the usual requirements of existing algorithms. Third, both our theoretical and experimental results show that the junction tree framework does a significantly better job at finding the weakest edges in a graph than existing methods. This property is a consequence of both the first and second properties. Finally, we note that our framework is independent of the choice of the UGMS algorithm and can be used as a wrapper around standard UGMS algorithms for more accurate graph estimation.

artificial intelligence, graph, machine learning, (15 more...)

arXiv:1304.4910

Country: North America > United States > Wisconsin (0.27)

Genre: Research Report > New Finding (0.87)

Industry: Health & Medicine (0.93)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)

Single Network Relational Transductive Learning

Dhurandhar, A., Wang, J.

Journal of Artificial Intelligence Research, Nov-30-2013

Relational classification on a single connected network has been of particular interest in the machine learning and data mining communities in the last decade or so. This is mainly due to the explosion in popularity of social networking sites such as Facebook, LinkedIn and Google+ amongst others. In statistical relational learning, many techniques have been developed to address this problem, where we have a connected unweighted homogeneous/heterogeneous graph that is partially labeled and the goal is to propagate the labels to the unlabeled nodes. In this paper, we provide a different perspective by enabling the effective use of graph transduction techniques for this problem. We thus exploit the strengths of this class of methods for relational learning problems. We accomplish this by providing a simple procedure for constructing a weight matrix that serves as input to a rich class of graph transduction techniques. Our procedure has multiple desirable properties. For example, the weights it assigns to edges between unlabeled nodes naturally relate to a measure of association commonly used in statistics, namely the Gamma test statistic. We further portray the efficacy of our approach on synthetic as well as real data, by comparing it with state-of-the-art relational learning algorithms, and graph transduction techniques with an adjacency matrix or a real valued weight matrix computed using available attributes as input. In these experiments we see that our approach consistently outperforms other approaches when the graph is sparsely labeled, and remains competitive with the best when the proportion of known labels increases.
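
For readers unfamiliar with graph transduction, the sketch below shows one simple member of that family: iterative label propagation that takes a weight matrix as input and keeps the known labels clamped. It does not reproduce the paper's procedure for constructing the weights; the weight matrix, the labels, and the function name are made up for illustration, and every unlabeled node is assumed to have at least one positive-weight edge.

```python
def propagate_labels(W, labels, iters=100):
    """Clamp known labels (+1 / -1) and repeatedly replace each unlabeled node's
    score with the weighted average of its neighbors' scores."""
    n = len(W)
    f = [labels.get(i, 0.0) for i in range(n)]
    for _ in range(iters):
        new_f = []
        for i in range(n):
            if i in labels:
                new_f.append(labels[i])  # labeled nodes stay fixed
            else:
                total = sum(W[i])        # assumed positive for unlabeled nodes
                new_f.append(sum(W[i][j] * f[j] for j in range(n)) / total)
        f = new_f
    return f

# Made-up weight matrix for a path graph 0 - 1 - 2 - 3 with a weak middle edge.
W = [
    [0.0, 1.0, 0.0, 0.0],
    [1.0, 0.0, 0.1, 0.0],
    [0.0, 0.1, 0.0, 1.0],
    [0.0, 0.0, 1.0, 0.0],
]
scores = propagate_labels(W, {0: 1.0, 3: -1.0})
predicted = [1 if s > 0 else -1 for s in scores]
```

Because the edge between nodes 1 and 2 is weak, node 1 takes the label of node 0 and node 2 takes the label of node 3. This shows how the choice of edge weights drives the result of a transduction method.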

graph, node, weight matrix, (14 more...)

Journal of Artificial Intelligence Research

doi: 10.1613/jair.4068

AI Access Foundation

10851

Country:

North America > United States > New York > New York County > New York City (0.04)
North America > United States > District of Columbia > Washington (0.04)

Genre: Research Report (0.67)

Industry: Health & Medicine (0.67)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.88)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.67)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.46)

A Typology of Collaboration Platform Users

Bezzubtseva, Anastasia, Ignatov, Dmitry I.

arXiv.org Machine Learning, Nov-30-2013

In this paper we present a review of the existing typologies of Internet service users. We zoom in on social networking services, including blogs and crowdsourcing websites. Based on an analysis of the considered typologies by means of Formal Concept Analysis (FCA), we developed a new user typology for a certain class of Internet services, namely collaboration innovation platforms. Cluster analysis of data extracted from the collaboration platform Witology was used to divide more than 500 participants into six groups based on three activity indicators: idea generation, commenting, and evaluation (assigning marks). The obtained groups and their percentages appear to follow the "90 - 9 - 1" rule.

artificial intelligence, machine learning, social media, (18 more...)

arXiv:1312.0162

Country:

Europe (0.95)
North America > United States (0.46)

Genre: Research Report (1.00)

Industry: Information Technology > Services (0.72)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.90)
