AITopics | Ravikumar, Pradeep

Collaborating Authors

Ravikumar, Pradeep

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

A Hierarchical Graphical Model for Record Linkage

Ravikumar, Pradeep, Cohen, William

arXiv.org Machine LearningJul-12-2012

The task of matching co-referent records is known among other names as rocord linkage. For large record-linkage problems, often there is little or no labeled data available, but unlabeled data shows a reasonable clear structure. For such problems, unsupervised or semi-supervised methods are preferable to supervised methods. In this paper, we describe a hierarchical graphical model framework for the linakge-problem in an unsupervised setting. In addition to proposing new methods, we also cast existing unsupervised probabilistic record-linkage methods in this framework. Some of the techniques we propose to minimize overfitting in the above model are of interest in the general graphical model setting. We describe a method for incorporating monotinicity constraints in a graphical model. We also outline a bootstrapping approach of using "single-field" classifiers to noisily label latent variables in a hierarchical model. Experimental results show that our proposed unsupervised methods perform quite competitively even with fully supervised record-linkage methods.

artificial intelligence, constraint, machine learning, (18 more...)

arXiv.org Machine Learning

1207.418

Country:

North America > United States > Texas (0.14)
North America > Canada > British Columbia (0.14)

Genre: Research Report > New Finding (0.35)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.49)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)

Add feedback

Variational Chernoff Bounds for Graphical Models

Ravikumar, Pradeep, Lafferty, John

arXiv.org Machine LearningJul-11-2012

Recent research has made significant progress on the problem of bounding log partition functions for exponential family graphical models. Such bounds have associated dual parameters that are often used as heuristic estimates of the marginal probabilities required in inference and learning. However these variational estimates do not give rigorous bounds on marginal probabilities, nor do they give estimates for probabilities of more general events than simple marginals. In this paper we build on this recent work by deriving rigorous upper and lower bounds on event probabilities for graphical models. Our approach is based on the use of generalized Chernoff bounds to express bounds on event probabilities in terms of convex optimization problems; these optimization problems, in turn, require estimates of generalized log partition functions. Simulations indicate that this technique can result in useful, rigorous bounds to complement the heuristic variational estimates, with comparable computational cost.

chernoff, optimization problem, survey article, (16 more...)

arXiv.org Machine Learning

1207.4172

Country: North America > United States > California (0.14)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.70)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models (0.69)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.46)

Add feedback

High-dimensional Sparse Inverse Covariance Estimation using Greedy Methods

Johnson, Christopher C., Jalali, Ali, Ravikumar, Pradeep

arXiv.org Machine LearningDec-29-2011

In this paper we consider the task of estimating the non-zero pattern of the sparse inverse covariance matrix of a zero-mean Gaussian random vector from a set of iid samples. Note that this is also equivalent to recovering the underlying graph structure of a sparse Gaussian Markov Random Field (GMRF). We present two novel greedy approaches to solving this problem. The first estimates the non-zero covariates of the overall inverse covariance matrix using a series of global forward and backward greedy steps. The second estimates the neighborhood of each node in the graph separately, again using greedy forward and backward steps, and combines the intermediate neighborhoods to form an overall estimate. The principal contribution of this paper is a rigorous analysis of the sparsistency, or consistency in recovering the sparsity pattern of the inverse covariance matrix. Surprisingly, we show that both the local and global greedy methods learn the full structure of the model with high probability given just $O(d\log(p))$ samples, which is a \emph{significant} improvement over state of the art $\ell_1$-regularized Gaussian MLE (Graphical Lasso) that requires $O(d^2\log(p))$ samples. Moreover, the restricted eigenvalue and smoothness conditions imposed by our greedy methods are much weaker than the strong irrepresentable conditions required by the $\ell_1$-regularization based methods. We corroborate our results with extensive simulations and examples, comparing our local and global greedy methods to the $\ell_1$-regularized Gaussian MLE as well as the Neighborhood Greedy method to that of nodewise $\ell_1$-regularized linear regression (Neighborhood Lasso).

artificial intelligence, greedy algorithm, machine learning, (12 more...)

arXiv.org Machine Learning

1112.6411

Country: Europe (0.14)

Genre: Research Report > New Finding (0.34)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (0.98)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.35)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.34)

Add feedback

Information-theoretic lower bounds on the oracle complexity of stochastic convex optimization

Agarwal, Alekh, Bartlett, Peter L., Ravikumar, Pradeep, Wainwright, Martin J.

arXiv.org Machine LearningNov-20-2011

Relative to the large literature on upper bounds on complexity of convex optimization, lesser attention has been paid to the fundamental hardness of these problems. Given the extensive use of convex optimization in machine learning and statistics, gaining an understanding of these complexity-theoretic issues is important. In this paper, we study the complexity of stochastic convex optimization in an oracle model of computation. We improve upon known results and obtain tight minimax complexity estimates for various function classes.

artificial intelligence, optimization, optimization problem, (16 more...)

arXiv.org Machine Learning

1009.0571

Country: North America > United States (1.00)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.94)

Add feedback

On Learning Discrete Graphical Models Using Greedy Methods

Jalali, Ali, Johnson, Chris, Ravikumar, Pradeep

arXiv.org Machine LearningJul-16-2011

In this paper, we address the problem of learning the structure of a pairwise graphical model from samples in a high-dimensional setting. Our first main result studies the sparsistency, or consistency in sparsity pattern recovery, properties of a forward-backward greedy algorithm as applied to general statistical models. As a special case, we then apply this algorithm to learn the structure of a discrete graphical model via neighborhood estimation. As a corollary of our general result, we derive sufficient conditions on the number of samples n, the maximum node-degree d and the problem size p, as well as other conditions on the model parameters, so that the algorithm recovers all the edges with high probability. Our result guarantees graph selection for samples scaling as n = Omega(d^2 log(p)), in contrast to existing convex-optimization based algorithms that require a sample complexity of \Omega(d^3 log(p)). Further, the greedy algorithm only requires a restricted strong convexity condition which is typically milder than irrepresentability assumptions. We corroborate these results using numerical simulations at the end.

artificial intelligence, greedy algorithm, machine learning, (17 more...)

arXiv.org Machine Learning

1107.3258

Country: North America > United States > Texas (0.14)

Genre: Research Report > New Finding (0.35)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.70)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.68)

Add feedback

A Dirty Model for Multiple Sparse Regression

Jalali, Ali, Ravikumar, Pradeep, Sanghavi, Sujay

arXiv.org Machine LearningJun-28-2011

Sparse linear regression -- finding an unknown vector from linear measurements -- is now known to be possible with fewer samples than variables, via methods like the LASSO. We consider the multiple sparse linear regression problem, where several related vectors -- with partially shared support sets -- have to be recovered. A natural question in this setting is whether one can use the sharing to further decrease the overall number of samples required. A line of recent research has studied the use of \ell_1/\ell_q norm block-regularizations with q>1 for such problems; however these could actually perform worse in sample complexity -- vis a vis solving each problem separately ignoring sharing -- depending on the level of sharing. We present a new method for multiple sparse linear regression that can leverage support and parameter overlap when it exists, but not pay a penalty when it does not. A very simple idea: we decompose the parameters into two components and regularize these differently. We show both theoretically and empirically, our method strictly and noticeably outperforms both \ell_1 or \ell_1/\ell_q methods, over the entire range of possible overlaps (except at boundary cases, where we match the best method). We also provide theoretical guarantees that the method performs well under high-dimensional scaling.

artificial intelligence, machine learning, survey article, (18 more...)

arXiv.org Machine Learning

1106.5826

Country: North America > United States > Texas > Travis County > Austin (0.14)

Genre: Research Report (0.64)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (1.00)

Add feedback

Error-Correcting Tournaments

Beygelzimer, Alina, Langford, John, Ravikumar, Pradeep

arXiv.org Artificial IntelligenceFeb-3-2010

We present a family of pairwise tournaments reducing $k$-class classification to binary classification. These reductions are provably robust against a constant fraction of binary errors. The results improve on the PECOC construction \cite{SECOC} with an exponential improvement in computation, from $O(k)$ to $O(\log_2 k)$, and the removal of a square root in the regret dependence, matching the best possible computation and regret up to a constant.

artificial intelligence, machine learning, tournament, (17 more...)

arXiv.org Artificial Intelligence

0902.3176

Country: North America > United States > Texas > Travis County > Austin (0.14)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.46)

Add feedback

High-dimensional covariance estimation by minimizing $\ell_1$-penalized log-determinant divergence

Ravikumar, Pradeep, Wainwright, Martin J., Raskutti, Garvesh, Yu, Bin

arXiv.org Machine LearningNov-21-2008

Given i.i.d. observations of a random vector $X \in \mathbb{R}^p$, we study the problem of estimating both its covariance matrix $\Sigma^*$, and its inverse covariance or concentration matrix {$\Theta^* = (\Sigma^*)^{-1}$.} We estimate $\Theta^*$ by minimizing an $\ell_1$-penalized log-determinant Bregman divergence; in the multivariate Gaussian case, this approach corresponds to $\ell_1$-penalized maximum likelihood, and the structure of $\Theta^*$ is specified by the graph of an associated Gaussian Markov random field. We analyze the performance of this estimator under high-dimensional scaling, in which the number of nodes in the graph $p$, the number of edges $s$ and the maximum node degree $d$, are allowed to grow as a function of the sample size $n$. In addition to the parameters $(p,s,d)$, our analysis identifies other key quantities covariance matrix $\Sigma^*$; and (b) the $\ell_\infty$ operator norm of the sub-matrix $\Gamma^*_{S S}$, where $S$ indexes the graph edges, and $\Gamma^* = (\Theta^*)^{-1} \otimes (\Theta^*)^{-1}$; and (c) a mutual incoherence or irrepresentability measure on the matrix $\Gamma^*$ and (d) the rate of decay $1/f(n,\delta)$ on the probabilities $ \{|\hat{\Sigma}^n_{ij}- \Sigma^*_{ij}| > \delta \}$, where $\hat{\Sigma}^n$ is the sample covariance based on $n$ samples. Our first result establishes consistency of our estimate $\hat{\Theta}$ in the elementwise maximum-norm. This in turn allows us to derive convergence rates in Frobenius and spectral norms, with improvements upon existing results for graphs with maximum node degrees $d = o(\sqrt{s})$. In our second result, we show that with probability converging to one, the estimate $\hat{\Theta}$ correctly specifies the zero pattern of the concentration matrix $\Theta^*$.

artificial intelligence, machine learning, matrix, (19 more...)

arXiv.org Machine Learning

0811.3628

Country:

Europe (0.67)
North America > United States > California > Alameda County > Berkeley (0.14)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models (0.54)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.48)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.34)

Add feedback