
Collaborating Authors

 Jordan, Michael


Saturating Splines and Feature Selection

arXiv.org Machine Learning

We extend the adaptive regression spline model by incorporating saturation, the natural requirement that a function extend as a constant outside a certain range. We fit saturating splines to data using a convex optimization problem over a space of measures, which we solve using an efficient algorithm based on the conditional gradient method. Unlike many existing approaches, our algorithm solves the original infinite-dimensional (for splines of degree at least two) optimization problem without pre-specified knot locations. We then adapt our algorithm to fit generalized additive models with saturating splines as coordinate functions and show that the saturation requirement allows our model to simultaneously perform feature selection and nonlinear function fitting. Finally, we briefly sketch how the method can be extended to higher order splines and to different requirements on the extension outside the data range.
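As a rough illustration of the conditional-gradient idea, and not the paper's measure-space formulation, saturation constraint, or stopping rule, the Python sketch below greedily adds the hinge-basis knot most correlated with the current residual and then refits all coefficients by least squares; all function and variable names here are our own.

    import numpy as np

    def fit_adaptive_spline(x, y, n_iters=10):
        """Greedy, conditional-gradient-style knot selection for a degree-one
        (hinge) spline fit. Illustrative only: the paper optimizes over a space
        of measures and enforces saturation outside the data range."""
        x = np.asarray(x, float)
        y = np.asarray(y, float)
        knots, basis = [], [np.ones_like(x)]          # start with a constant term
        for _ in range(n_iters):
            A = np.column_stack(basis)
            coef, *_ = np.linalg.lstsq(A, y, rcond=None)
            resid = y - A @ coef
            # pick the hinge basis function most correlated with the residual
            scores = [abs(np.maximum(x - t, 0.0) @ resid) for t in x]
            t_new = x[int(np.argmax(scores))]
            knots.append(t_new)
            basis.append(np.maximum(x - t_new, 0.0))
        A = np.column_stack(basis)
        coef, *_ = np.linalg.lstsq(A, y, rcond=None)  # fully corrective refit
        return knots, coef

    # toy usage
    rng = np.random.default_rng(0)
    x = np.sort(rng.uniform(-3, 3, 200))
    y = np.sin(x) + 0.1 * rng.standard_normal(200)
    knots, coef = fit_adaptive_spline(x, y)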


A deep generative model for single-cell RNA sequencing with application to detecting differentially expressed genes

arXiv.org Machine Learning

We propose a probabilistic model for interpreting gene expression levels that are observed through single-cell RNA sequencing. In the model, each cell has a low-dimensional latent representation. Additional latent variables account for technical effects that may erroneously set some observations of gene expression levels to zero. Conditional distributions are specified by neural networks, giving the proposed model enough flexibility to fit the data well. We use variational inference and stochastic optimization to approximate the posterior distribution. The inference procedure scales to over one million cells, whereas competing algorithms do not. Even for smaller datasets, for several tasks, the proposed procedure outperforms state-of-the-art methods like ZIFA and ZINB-WaVE. We also extend our framework to take into account batch effects and other confounding factors, and we propose a natural Bayesian hypothesis framework for differential expression that outperforms the traditional method DESeq2.
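The kind of zero-inflated count generative process such a model posits can be sketched schematically in Python as below; the "decoder" here is an untrained random network, the dispersion value is arbitrary, and the variational inference machinery is omitted entirely.

    import numpy as np

    # Schematic zero-inflated negative-binomial generative process (illustrative only).
    rng = np.random.default_rng(1)
    n_cells, n_genes, d_latent = 100, 50, 10

    z = rng.standard_normal((n_cells, d_latent))           # latent cell representation

    # stand-in "decoder": an untrained two-layer network mapping z to gene-level parameters
    W1 = rng.standard_normal((d_latent, 32)) / np.sqrt(d_latent)
    W2 = rng.standard_normal((32, n_genes)) / np.sqrt(32)
    W3 = rng.standard_normal((32, n_genes)) / np.sqrt(32)
    h = np.tanh(z @ W1)
    rate = np.exp(h @ W2)                                   # expected expression level
    dropout = 1.0 / (1.0 + np.exp(-h @ W3))                 # probability of a technical zero

    theta = 5.0                                             # arbitrary NB dispersion
    counts = rng.poisson(rng.gamma(theta, rate / theta))    # gamma-Poisson = negative binomial
    counts[rng.random(counts.shape) < dropout] = 0          # zero inflation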


Local Maxima in the Likelihood of Gaussian Mixture Models: Structural Results and Algorithmic Consequences

arXiv.org Machine Learning

We provide two fundamental results on the population (infinite-sample) likelihood function of Gaussian mixture models with $M \geq 3$ components. Our first main result shows that the population likelihood function has bad local maxima even in the special case of equally-weighted mixtures of well-separated and spherical Gaussians. We prove that the log-likelihood value of these bad local maxima can be arbitrarily worse than that of any global optimum, thereby resolving an open question of Srebro (2007). Our second main result shows that the EM algorithm (or a first-order variant of it) with random initialization will converge to bad critical points with probability at least $1-e^{-\Omega(M)}$. We further establish that a first-order variant of EM will not converge to strict saddle points almost surely, indicating that the poor performance of the first-order method can be attributed to the existence of bad local maxima rather than bad saddle points. Overall, our results highlight the necessity of careful initialization when using the EM algorithm in practice, even when applied in highly favorable settings.
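To see how random initialization can land EM at different critical points, here is a minimal, self-contained EM loop for an equally weighted, unit-variance spherical GMM; this is our own finite-sample simplification, whereas the paper's analysis is at the population level.

    import numpy as np

    def em_spherical_gmm(X, M, n_iters=200, seed=0):
        """Plain EM for an equally weighted, unit-variance spherical GMM,
        estimating only the means. Random initialization from the data."""
        rng = np.random.default_rng(seed)
        n, d = X.shape
        mu = X[rng.choice(n, M, replace=False)]
        for _ in range(n_iters):
            # E-step: responsibilities under unit-variance Gaussians, equal weights
            d2 = ((X[:, None, :] - mu[None, :, :]) ** 2).sum(-1)
            logp = -0.5 * d2
            logp -= logp.max(axis=1, keepdims=True)
            r = np.exp(logp)
            r /= r.sum(axis=1, keepdims=True)
            # M-step: responsibility-weighted means
            mu = (r[:, :, None] * X[:, None, :]).sum(0) / r.sum(0)[:, None]
        # average log-likelihood, up to an additive constant
        d2 = ((X[:, None, :] - mu[None, :, :]) ** 2).sum(-1)
        ll = np.log(np.exp(-0.5 * d2).mean(axis=1) + 1e-300).mean()
        return mu, ll

    # well-separated, equally weighted spherical mixture with M = 3 components
    rng = np.random.default_rng(0)
    centers = np.array([[0.0, 0.0], [10.0, 0.0], [0.0, 10.0]])
    X = np.vstack([c + rng.standard_normal((200, 2)) for c in centers])

    # different random initializations can terminate at different log-likelihood values
    lls = [em_spherical_gmm(X, 3, seed=s)[1] for s in range(10)]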


Fast robustness quantification with variational Bayes

arXiv.org Machine Learning

Bayesian hierarchical models are increasingly popular in economics. When using hierarchical models, it is useful not only to calculate posterior expectations, but also to measure the robustness of these expectations to reasonable alternative prior choices. We use variational Bayes and linear response methods to provide fast, accurate posterior means and robustness measures, with an application to measuring the effectiveness of microcredit in the developing world.
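The underlying robustness question can be posed on a toy conjugate model: how much does a posterior expectation move when a prior hyperparameter is perturbed? The finite-difference check below is only a stand-in; the paper obtains such sensitivities from a single variational fit via linear response, without refitting.

    import numpy as np

    def posterior_mean(y, prior_mean, prior_var, noise_var=1.0):
        """Conjugate normal-normal model: closed-form posterior mean of mu."""
        n = len(y)
        post_var = 1.0 / (1.0 / prior_var + n / noise_var)
        return post_var * (prior_mean / prior_var + y.sum() / noise_var)

    rng = np.random.default_rng(0)
    y = 2.0 + rng.standard_normal(50)

    # local robustness: derivative of the posterior mean w.r.t. the prior mean,
    # here computed by finite differences on the closed-form answer
    eps = 1e-4
    sens = (posterior_mean(y, eps, 1.0) - posterior_mean(y, -eps, 1.0)) / (2 * eps)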


Linear Response Methods for Accurate Covariance Estimates from Mean Field Variational Bayes

arXiv.org Machine Learning

Mean field variational Bayes (MFVB) is a popular posterior approximation method due to its fast runtime on large-scale data sets. However, it is well known that a major failing of MFVB is that it underestimates the uncertainty of model variables (sometimes severely) and provides no information about model variable covariance. We generalize linear response methods from statistical physics to deliver accurate uncertainty estimates for model variables---both for individual variables and coherently across variables. We call our method linear response variational Bayes (LRVB). When the MFVB posterior approximation is in the exponential family, LRVB has a simple, analytic form, even for non-conjugate models. Indeed, we make no assumptions about the form of the true posterior. We demonstrate the accuracy and scalability of our method on a range of models for both simulated and real data.
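A small numerical illustration of the linear-response idea, on a bivariate Gaussian target where everything can be checked exactly: MFVB recovers the means but underestimates the marginal variances, while differentiating the MFVB means with respect to a linear perturbation of the log target recovers the full covariance. The finite differences below are for exposition only; LRVB computes this sensitivity analytically from one fit.

    import numpy as np

    def mfvb_gaussian(Lam, b, n_iters=500):
        """Mean-field coordinate ascent for a Gaussian target with precision Lam
        and linear term b, i.e. log p(x) = -0.5 x'Lam x + b'x + const.
        Returns the factorized means and variances."""
        d = Lam.shape[0]
        m = np.zeros(d)
        for _ in range(n_iters):
            for i in range(d):
                m[i] = (b[i] - Lam[i] @ m + Lam[i, i] * m[i]) / Lam[i, i]
        v = 1.0 / np.diag(Lam)                   # MFVB marginal variances
        return m, v

    Lam = np.array([[1.0, 0.9], [0.9, 1.0]]) * 2.0   # strongly correlated target
    b = np.zeros(2)
    m, v = mfvb_gaussian(Lam, b)
    print(v, np.diag(np.linalg.inv(Lam)))            # MFVB underestimates the variances

    # linear response: perturb the log target by t * x_j and differentiate the
    # MFVB mean numerically; d E[x] / d t_j recovers the column Cov(x, x_j)
    eps = 1e-5
    Sigma_lr = np.column_stack([
        (mfvb_gaussian(Lam, eps * np.eye(2)[j])[0]
         - mfvb_gaussian(Lam, -eps * np.eye(2)[j])[0]) / (2 * eps)
        for j in range(2)
    ])
    print(Sigma_lr, np.linalg.inv(Lam))              # matches the exact covariance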


Nonparametric Link Prediction in Large Scale Dynamic Networks

arXiv.org Machine Learning

We propose a nonparametric approach to link prediction in large-scale dynamic networks. Our model uses graph-based features of pairs of nodes as well as those of their local neighborhoods to predict whether those nodes will be linked at each time step. The model allows for different types of evolution in different parts of the graph (e.g., growing or shrinking communities). We focus on large-scale graphs and present an implementation of our model that makes use of locality-sensitive hashing to allow it to be scaled to large problems. Experiments with simulated data as well as five real-world dynamic graphs show that we outperform the state of the art, especially when sharp fluctuations or nonlinearities are present. We also establish theoretical properties of our estimator, in particular consistency and weak convergence, the latter making use of an elaboration of Stein's method for dependency graphs.
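As a sketch of the scaling ingredient, the snippet below uses signed-random-projection (cosine) LSH to bucket node-pair feature vectors so that only pairs with similar histories need to be compared; the feature construction, bit width, and function names are placeholders, not the paper's.

    import numpy as np
    from collections import defaultdict

    def lsh_buckets(features, n_bits=16, seed=0):
        """Signed-random-projection LSH: hash each feature vector to an n_bits
        binary signature so that vectors with high cosine similarity tend to
        share a bucket."""
        rng = np.random.default_rng(seed)
        planes = rng.standard_normal((features.shape[1], n_bits))
        signs = (features @ planes) > 0
        keys = np.packbits(signs, axis=1)
        buckets = defaultdict(list)
        for idx, key in enumerate(map(bytes, keys)):
            buckets[key].append(idx)
        return buckets

    # toy usage: bucket 10,000 node-pair feature vectors; similar histories tend to collide
    rng = np.random.default_rng(1)
    pair_features = rng.standard_normal((10_000, 32))
    buckets = lsh_buckets(pair_features)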


Nonparametric Link Prediction in Dynamic Networks

arXiv.org Machine Learning

We propose a non-parametric link prediction algorithm for a sequence of graph snapshots over time. The model predicts each link based on the features of its endpoints, as well as those of the local neighborhood around the endpoints. This allows for different types of neighborhoods in a graph, each with its own dynamics (e.g., growing or shrinking communities). We prove the consistency of our estimator, and give a fast implementation based on locality-sensitive hashing. Experiments with simulated as well as five real-world dynamic graphs show that we outperform the state of the art, especially when sharp fluctuations or non-linearities are present.
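As a stand-in for the nonparametric estimator itself (not the authors' exact weighting scheme), a kernel-smoothed link score can be written as a similarity-weighted average of the outcomes of historically similar pairs:

    import numpy as np

    def predict_link(query_feat, past_feats, past_outcomes, bandwidth=1.0):
        """Kernel-smoothed link score: a similarity-weighted average of the
        outcomes (link / no link at the next step) of historically similar
        node pairs. Illustrative stand-in for the paper's estimator."""
        d2 = ((past_feats - query_feat) ** 2).sum(axis=1)
        w = np.exp(-d2 / (2 * bandwidth ** 2))
        return float(w @ past_outcomes / (w.sum() + 1e-12))

    # toy usage: features of past node pairs and whether they linked at the next step
    rng = np.random.default_rng(2)
    past_feats = rng.standard_normal((500, 8))
    past_outcomes = (past_feats[:, 0] > 0).astype(float)    # synthetic labels
    score = predict_link(rng.standard_normal(8), past_feats, past_outcomes)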


The Big Data Bootstrap

arXiv.org Machine Learning

The bootstrap provides a simple and powerful means of assessing the quality of estimators. However, in settings involving large datasets, the computation of bootstrap-based quantities can be prohibitively demanding. As an alternative, we present the Bag of Little Bootstraps (BLB), a new procedure which incorporates features of both the bootstrap and subsampling to obtain a robust, computationally efficient means of assessing estimator quality. BLB is well suited to modern parallel and distributed computing architectures and retains the generic applicability, statistical efficiency, and favorable theoretical properties of the bootstrap. We provide the results of an extensive empirical and theoretical investigation of BLB's behavior, including a study of its statistical correctness, its large-scale implementation and performance, selection of hyperparameters, and performance on real data.
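A minimal sketch of the BLB procedure for a standard-error estimate follows, assuming only that the estimator of interest accepts observation weights; the subsample count s, exponent gamma, and resample count r are illustrative defaults, not prescriptions from the paper.

    import numpy as np

    def blb_stderr(data, estimator, s=20, gamma=0.7, r=100, seed=0):
        """Bag of Little Bootstraps: draw s subsamples of size n**gamma, on each
        one simulate r full-size bootstrap resamples via multinomial counts,
        then average the per-subsample standard-error estimates."""
        rng = np.random.default_rng(seed)
        n = len(data)
        b = int(n ** gamma)
        per_subsample = []
        for _ in range(s):
            sub = data[rng.choice(n, size=b, replace=False)]
            stats = []
            for _ in range(r):
                counts = rng.multinomial(n, np.full(b, 1.0 / b))   # weights summing to n
                stats.append(estimator(sub, counts))
            per_subsample.append(np.std(stats))
        return float(np.mean(per_subsample))

    # toy usage: standard error of the mean, via a weighted-mean estimator
    weighted_mean = lambda x, w: np.average(x, weights=w)
    data = np.random.default_rng(1).standard_normal(100_000)
    print(blb_stderr(data, weighted_mean))    # close to 1 / sqrt(n)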


Variational Bayesian Inference with Stochastic Search

arXiv.org Machine Learning

Mean-field variational inference is a method for approximate Bayesian posterior inference. It approximates a full posterior distribution with a factorized set of distributions by maximizing a lower bound on the marginal likelihood. This requires the ability to integrate a sum of terms in the log joint likelihood using this factorized distribution. Often not all integrals are in closed form, which is typically handled by using a lower bound. We present an alternative algorithm based on stochastic optimization that allows for direct optimization of the variational lower bound. This method uses control variates to reduce the variance of the stochastic search gradient, in which existing lower bounds can play an important role. We demonstrate the approach on two non-conjugate models: logistic regression and an approximation to the HDP.
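The variance-reduction idea can be seen in a toy score-function (REINFORCE-style) gradient estimator with a constant baseline as the control variate; the paper's control variates are built from tractable terms of the bound rather than a plain baseline, so the sketch below is illustrative only.

    import numpy as np

    rng = np.random.default_rng(0)
    mu, sigma = 1.0, 1.0
    f = lambda z: z ** 2 + 3.0             # stand-in for an intractable integrand
    # true gradient of E[f(z)] w.r.t. mu for z ~ N(mu, sigma^2) is 2 * mu = 2.0

    def score_grad(n_samples, baseline):
        z = mu + sigma * rng.standard_normal(n_samples)
        score = (z - mu) / sigma ** 2                    # grad of log q(z; mu) w.r.t. mu
        return (f(z) - baseline) * score                 # unbiased since E[score] = 0

    naive = score_grad(10_000, baseline=0.0)
    base = np.mean(f(mu + sigma * rng.standard_normal(10_000)))   # constant baseline
    cv = score_grad(10_000, baseline=base)
    print(naive.mean(), naive.std())   # unbiased but high variance
    print(cv.mean(), cv.std())         # same mean, much lower variance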


An Analysis of the Convergence of Graph Laplacians

arXiv.org Machine Learning

Existing approaches to analyzing the asymptotics of graph Laplacians typically assume a well-behaved kernel function with smoothness assumptions. We remove the smoothness assumption and generalize the analysis of graph Laplacians to include previously unstudied graphs, including kNN graphs. We also introduce a kernel-free framework to analyze graph constructions with shrinking neighborhoods in general, and apply it to analyze locally linear embedding (LLE). We also describe how, for a given limiting Laplacian operator, desirable properties such as a convergent spectrum and sparseness can be achieved by choosing the appropriate graph construction.
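For concreteness, a symmetrized kNN graph Laplacian of the sort covered by the kernel-free analysis can be built as follows; the manifold, neighborhood size, and (unnormalized) Laplacian are arbitrary illustrative choices.

    import numpy as np

    def knn_laplacian(X, k=10):
        """Unnormalized graph Laplacian of a symmetrized kNN graph,
        illustrative of the graph constructions the analysis covers."""
        d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
        np.fill_diagonal(d2, np.inf)
        knn = np.argsort(d2, axis=1)[:, :k]
        n = len(X)
        W = np.zeros((n, n))
        rows = np.repeat(np.arange(n), k)
        W[rows, knn.ravel()] = 1.0
        W = np.maximum(W, W.T)              # i ~ j if either is a kNN of the other
        return np.diag(W.sum(1)) - W

    # points sampled near a manifold (here, a noisy circle in the plane)
    rng = np.random.default_rng(0)
    t = rng.uniform(0, 2 * np.pi, 400)
    X = np.column_stack([np.cos(t), np.sin(t)]) + 0.01 * rng.standard_normal((400, 2))
    L = knn_laplacian(X)
    eigvals = np.linalg.eigvalsh(L)         # spectrum whose scaled limit is of interest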