AITopics | Drton, Mathias

Generalized Score Matching for General Domains

Yu, Shiqing, Drton, Mathias, Shojaie, Ali

arXiv.org Machine Learning, Sep-23-2020

Estimation of density functions supported on general domains arises when the data is naturally restricted to a proper subset of the real space. This problem is complicated by typically intractable normalizing constants. Score matching provides a powerful tool for estimating densities with such intractable normalizing constants, but as originally proposed is limited to densities on $\mathbb{R}^m$ and $\mathbb{R}_+^m$. In this paper, we offer a natural generalization of score matching that accommodates densities supported on a very general class of domains. We apply the framework to truncated graphical and pairwise interaction models, and provide theoretical guarantees for the resulting estimators. We also generalize a recently proposed method from bounded to unbounded domains, and empirically demonstrate the advantages of our method.

health & medicine, oncology, truncation quantile, (16 more...)

arXiv.org Machine Learning

2009.11428

Country: North America > United States > Washington > King County > Seattle (0.14)

Genre: Research Report (0.63)

Industry: Health & Medicine (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (0.92)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.67)

Structure Learning for Cyclic Linear Causal Models

Améndola, Carlos, Dettling, Philipp, Drton, Mathias, Onori, Federica, Wu, Jun

arXiv.org Machine Learning, Aug-19-2020

Inferring the structure of a causal model with feedback loops from observational data is a notoriously difficult, if not impossible, problem, particularly if one also seeks to guard against the presence of latent confounders [9, 29]. We consider this problem for linear causal models given by mixed graphs (or path diagrams) with directed and bidirected edges. As detailed in Section 2, the vertices of such a graph correspond to the observed variables, and the directed edges encode structural equations that relate these variables up to stochastic noise. The bidirected edges indicate possible correlations among the noise terms, as may be induced by latent confounders. Much work has gone into algorithms that exploit conditional independence relations for learning the structure of causal models, or rather suitable equivalence classes of graphs encoding this structure; see, e.g., [10, 17, 18, 24, 25] or also the review of Spirtes and Zhang in [21, §18]. While methods have been developed that use information about conditional independence relations also in settings with feedback loops or latent variables, there is an inherent limitation to this approach, as causal models with feedback loops or latent variables generally cannot be characterized using conditional independence constraints alone [8, 27, 31, 32]. Alternatively, structure learning can be approached using score-based search techniques; see, e.g., [3, 28, 30].

artificial intelligence, graph, machine learning, (17 more...)

arXiv.org Machine Learning

2006.05978

Country: Europe > Germany (0.28)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Diagnosis (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.89)

Algebraic tests of general Gaussian latent tree models

Leung, Dennis, Drton, Mathias

Neural Information Processing Systems, Dec-31-2018

We consider general Gaussian latent tree models in which the observed variables are not restricted to be leaves of the tree. Extending related recent work, we give a full semi-algebraic description of the set of covariance matrices of any such model. In other words, we find polynomial constraints that characterize when a matrix is the covariance matrix of a distribution in a given latent tree model. However, leveraging these constraints to test a given such model is often complicated by the number of constraints being large and by singularities of individual polynomials, which may invalidate standard approximations to relevant probability distributions. Illustrating with the star tree, we propose a new testing methodology that circumvents singularity issues by trading off some statistical estimation efficiency and handles cases with many constraints through recent advances on Gaussian approximation for maxima of sums of high-dimensional random vectors. Our test avoids the need to maximize the possibly multimodal likelihood function of such models and is applicable to models with a larger number of variables. These points are illustrated in numerical experiments.

constraint, decision tree learning, survey article, (20 more...)

Neural Information Processing Systems

Country: North America > United States > California (0.14)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (0.92)

Generalized Score Matching for Non-Negative Data

Yu, Shiqing, Drton, Mathias, Shojaie, Ali

arXiv.org Machine Learning, Dec-26-2018

A common challenge in estimating parameters of probability density functions is the intractability of the normalizing constant. While in such cases maximum likelihood estimation may be implemented using numerical integration, the approach becomes computationally intensive. The score matching method of Hyvärinen [2005] avoids direct calculation of the normalizing constant and yields closed-form estimates for exponential families of continuous distributions over $\mathbb{R}^m$. Hyvärinen [2007] extended the approach to distributions supported on the non-negative orthant, $\mathbb{R}_+^m$. In this paper, we give a generalized form of score matching for non-negative data that improves estimation efficiency. As an example, we consider a general class of pairwise interaction models. Addressing an overlooked inexistence problem, we generalize the regularized score matching method of Lin et al. [2016] and improve its theoretical guarantees for non-negative Gaussian graphical models.
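
To illustrate the closed-form estimates mentioned in this abstract, here is a minimal sketch of the original (unconstrained) score matching objective applied to a multivariate Gaussian on $\mathbb{R}^m$. It is not the generalized estimator proposed in the paper. The function names are hypothetical and NumPy is assumed. For a Gaussian with mean $\mu$ and precision matrix $K$, the empirical loss is the sample mean of $\tfrac{1}{2}\|K(x-\mu)\|^2$ minus $\mathrm{tr}(K)$, which is minimized by the sample mean and the inverse of the sample covariance.

```python
import numpy as np


def score_matching_objective(x, mu, precision):
    """Empirical score matching loss for a Gaussian with the given mean and precision.

    With log p(x) = -0.5 (x - mu)^T K (x - mu) + const, the score is -K (x - mu)
    and the Laplacian of log p is -trace(K), so the loss is
    mean(0.5 * ||K (x - mu)||^2) - trace(K). No normalizing constant is needed.
    """
    centered = x - mu
    # Rows equal (K (x_i - mu))^T because K is symmetric; the sign of the
    # score is dropped since only its squared norm enters the loss.
    scaled = centered @ precision
    return 0.5 * np.mean(np.sum(scaled**2, axis=1)) - np.trace(precision)


def score_matching_gaussian(x):
    """Closed-form minimizer: sample mean and inverse sample covariance."""
    mu = x.mean(axis=0)
    centered = x - mu
    cov = centered.T @ centered / x.shape[0]
    return mu, np.linalg.inv(cov)


# Simulated data (illustrative values only).
rng = np.random.default_rng(0)
true_cov = np.array([[2.0, 0.6], [0.6, 1.0]])
samples = rng.multivariate_normal(mean=[1.0, -1.0], cov=true_cov, size=5000)
mu_hat, precision_hat = score_matching_gaussian(samples)
```

In this Gaussian case the score matching estimate coincides with the maximum likelihood estimate; the appeal of the method lies in models where the normalizing constant is not available.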

artificial intelligence, estimator, machine learning, (15 more...)

arXiv.org Machine Learning

1812.10551

Country: North America > United States (0.67)

Genre: Research Report > New Finding (0.45)

Structure Learning in Graphical Modeling

Drton, Mathias, Maathuis, Marloes H.

arXiv.org Machine Learning, Jun-7-2016

A graphical model is a statistical model that is associated to a graph whose nodes correspond to variables of interest. The edges of the graph reflect allowed conditional dependencies among the variables. Graphical models admit computationally convenient factorization properties and have long been a valuable tool for tractable modeling of multivariate distributions. More recently, applications such as reconstructing gene regulatory networks from gene expression data have driven major advances in structure learning, that is, estimating the graph underlying a model. We review some of these advances and discuss methods such as the graphical lasso and neighborhood selection for undirected graphical models (or Markov random fields), and the PC algorithm and score-based search methods for directed graphical models (or Bayesian networks). We further review extensions that account for effects of latent variables and heterogeneous data sources.

bayesian inference, graph, survey article, (18 more...)

arXiv.org Machine Learning

1606.02359

Country:

North America > United States > Oregon > Benton County > Corvallis (0.14)
Europe > Switzerland > Zürich > Zürich (0.14)

Genre: Research Report (0.82)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (0.48)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.88)

Marginal likelihood and model selection for Gaussian latent tree and forest models

Drton, Mathias, Lin, Shaowei, Weihs, Luca, Zwiernik, Piotr

arXiv.org Machine Learning, Dec-22-2015

Gaussian latent tree models, or more generally, Gaussian latent forest models have Fisher-information matrices that become singular along interesting submodels, namely, models that correspond to subforests. For these singularities, we compute the real log-canonical thresholds (also known as stochastic complexities or learning coefficients) that quantify the large-sample behavior of the marginal likelihood in Bayesian inference. This provides the information needed for a recently introduced generalization of the Bayesian information criterion. Our mathematical developments treat the general setting of Laplace integrals whose phase functions are sums of squared differences between monomials and constants. We clarify how in this case real log-canonical thresholds can be computed using polyhedral geometry, and we show how to apply the general theory to the Laplace integrals associated with Gaussian latent tree and forest models. In simulations and a data example, we demonstrate how the mathematical knowledge can be applied in model selection.

bayesian inference, gaussian latent tree, us government, (17 more...)

arXiv.org Machine Learning

1412.8285

Country:

North America > United States > Washington > King County > Seattle (0.14)
Europe > United Kingdom > England (0.14)

Genre: Research Report (0.50)

Industry: Government > Regional Government > North America Government > United States Government (0.67)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Order-invariant prior specification in Bayesian factor analysis

Leung, Dennis, Drton, Mathias

arXiv.org Machine Learning, Sep-26-2014

In (exploratory) factor analysis, the loading matrix is identified only up to orthogonal rotation. For identifiability, one thus often takes the loading matrix to be lower triangular with positive diagonal entries. In Bayesian inference, a standard practice is then to specify a prior under which the loadings are independent, the off-diagonal loadings are normally distributed, and the diagonal loadings follow a truncated normal distribution. This prior specification, however, depends in an important way on how the variables and associated rows of the loading matrix are ordered. We show how a minor modification of the approach allows one to compute with the identifiable lower triangular loading matrix but maintain invariance properties under reordering of the variables.
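
The identifiability issue described in this abstract can be checked numerically. The sketch below is an illustration under stated assumptions, not the authors' method: it shows that rotating a loading matrix leaves the implied covariance term unchanged, and that a QR decomposition recovers a lower triangular representative with positive diagonal. The function name `to_lower_triangular` is hypothetical and NumPy is assumed.

```python
import numpy as np


def to_lower_triangular(loadings):
    """Rotate a p x k loading matrix (p >= k) into lower triangular form with
    non-negative diagonal, leaving loadings @ loadings.T unchanged."""
    # QR of the transpose: loadings.T = Q R, hence loadings = R.T @ Q.T,
    # where R.T is lower triangular and Q.T is orthogonal.
    _, r = np.linalg.qr(loadings.T)
    signs = np.sign(np.diag(r))
    signs[signs == 0] = 1.0
    # Flipping the signs of columns is itself an orthogonal transformation.
    return r.T * signs


rng = np.random.default_rng(1)
loadings = rng.normal(size=(5, 2))

# A random orthogonal matrix, obtained from the QR decomposition of a Gaussian matrix.
rotation, _ = np.linalg.qr(rng.normal(size=(2, 2)))
rotated = loadings @ rotation

# Both loading matrices imply the same low-rank part of the covariance matrix.
same_covariance = np.allclose(loadings @ loadings.T, rotated @ rotated.T)

# Both reduce to the same lower triangular representative.
canonical_a = to_lower_triangular(loadings)
canonical_b = to_lower_triangular(rotated)
```

The representative depends on the order of the rows, that is, of the variables, which is the source of the order dependence the abstract refers to.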

artificial intelligence, bayesian inference, factor analysis, (16 more...)

arXiv.org Machine Learning

1409.7672

Country: North America > United States > Washington > King County > Seattle (0.15)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.89)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.47)

Robust Graphical Modeling with t-Distributions

Finegold, Michael A., Drton, Mathias

arXiv.org Machine Learning, Aug-9-2014

Graphical Gaussian models have proven to be useful tools for exploring network structures based on multivariate data. Applications to studies of gene expression have generated substantial interest in these models, and resulting recent progress includes the development of fitting methodology involving penalization of the likelihood function. In this paper we advocate the use of the multivariate t and related distributions for more robust inference of graphs. In particular, we demonstrate that penalized likelihood inference combined with an application of the EM algorithm provides a simple and computationally efficient approach to model selection in the t-distribution case.

health & medicine, survey article, tlasso, (21 more...)

arXiv.org Machine Learning

1408.2033

Country: North America > United States (0.14)

Genre: Research Report (0.64)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (0.67)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.77)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)

Nonparametric Reduced Rank Regression

Foygel, Rina, Horrell, Michael, Drton, Mathias, Lafferty, John

arXiv.org Machine Learning, Jan-9-2013

We propose an approach to multivariate nonparametric regression that generalizes reduced rank regression for linear models. An additive model is estimated for each dimension of a $q$-dimensional response, with a shared $p$-dimensional predictor variable. To control the complexity of the model, we employ a functional form of the Ky-Fan or nuclear norm, resulting in a set of function estimates that have low rank. Backfitting algorithms are derived and justified using a nonparametric form of the nuclear norm subdifferential. Oracle inequalities on excess risk are derived that exhibit the scaling behavior of the procedure in the high dimensional setting. The methods are illustrated on gene expression data.

artificial intelligence, health & medicine, penalty, (17 more...)

arXiv.org Machine Learning

1301.1919

Genre: Research Report (0.40)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (0.34)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.48)
