AITopics | Learning Graphical Models

Collaborating Authors

Learning Graphical Models

A graphical model or probabilistic graphical model (PGM) or structured probabilistic model is a probabilistic model for which a graph expresses the conditional dependence structure between random variables. They are commonly used in probability theory, statistics—particularly Bayesian statistics—and machine learning. (Wikipedia)

News Overviews Instructional Materials AI-Alerts Classics

Concave Penalized Estimation of Sparse Gaussian Bayesian Networks

Aragam, Bryon, Zhou, Qing

arXiv.org Machine LearningJan-4-2015

We develop a penalized likelihood estimation framework to estimate the structure of Gaussian Bayesian networks from observational data. In contrast to recent methods which accelerate the learning problem by restricting the search space, our main contribution is a fast algorithm for score-based structure learning which does not restrict the search space in any way and works on high-dimensional datasets with thousands of variables. Our use of concave regularization, as opposed to the more popular $\ell_0$ (e.g. BIC) penalty, is new. Moreover, we provide theoretical guarantees which generalize existing asymptotic results when the underlying distribution is Gaussian. Most notably, our framework does not require the existence of a so-called faithful DAG representation, and as a result the theory must handle the inherent nonidentifiability of the estimation problem in a novel way. Finally, as a matter of independent interest, we provide a comprehensive comparison of our approach to several standard structure learning methods using open-source packages developed for the R language. Based on these experiments, we show that our algorithm is significantly faster than other competing methods while obtaining higher sensitivity with comparable false discovery rates for high-dimensional data. In particular, the total runtime for our method to generate a solution path of 20 estimates for DAGs with 8000 nodes is around one hour.

artificial intelligence, bayesian inference, machine learning, (17 more...)

arXiv.org Machine Learning

1401.0852

Country:

Europe (0.67)
North America > United States > California > Los Angeles County > Los Angeles (0.27)

Genre: Research Report > New Finding (0.67)

Industry: Health & Medicine (0.67)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Add feedback

The Learnability of Unknown Quantum Measurements

Cheng, Hao-Chung, Hsieh, Min-Hsiu, Yeh, Ping-Cheng

arXiv.org Machine LearningJan-3-2015

Quantum machine learning has received significant attention in recent years, and promising progress has been made in the development of quantum algorithms to speed up traditional machine learning tasks. In this work, however, we focus on investigating the information-theoretic upper bounds of sample complexity - how many training samples are sufficient to predict the future behaviour of an unknown target function. This kind of problem is, arguably, one of the most fundamental problems in statistical learning theory and the bounds for practical settings can be completely characterised by a simple measure of complexity. Our main result in the paper is that, for learning an unknown quantum measurement, the upper bound, given by the fat-shattering dimension, is linearly proportional to the dimension of the underlying Hilbert space. Learning an unknown quantum state becomes a dual problem to ours, and as a byproduct, we can recover Aaronson's famous result [Proc. R. Soc. A 463:3089-3144 (2007)] solely using a classical machine learning technique. In addition, other famous complexity measures like covering numbers and Rademacher complexities are derived explicitly. We are able to connect measures of sample complexity with various areas in quantum information science, e.g. quantum state/measurement tomography, quantum state discrimination and quantum random access codes, which may be of independent interest. Lastly, with the assistance of general Bloch-sphere representation, we show that learning quantum measurements/states can be mathematically formulated as a neural network. Consequently, classical ML algorithms can be applied to efficiently accomplish the two quantum learning tasks.

complexity, fat-shattering dimension, quantum state, (15 more...)

arXiv.org Machine Learning

1501.00559

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
North America > United States > New York (0.04)
Asia > Taiwan (0.04)
(2 more...)

Genre: Research Report > New Finding (0.46)

Industry: Education (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Computational Learning Theory (0.88)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.46)

Add feedback

A Review of Real-Time Strategy Game AI

Robertson, Glen (University of Aukland) | Watson, Ian (University of Auckland)

AI MagazineJan-2-2015

This literature review covers AI techniques used for real-time strategy video games, focusing specifically on StarCraft. It finds that the main areas of current academic research are in tactical and strategic decision-making, plan recognition, and learning, and it outlines the research contributions in each of these areas. The paper then contrasts the use of game AI in academia and industry, finding the academic research heavily focused on creating game-winning agents, while the indus- try aims to maximise player enjoyment. It finds the industry adoption of academic research is low because it is either in- applicable or too time-consuming and risky to implement in a new game, which highlights an area for potential investi- gation: bridging the gap between academia and industry. Fi- nally, the areas of spatial reasoning, multi-scale AI, and co- operation are found to require future work, and standardised evaluation methods are proposed to produce comparable re- sults between studies.

artificial intelligence, machine learning, starcraft, (18 more...)

AI Magazine

Country:

North America > United States > California (0.29)
North America > United States > Massachusetts (0.28)

Genre: Overview (1.00)

Industry:

Leisure & Entertainment > Games > Computer Games (1.00)
Information Technology (1.00)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Case-Based Reasoning (1.00)
(8 more...)

Add feedback

Feature Augmentation via Nonparametrics and Selection (FANS) in High Dimensional Classification

Fan, Jianqing, Feng, Yang, Jiang, Jiancheng, Tong, Xin

arXiv.org Machine LearningJan-2-2015

We propose a high dimensional classification method that involves nonparametric feature augmentation. Knowing that marginal density ratios are the most powerful univariate classifiers, we use the ratio estimates to transform the original feature measurements. Subsequently, penalized logistic regression is invoked, taking as input the newly transformed or augmented features. This procedure trains models equipped with local complexity and global simplicity, thereby avoiding the curse of dimensionality while creating a flexible nonlinear decision boundary. The resulting method is called Feature Augmentation via Nonparametrics and Selection (FANS). We motivate FANS by generalizing the Naive Bayes model, writing the log ratio of joint densities as a linear combination of those of marginal densities. It is related to generalized additive models, but has better interpretability and computability. Risk bounds are developed for FANS. In numerical analysis, FANS is compared with competing methods, so as to provide a guideline on its best application domain. Real data analysis demonstrates that FANS performs very competitively on benchmark email spam and gene expression data sets. Moreover, FANS is implemented by an extremely fast algorithm through parallel computing.

artificial intelligence, logistic regression, machine learning, (13 more...)

arXiv.org Machine Learning

1401.0211

Country: North America > United States > California (0.46)

Genre: Research Report > New Finding (0.38)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (0.49)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.50)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.39)

Add feedback

Causal Inference through a Witness Protection Program

Silva, Ricardo, Evans, Robin

Neural Information Processing SystemsDec-31-2014

One of the most fundamental problems in causal inference is the estimation of a causal effect when variables are confounded. This is difficult in an observational study because one has no direct evidence that all confounders have been adjusted for. We introduce a novel approach for estimating causal effects that exploits observational conditional independencies to suggest ``weak'' paths in a unknown causal graph. The widely used faithfulness condition of Spirtes et al. is relaxed to allow for varying degrees of ``path cancellations'' that will imply conditional independencies but do not rule out the existence of confounding causal paths. The outcome is a posterior distribution over bounds on the average causal effect via a linear programming approach and Bayesian inference. We claim this approach should be used in regular practice to complement other default tools in observational studies.

constraint, faithfulness, figure 1, (15 more...)

Neural Information Processing Systems

Country:

Europe > United Kingdom > England > Oxfordshire > Oxford (0.14)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre: Research Report > New Finding (0.46)

Industry:

Health & Medicine > Therapeutic Area > Immunology (1.00)
Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (0.97)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.93)

Add feedback

Signal Aggregate Constraints in Additive Factorial HMMs, with Application to Energy Disaggregation

Zhong, Mingjun, Goddard, Nigel, Sutton, Charles

Neural Information Processing SystemsDec-31-2014

Blind source separation problems are difficult because they are inherently unidentifiable, yet the entire goal is to identify meaningful sources. We introduce a way of incorporating domain knowledge into this problem, called signal aggregate constraints (SACs). SACs encourage the total signal for each of the unknown sources to be close to a specified value. This is based on the observation that the total signal often varies widely across the unknown sources, and we often have a good idea of what total values to expect. We incorporate SACs into an additive factorial hidden Markov model (AFHMM) to formulate the energy disaggregation problems where only one mixture signal is assumed to be observed. A convex quadratic program for approximate inference is employed for recovering those source signals. On a real-world energy disaggregation data set, we show that the use of SACs dramatically improves the original AFHMM, and significantly improves over a recent state-of-the art approach.

afhmm sac, appliance, constraint, (10 more...)

Neural Information Processing Systems

Country:

Asia > Middle East > Jordan (0.05)
North America > United States > Ohio > Franklin County > Columbus (0.04)
North America > United States > New York > New York County > New York City (0.04)
(4 more...)

Genre: Research Report (0.48)

Industry: Energy (0.66)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (1.00)

Add feedback

Learning Time-Varying Coverage Functions

Du, Nan, Liang, Yingyu, Balcan, Maria-Florina F., Song, Le

Neural Information Processing SystemsDec-31-2014

Coverage functions are an important class of discrete functions that capture the law of diminishing returns arising naturally from applications in social network analysis, machine learning, and algorithmic game theory. In this paper, we propose anew problem of learning time-varying coverage functions, and develop a novel parametrization of these functions using random features. Based on the connection betweentime-varying coverage functions and counting processes, we also propose an efficient parameter learning algorithm based on likelihood maximization, andprovide a sample complexity analysis. We applied our algorithm to the influence function estimation problem in information diffusion in social networks, and show that with few assumptions about the diffusion processes, our algorithm is able to estimate influence significantly more accurately than existing approaches on both synthetic and real world data.

coverage function, tcoveragelearner, time-varying coverage function, (15 more...)

Neural Information Processing Systems

Country:

North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)
North America > United States > New York (0.04)

Technology:

Information Technology > Communications (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.69)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.48)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.47)

Add feedback

RAAM: The Benefits of Robustness in Approximating Aggregated MDPs in Reinforcement Learning

Petrik, Marek, Subramanian, Dharmashankar

Neural Information Processing SystemsDec-31-2014

We describe how to use robust Markov decision processes for value function approximation with state aggregation. The robustness serves to reduce the sensitivity to the approximation error of sub-optimal policies in comparison to classical methods such as fitted value iteration. This results in reducing the bounds on the gamma-discounted infinite horizon performance loss by a factor of 1/(1-gamma) while preserving polynomial-time computational complexity. Our experimental results show that using the robust representation can significantly improve the solution quality with minimal additional computational cost.

aggregation, performance loss, raam, (16 more...)

Neural Information Processing Systems

Country: North America > United States > New York > New York County > New York City (0.04)

Genre: Research Report > New Finding (0.48)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.70)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.36)

Add feedback

Spectral Methods meet EM: A Provably Optimal Algorithm for Crowdsourcing

Zhang, Yuchen, Chen, Xi, Zhou, Dengyong, Jordan, Michael I.

Neural Information Processing SystemsDec-31-2014

The Dawid-Skene estimator has been widely used for inferring the true labels from the noisy labels provided by non-expert crowdsourcing workers. However, since the estimator maximizes a non-convex log-likelihood function, it is hard to theoretically justify its performance. In this paper, we propose a two-stage efficient algorithm for multi-class crowd labeling problems. The first stage uses the spectral method to obtain an initial estimate of parameters. Then the second stage refines the estimation by optimizing the objective function of the Dawid-Skene estimator via the EM algorithm. We show that our algorithm achieves the optimal convergence rate up to a logarithmic factor. We conduct extensive experiments on synthetic and real datasets. Experimental results demonstrate that the proposed algorithm is comparable to the most accurate empirical approach, while outperforming several other recently proposed methods.

algorithm, confusion matrix, probability, (16 more...)

Neural Information Processing Systems

Country:

North America > United States > California > Alameda County > Berkeley (0.14)
Asia > Middle East > Jordan (0.05)
North America > United States > Washington > King County > Redmond (0.04)
North America > United States > New York > New York County > New York City (0.04)

Genre: Research Report > New Finding (0.34)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models (0.94)
Information Technology > Communications > Social Media > Crowdsourcing (0.87)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.69)

Add feedback

Analysis of Variational Bayesian Latent Dirichlet Allocation: Weaker Sparsity Than MAP

Nakajima, Shinichi, Sato, Issei, Sugiyama, Masashi, Watanabe, Kazuho, Kobayashi, Hiroko

Neural Information Processing SystemsDec-31-2014

Latent Dirichlet allocation (LDA) is a popular generative model of various objects such as texts and images, where an object is expressed as a mixture of latent topics. In this paper, we theoretically investigate variational Bayesian (VB) learning in LDA. More specifically, we analytically derive the leading term of the VB free energy under an asymptotic setup, and show that there exist transition thresholds in Dirichlet hyperparameters around which the sparsity-inducing behavior drastically changes. Then we further theoretically reveal the notable phenomenon that VB tends to induce weaker sparsity than MAP in the LDA model, which is opposed to other models. We experimentally demonstrate the practical validity of our asymptotic theory on real-world Last.FM music data.

free energy, lda model, sparsity, (13 more...)

Neural Information Processing Systems

Country:

Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.05)
South America > Paraguay > Asunción > Asunción (0.04)
North America > United States (0.04)
(3 more...)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models (0.97)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (0.72)
Information Technology > Artificial Intelligence > Natural Language > Discourse & Dialogue (0.72)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.70)

Add feedback