AITopics

1905.12341

Country: Asia > Japan (0.28)

Genre: Research Report > New Finding (0.46)

Industry: Education (0.46)

Technology:

Information Technology > Communications (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.68)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.49)

arXiv.org Machine Learning May-29-2019

Efficient EM-Variational Inference for Hawkes Process

Zhou, Feng

In the classical Hawkes process, the baseline intensity and triggering kernel are assumed to be a constant and a parametric function, respectively, which limits the model's flexibility. To generalize it, we present a fully Bayesian nonparametric model, namely the Gaussian process modulated Hawkes process, and propose an EM-variational inference scheme. In this model, a transformation of a Gaussian process is used as a prior on the baseline intensity and triggering kernel. By introducing a latent branching structure, the inference of the baseline intensity and triggering kernel is decoupled, and the variational inference scheme is embedded naturally into an EM framework. We also provide a series of schemes to accelerate the inference. Results of synthetic and real data experiments show that the underlying baseline intensity and triggering kernel can be recovered without parametric restriction and that our Bayesian nonparametric estimation is superior to other state-of-the-art methods.
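For orientation, the classical setup that this abstract generalizes can be sketched in a few lines. This is a minimal illustration, not the paper's method: it assumes a constant baseline `mu` and an exponential triggering kernel `alpha * exp(-beta * dt)`, and the function name and parameter values are hypothetical.

```python
import math

def hawkes_intensity(t, events, mu=0.5, alpha=0.8, beta=1.2):
    """Conditional intensity of a classical Hawkes process at time t.

    mu is the constant baseline intensity; alpha * exp(-beta * dt) is an
    assumed exponential (parametric) triggering kernel. Only events that
    happened strictly before t contribute.
    """
    return mu + sum(alpha * math.exp(-beta * (t - s)) for s in events if s < t)

# Hypothetical event times, for illustration only.
events = [1.0, 1.5, 3.0]
before = hawkes_intensity(0.5, events)  # no past events: baseline only
after = hawkes_intensity(1.6, events)   # two recent events raise the intensity
```

The Gaussian process modulated model described above replaces both the constant `mu` and the fixed-form kernel with functions drawn from a prior, which this sketch does not attempt.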

artificial intelligence, hawkes process, machine learning, (19 more...)

1905.12251

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Gimenez, Jaime Roquero, Ghorbani, Amirata, Zou, James

Knockoffs for the mass: new feature importance statistics with false discovery guarantees

An important problem in machine learning and statistics is to identify features that causally affect the outcome. This is often impossible to do from purely observational data, and a natural relaxation is to identify features that are correlated with the outcome even conditioned on all other observed features. For example, we want to identify that smoking really is correlated with cancer conditioned on demographics. The knockoff procedure is a recent breakthrough in statistics that, in theory, can identify truly correlated features while guaranteeing that the false discovery rate is limited. The idea is to create synthetic data -- knockoffs -- that capture correlations amongst the features. However, there are substantial computational and practical challenges to generating and using knockoffs. This paper makes several key advances that enable knockoff application to be more efficient and powerful. We develop an efficient algorithm to generate valid knockoffs from Bayesian Networks. Then we systematically evaluate knockoff test statistics and develop new statistics with improved power. The paper combines new mathematical guarantees with systematic experiments on real and synthetic data.
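The selection step that turns knockoff statistics into a feature set can be sketched as follows. This shows the standard knockoff+ thresholding rule, not the new statistics proposed in the paper. It assumes each feature already has a statistic `W[j]` that is large and positive when the real feature looks more important than its knockoff. The function names and the example values are hypothetical.

```python
def knockoff_plus_threshold(W, q=0.2):
    """Smallest threshold t whose estimated false discovery proportion is <= q.

    The estimate is (1 + #{j: W[j] <= -t}) / max(1, #{j: W[j] >= t}).
    Returns infinity when no threshold qualifies.
    """
    candidates = sorted({abs(w) for w in W if w != 0})
    for t in candidates:
        false_est = 1 + sum(1 for w in W if w <= -t)
        selected = sum(1 for w in W if w >= t)
        if false_est / max(selected, 1) <= q:
            return t
    return float("inf")

def select_features(W, q=0.2):
    """Indices of features whose statistic clears the knockoff+ threshold."""
    t = knockoff_plus_threshold(W, q)
    return [j for j, w in enumerate(W) if w >= t]

# Made-up statistics, for illustration only.
W = [5.0, 4.0, 3.5, 3.0, 2.5, 2.0, -0.5, 0.3, 1.5, 6.0]
chosen = select_features(W, q=0.2)
```

Generating valid knockoffs and computing `W` are the hard parts that the abstract addresses; this sketch takes both as given.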

artificial intelligence, knockoff, machine learning, (17 more...)

1807.06214

Country:

North America > United States > California > Santa Clara County > Stanford (0.04)
North America > United States > California > Santa Clara County > Palo Alto (0.04)
Asia > Middle East > Jordan (0.04)
Asia > Japan > Kyūshū & Okinawa > Okinawa (0.04)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.89)

Accelerating Monte Carlo Bayesian Inference via Approximating Predictive Uncertainty over Simplex

Cui, Yufei, Yao, Wuguannan, Li, Qiao, Chan, Antoni B., Xue, Chun Jason

Estimating the uncertainty of a Bayesian model has been investigated for decades. The model posterior is almost always intractable, so approximation is necessary. In many real-world cases, even when a decent estimate of the model posterior is obtained, another approximation is required to compute the predictive distribution over the desired output. A common and accurate solution is to use Monte Carlo (MC) integration. However, it requires maintaining a large number of samples, evaluating the model repeatedly, and averaging multiple model outputs. In this paper, we propose a method to approximate the probability distribution over the simplex induced by the model posterior, enabling tractable computation of the predictive distribution for classification. The aim is to approximate the induced uncertainty of a specific Bayesian model while alleviating the heavy workload of MC integration at test time. Methodologically, we adapt the Wasserstein distance to learn the induced conditional distributions, which is novel for Bayesian learning. The proposed method is universally applicable to Bayesian classification models that allow for posterior sampling. Empirical results validate the strong practical performance of our approach.
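The Monte Carlo baseline that the abstract aims to speed up can be sketched as below. This is a toy illustration under stated assumptions: the classifier is linear, the "posterior samples" are Gaussian perturbations of a made-up weight matrix, and all names are hypothetical. It shows the cost structure (one model evaluation per sample, then an average), not the paper's proposed approximation.

```python
import math
import random

def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def mc_predictive(x, weight_samples):
    """Average class probabilities over posterior weight samples.

    Each sample is a weight matrix with one row per class; the model is
    evaluated once per sample, which is the workload MC integration incurs.
    """
    n_classes = len(weight_samples[0])
    avg = [0.0] * n_classes
    for W in weight_samples:
        logits = [sum(wi * xi for wi, xi in zip(row, x)) for row in W]
        for k, p in enumerate(softmax(logits)):
            avg[k] += p / len(weight_samples)
    return avg

rng = random.Random(0)
# Stand-in for posterior samples: noise around a fixed mean (an assumption).
mean_W = [[1.0, -1.0], [-1.0, 1.0], [0.0, 0.0]]
weight_samples = [
    [[w + rng.gauss(0, 0.5) for w in row] for row in mean_W]
    for _ in range(200)
]
probs = mc_predictive([2.0, 0.5], weight_samples)
```

Each softmax output is a point on the probability simplex, so the samples induce a distribution over the simplex, which is the object the abstract proposes to approximate directly.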

artificial intelligence, ent, machine learning, (17 more...)

1905.12194

Country:

Asia > China > Hong Kong (0.04)
Oceania > Australia > New South Wales > Sydney (0.04)

Genre: Research Report (0.65)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.89)

de Haan, Pim, Jayaraman, Dinesh, Levine, Sergey

Causal Confusion in Imitation Learning

Behavioral cloning reduces policy learning to supervised learning by training a discriminative model to predict expert actions given observations. Such discriminative models are non-causal: the training procedure is unaware of the causal structure of the interaction between the expert and the environment. We point out that ignoring causality is particularly damaging because of the distributional shift in imitation learning. In particular, it leads to a counter-intuitive "causal confusion" phenomenon: access to more information can yield worse performance. We investigate how this problem arises, and propose a solution to combat it through targeted interventions---either environment interaction or expert queries---to determine the correct causal model. We show that causal confusion occurs in several benchmark control domains as well as realistic driving settings, and validate our solution against DAgger and other baselines and ablations.

artificial intelligence, demonstration dataset size, machine learning, (13 more...)

1905.11979

Country:

North America > United States > New Jersey > Middlesex County > Piscataway (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre: Research Report (0.82)

Industry: Leisure & Entertainment (0.94)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.88)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.68)
(2 more...)

Steinberg, Ethan, Liu, Peter J.

Using Ontologies To Improve Performance In Massively Multi-label Prediction Models

arXiv.org Artificial Intelligence May-28-2019

Massively multi-label prediction/classification problems arise in environments like health-care or biology where very precise predictions are useful. One challenge with massively multi-label problems is that there is often a long-tailed frequency distribution for the labels, which results in few positive examples for the rare labels. We propose a solution to this problem by modifying the output layer of a neural network to create a Bayesian network of sigmoids which takes advantage of ontology relationships between the labels to help share information between the rare and the more common labels. We apply this method to the two massively multi-label tasks of disease prediction (ICD-9 codes) and protein function prediction (Gene Ontology terms) and obtain significant improvements in per-label AUROC and average precision for less common labels.
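One way to read "a Bayesian network of sigmoids" over an ontology is sketched below. This is an assumption about the general idea, not the authors' implementation: each label gets a sigmoid that models its probability conditioned on its parent label being present, and the marginal probability is the product along the path to the root. The ontology, logits, and function names are hypothetical.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def marginal_probs(logits, parent):
    """Marginal label probabilities from per-label conditional sigmoids.

    logits[label] parameterizes P(label | parent label present);
    parent[label] is the parent in the ontology, or None for a root.
    """
    cache = {}

    def prob(label):
        if label not in cache:
            p = sigmoid(logits[label])
            if parent[label] is not None:
                p *= prob(parent[label])  # child implies parent
            cache[label] = p
        return cache[label]

    return {label: prob(label) for label in logits}

# Toy three-level ontology with made-up logits.
parent = {"disease": None, "cancer": "disease", "lung cancer": "cancer"}
logits = {"disease": 2.0, "cancer": 0.0, "lung cancer": -1.0}
probs = marginal_probs(logits, parent)
```

Under this reading, a rare label only has to learn how it differs from its more common parent, which is one way information could be shared between rare and common labels.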

artificial intelligence, bayesian inference, machine learning, (17 more...)

arXiv.org Artificial Intelligence

1905.12126

Country:

North America > United States > New York > New York County > New York City (0.04)
North America > United States > California > Santa Clara County > Stanford (0.04)
North America > United States > California > Santa Clara County > Palo Alto (0.04)

Genre: Research Report (0.83)

Industry:

Health & Medicine > Therapeutic Area > Oncology (1.00)
Health & Medicine > Pharmaceuticals & Biotechnology (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Ontologies (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
(2 more...)

Guggilam, Sreelekha, Zaidi, S. M. Arshad, Chandola, Varun, Patra, Abani

Bayesian Anomaly Detection Using Extreme Value Theory

Data-driven anomaly detection methods typically build a model for the normal behavior of the target system, and score each data instance with respect to this model. A threshold is invariably needed to identify data instances with high (or low) scores as anomalies. This presents a practical limitation on the applicability of such methods, since most methods are sensitive to the choice of the threshold, and it is challenging to set optimal thresholds. We present a probabilistic framework to explicitly model the normal and anomalous behaviors and probabilistically reason about the data. An extreme value theory based formulation is proposed to model the anomalous behavior as the extremes of the normal behavior. As a specific instantiation, a joint non-parametric clustering and anomaly detection algorithm (INCAD) is proposed that models the normal behavior as a Dirichlet Process Mixture Model. A pseudo-Gibbs sampling based strategy is used for inference. Results on a variety of data sets show that the proposed method provides effective clustering and anomaly detection without requiring strong initialization and thresholding parameters.

data mining, detection, machine learning, (17 more...)

1905.12150

Country: North America > United States (0.15)

Genre: Research Report (0.64)

Technology:

Information Technology > Data Science > Data Mining > Anomaly Detection (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.67)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.49)

Glynn, Christopher, He, Jingyu, Polson, Nicholas G., Xu, Jianeng

Bayesian Inference for Polya Inverse Gamma Models

The normalizing constants of these distributions depend on gamma functions whose arguments include shape (gamma, inverse gamma) and concentration (beta, Dirichlet) parameters. Bayesian learning of parameters nested inside the gamma function presents significant technical difficulties, since there is no known conjugate prior distribution. In fact, inferring the shape parameter in the gamma distribution is a long-studied problem in Bayesian inference (Damsleth, 1975; Rossell et al., 2009; Miller, 2018). In this paper, we develop the theoretical and algorithmic foundation of a Pólya-inverse Gamma (PIG) data augmentation scheme for fully Bayesian inference of shape and concentration parameters in gamma, inverse gamma, and Dirichlet models, respectively. PIG data augmentation may be utilized to design efficient Markov chain Monte Carlo (MCMC) algorithms in latent Dirichlet allocation (Blei et al., 2003), Beta-negative binomial models (Zhou et al., 2012), and Gamma-Gamma (GaGa) hierarchical models (Rossell et al., 2009).

artificial intelligence, bayesian inference, machine learning, (16 more...)

1905.12141

Country: North America > United States (0.28)

Genre: Research Report (0.50)

Industry:

Health & Medicine > Therapeutic Area > Psychiatry/Psychology > Addiction Disorder (1.00)
Health & Medicine > Pharmaceuticals & Biotechnology (1.00)
Health & Medicine > Consumer Health (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.66)

Roeder, Geoffrey, Grant, Paul K., Phillips, Andrew, Dalchau, Neil, Meeds, Edward

Efficient Amortised Bayesian Inference for Hierarchical and Nonlinear Dynamical Systems

We introduce a flexible, scalable Bayesian inference framework for nonlinear dynamical systems characterised by distinct and hierarchical variability at the individual, group, and population levels. Our model class is a generalisation of nonlinear mixed-effects (NLME) dynamical systems, the statistical workhorse for many experimental sciences. We cast parameter inference as stochastic optimisation of an end-to-end differentiable, block-conditional variational autoencoder. We specify the dynamics of the data-generating process as an ordinary differential equation (ODE) such that both the ODE and its solver are fully differentiable. This model class is highly flexible: the ODE right-hand sides can be a mixture of user-prescribed or "white-box" sub-components and neural network or "black-box" sub-components. Using stochastic optimisation, our amortised inference algorithm could seamlessly scale up to massive data collection pipelines (common in labs with robotic automation). Finally, our framework supports interpretability with respect to the underlying dynamics, as well as predictive generalization to unseen combinations of group components (also called "zero-shot" learning). We empirically validate our method by predicting the dynamic behaviour of bacteria that were genetically engineered to function as biosensors.

artificial intelligence, bayesian inference, machine learning, (14 more...)

1905.12090

Country: North America > United States (0.46)

Genre: Research Report (0.64)

Industry:

Health & Medicine > Pharmaceuticals & Biotechnology (0.68)
Health & Medicine > Therapeutic Area (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.86)

Stirn, Andrew, Jebara, Tony, Knowles, David A

A New Distribution on the Simplex with Auto-Encoding Applications

We construct a new distribution for the simplex using the Kumaraswamy distribution and an ordered stick-breaking process. We explore and develop the theoretical properties of this new distribution and prove that it exhibits symmetry under the same conditions as the well-known Dirichlet. Like the Dirichlet, the new distribution is adept at capturing sparsity but, unlike the Dirichlet, has an exact and closed form reparameterization--making it well suited for deep variational Bayesian modeling. We demonstrate the distribution's utility in a variety of semi-supervised auto-encoding tasks. In all cases, the resulting models achieve competitive performance commensurate with their simplicity, use of explicit probability models, and abstinence from adversarial training.
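The two ingredients named in this abstract can be sketched together. The Kumaraswamy distribution has CDF F(x) = 1 - (1 - x^a)^b, so a sample is an explicit function of a uniform draw, which is what makes a closed-form reparameterization possible. The stick-breaking step below is the generic construction; the paper's specific parameterization and ordering are not reproduced here, and the parameter values and function names are hypothetical.

```python
import random

def kumaraswamy_sample(a, b, u):
    """Map a uniform draw u in (0, 1) to a Kumaraswamy(a, b) sample.

    Inverts the CDF F(x) = 1 - (1 - x**a)**b in closed form.
    """
    return (1.0 - (1.0 - u) ** (1.0 / b)) ** (1.0 / a)

def stick_breaking(fractions):
    """Turn K-1 fractions in (0, 1) into K weights that sum to 1."""
    remaining = 1.0
    weights = []
    for v in fractions:
        weights.append(remaining * v)  # break off a piece of the stick
        remaining *= 1.0 - v           # keep the rest for later pieces
    weights.append(remaining)
    return weights

rng = random.Random(0)
# Hypothetical (a, b) pairs, one per stick break.
params = [(1.0, 3.0), (1.0, 2.0), (1.0, 1.0)]
fractions = [kumaraswamy_sample(a, b, rng.random()) for a, b in params]
point = stick_breaking(fractions)  # a point on the 3-simplex (4 weights)
```

Because the sample depends on `a` and `b` only through elementary operations on `u`, gradients with respect to the parameters can pass through it in an automatic differentiation framework.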

algorithm 1, artificial intelligence, machine learning, (18 more...)

1905.12052

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.66)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.48)