AITopics | Bayesian Inference

Collaborating Authors

Bayesian Inference

Bayes' Theorem allows a program to infer the probabilities of likely causes from the probabilities of their effects, when what it is given are the probabilities of effects, given the causes.

News Overviews Instructional Materials AI-Alerts Classics

Option-critic in cooperative multi-agent systems

Chakravorty, Jhelum, Ward, Nadeem, Roy, Julien, Chevalier-Boisvert, Maxime, Basu, Sumana, Lupu, Andrei, Precup, Doina

arXiv.org Artificial IntelligenceNov-28-2019

In this paper, we investigate learning temporal abstractions in cooperative multi-agent systems using the options framework (Sutton et al, 1999) and provide a model-free algorithm for this problem. First, we address the planning problem for the decentralized POMDP represented by the multi-agent system, by introducing a common information approach. We use common beliefs and broadcasting to solve an equivalent centralized POMDP problem. Then, we propose the Distributed Option Critic (DOC) algorithm, motivated by the work of Bacon et al (2017) in the single-agent setting. Our approach uses centralized option evaluation and decentralized intra-option improvement. We analyze theoretically the asymptotic convergence of DOC and validate its performance in grid-world environments, where we implement DOC using a deep neural network. Our experiments show that DOC performs competitively with state-of-the-art algorithms and that it is scalable when the number of agents increases.

agent, dec-pomdp, information, (14 more...)

arXiv.org Artificial Intelligence

1911.12825

Country:

North America > United States > California > San Francisco County > San Francisco (0.14)
North America > Canada > Quebec > Montreal (0.14)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.93)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents > Agent Societies (0.91)
(2 more...)

Add feedback

Optimal Estimation of Change in a Population of Parameters

Vinayak, Ramya Korlakai, Kong, Weihao, Kakade, Sham M.

arXiv.org Machine LearningNov-28-2019

Paired estimation of change in parameters of interest over a population plays a central role in several application domains including those in the social sciences, epidemiology, medicine and biology. In these domains, the size of the population under study is often very large, however, the number of observations available per individual in the population is very small (\emph{sparse observations}) which makes the problem challenging. Consider the setting with $N$ independent individuals, each with unknown parameters $(p_i, q_i)$ drawn from some unknown distribution on $[0, 1]^2$. We observe $X_i \sim \text{Bin}(t, p_i)$ before an event and $Y_i \sim \text{Bin}(t, q_i)$ after the event. Provided these paired observations, $\{(X_i, Y_i) \}_{i=1}^N$, our goal is to accurately estimate the \emph{distribution of the change in parameters}, $\delta_i := q_i - p_i$, over the population and properties of interest like the \emph{$\ell_1$-magnitude of the change} with sparse observations ($t\ll N$). We provide \emph{information theoretic lower bounds} on the error in estimating the distribution of change and the $\ell_1$-magnitude of change. Furthermore, we show that the following two step procedure achieves the optimal error bounds: first, estimate the full joint distribution of the paired parameters using the maximum likelihood estimator (MLE) and then estimate the distribution of change and the $\ell_1$-magnitude of change using the joint MLE. Notably, and perhaps surprisingly, these error bounds are of the same order as the minimax optimal error bounds for learning the \emph{full} joint distribution itself (in Wasserstein-1 distance); in other words, estimating the magnitude of the change of parameters over the population is, in a minimax sense, as difficult as estimating the full joint distribution itself.

estimator, joint distribution, polynomial, (16 more...)

arXiv.org Machine Learning

1911.12568

Genre: Research Report > Experimental Study (0.46)

Industry: Health & Medicine > Therapeutic Area > Immunology (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.48)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.48)

Add feedback

Conditional Hierarchical Bayesian Tucker Decomposition

Sandler, Adam, Klabjan, Diego, Luo, Yuan

arXiv.org Machine LearningNov-27-2019

Our research focuses on studying and developing methods for reducing the dimensionality of large datasets, common in biomedical applications. A major problem when learning information about patients based on genetic sequencing data is that there are often more feature variables (genetic data) than observations (patients). This makes direct supervised learning difficult. One way of reducing the feature space is to use latent Dirichlet allocation in order to group genetic variants in an unsupervised manner. Latent Dirichlet allocation is a common model in natural language processing, which describes a document as a mixture of topics, each with a probability of generating certain words. This can be generalized as a Bayesian tensor decomposition to account for multiple feature variables. While we made some progress improving and modifying these methods, our significant contributions are with hierarchical topic modeling. We developed distinct methods of incorporating hierarchical topic modeling, based on nested Chinese restaurant processes and Pachinko Allocation Machine, into Bayesian tensor decompositions. We apply these models to predict whether or not patients have autism spectrum disorder based on genetic sequencing data. We examine a dataset from National Database for Autism Research consisting of paired siblings -- one with autism, and the other without -- and counts of their genetic variants. Additionally, we linked the genes with their Reactome biological pathways. We combine this information into a tensor of patients, counts of their genetic variants, and the membership of these genes in pathways. Once we decompose this tensor, we use logistic regression on the reduced features in order to predict if patients have autism. We also perform a similar analysis of a dataset of patients with one of four common types of cancer (breast, lung, prostate, and colorectal).

decomposition, probability, topic model, (17 more...)

arXiv.org Machine Learning

1911.12426

Country:

North America > United States > New York > New York County > New York City (0.04)
North America > United States > Georgia > Fulton County > Atlanta (0.04)

Genre: Research Report (1.00)

Industry:

Health & Medicine > Therapeutic Area > Oncology (1.00)
Health & Medicine > Therapeutic Area > Neurology > Autism (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.69)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.69)

Add feedback

LSAR: Efficient Leverage Score Sampling Algorithm for the Analysis of Big Time Series Data

Eshragh, Ali, Roosta, Fred, Nazari, Asef, Mahoney, Michael W.

arXiv.org Machine LearningNov-27-2019

We apply methods from randomized numerical linear algebra (RandNLA) to develop improved algorithms for the analysis of large-scale time series data. We first develop a new fast algorithm to estimate the leverage scores of an autoregressive (AR) model in big data regimes. We show that the accuracy of approximations lies within $(1+\mathcal{O}(\varepsilon))$ of the true leverage scores with high probability. These theoretical results are subsequently exploited to develop an efficient algorithm, called LSAR, for fitting an appropriate AR model to big time series data. Our proposed algorithm is guaranteed, with high probability, to find the maximum likelihood estimates of the parameters of the underlying true AR model and has a worst case running time that significantly improves those of the state-of-the-art alternatives in big data regimes. Empirical results on large-scale synthetic as well as real data highly support the theoretical results and reveal the efficacy of this new approach. To the best of our knowledge, this paper is the first attempt to establish a nexus between RandNLA and big time series data analysis.

algorithm, leverage score, matrix, (17 more...)

arXiv.org Machine Learning

1911.12321

Country:

North America > United States > California > Alameda County > Berkeley (0.14)
North America > Trinidad and Tobago > Trinidad > Arima > Arima (0.04)
Oceania > Australia > Queensland (0.04)
(2 more...)

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.68)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.34)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.34)

Add feedback

Defending Against Adversarial Machine Learning

Jenkins, Alison

arXiv.org Artificial IntelligenceNov-26-2019

An Adversarial System to attack and an Authorship Attribution System (AAS) to defend itself against the attacks are analyzed. Defending a system against attacks from an adversarial machine learner can be done by randomly switching between models for the system, by detecting and reacting to changes in the distribution of normal inputs, or by using other methods. Adversarial machine learning is used to identify a system that is being used to map system inputs to outputs. Three types of machine learners are using for the model that is being attacked. The machine learners that are used to model the system being attacked are a Radial Basis Function Support Vector Machine, a Linear Support Vector Machine, and a Feedforward Neural Network. The feature masks are evolved using accuracy as the fitness measure. The system defends itself against adversarial machine learning attacks by identifying inputs that do not match the probability distribution of normal inputs. The system also defends itself against adversarial attacks by randomly switching between the feature masks being used to map system inputs to outputs.

accuracy, algorithm, feature mask, (12 more...)

arXiv.org Artificial Intelligence

1911.11746

Country: Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre: Research Report > Experimental Study (0.49)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Support Vector Machines (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Add feedback

ART: A machine learning Automated Recommendation Tool for synthetic biology

Radivojević, Tijana, Costello, Zak, Martin, Hector Garcia

arXiv.org Machine LearningNov-25-2019

Synthetic biology allows us to bioengineer cells to synthesize novel valuable molecules such as renewable biofuels or anticancer drugs. However, traditional synthetic biology approaches involve ad-hoc non systematic engineering practices, which lead to long development times. Here, we present the Automated Recommendation Tool ( ART), a tool that leverages machine learning and probabilistic modeling techniques to guide synthetic biology in a systematic fashion, without the need for a full mechanistic understanding of the biological system. Using sampling-based optimization, ART provides a set of recommended strains to be built in the next engineering cycle, alongside probabilistic predictions of their production levels. We demonstrate the capabilities of ART on simulated and real data sets and discuss possible difficulties in achieving satisfactory predictive power. 2 Introduction Metabolic engineering 1 enables us to bioengineer cells to synthesize novel valuable molecules such as renewable biofuels 2,3 or anticancer drugs.

dbtl cycle, prediction, recommendation, (15 more...)

arXiv.org Machine Learning

1911.11091

Country:

North America > United States > California > Alameda County > Berkeley (0.14)
North America > United States > Tennessee > Anderson County > Oak Ridge (0.04)
Europe > Spain > Basque Country > Biscay Province > Bilbao (0.04)
Europe > Portugal > Porto > Porto (0.04)

Genre: Research Report (1.00)

Industry:

Health & Medicine > Pharmaceuticals & Biotechnology (1.00)
Government > Regional Government > North America Government > United States Government (1.00)
Energy > Renewable > Biofuel (1.00)
(2 more...)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Machine Translation (0.93)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)

Add feedback

A Coefficient of Determination for Probabilistic Topic Models

Jones, Tommy

arXiv.org Machine LearningNov-25-2019

--This research proposes a new (old) metric for evaluating goodness of fit in topic models, the coefficient of determination, or R 2 . Within the context of topic modeling, R 2 has the same interpretation that it does when used in a broader class of statistical models. Reporting R 2 with topic models addresses two current problems in topic modeling: a lack of standard cross-contextual evaluation metrics for topic modeling and ease of communication with lay audiences. The author proposes that R 2 should be reported as a standard metric when constructing topic models. I NTRODUCTION According to an often-quoted but never cited definition, "the goodness of fit of a statistical model describes how well it fits a set of observations. Measures of goodness of fit typically summarize the discrepancy between observed values and the values expected under the model in question." 1 Goodness of fit measures vary with the goals of those constructing the statistical model. Inferential goals may emphasize in-sample fit while predictive goals may emphasize out-of-sample fit. Prior information may be included in the goodness of fit measure for Bayesian models, or it may not. Goodness of fit measures may include methods to correct for model overfitting. In short, goodness of fit measures the performance of a statistical model against the ground truth of observed data. Fitting the data well is generally a necessary--though not sufficient--condition for trust in a statistical model, whatever its goals. Of course, goodness of fit is only one concern in statistical modeling.

corpus, likelihood, topic model, (14 more...)

arXiv.org Machine Learning

1911.11061

Country:

Asia > Middle East > Jordan (0.05)
South America > Paraguay > Asunción > Asunción (0.04)
North America > United States > Illinois > Cook County > Chicago (0.04)
North America > United States > California (0.04)

Genre: Research Report (0.64)

Industry:

Health & Medicine (0.95)
Government > Regional Government > North America Government > United States Government (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Discourse & Dialogue (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.34)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.34)

Add feedback

Resampling-based Confidence Intervals for Model-free Robust Inference on Optimal Treatment Regimes

Wu, Yunan, Wang, Lan

arXiv.org Machine LearningNov-25-2019

Recently, there has been growing interest in estimating optimal treatment regimes which are individualized decision rules that can achieve maximal average outcomes. This paper considers the problem of inference for optimal treatment regimes in the model-free setting, where the specification of an outcome regression model is not needed. Existing model-free estimators are usually not suitable for the purpose of inference because they either have nonstandard asymptotic distributions, or are designed to achieve fisher-consistent classification performance. This paper first studies a smoothed robust estimator that directly targets estimating the parameters corresponding to the Bayes decision rule for estimating the optimal treatment regime. This estimator is shown to have an asymptotic normal distribution. Furthermore, it is proved that a resampling procedure provides asymptotically accurate inference for both the parameters indexing the optimal treatment regime and the optimal value function. A new algorithm is developed to calculate the proposed estimator with substantially improved speed and stability. Numerical results demonstrate the satisfactory performance of the new methods.

estimator, optimal treatment regime, treatment regime, (15 more...)

arXiv.org Machine Learning

1911.11043

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.27)
North America > United States > Massachusetts > Middlesex County > Reading (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Asia > Middle East > Jordan (0.04)

Genre:

Research Report > Strength High (1.00)
Research Report > Experimental Study (1.00)
Research Report > New Finding (0.87)

Industry:

Health & Medicine > Therapeutic Area (0.68)
Health & Medicine > Pharmaceuticals & Biotechnology (0.45)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.66)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.48)

Add feedback

Matrix Normal PCA for Interpretable Dimension Reduction and Graphical Noise Modeling

Zhang, Chihao, Gai, Kuo, Zhang, Shihua

arXiv.org Machine LearningNov-25-2019

Principal component analysis (PCA) is one of the most widely used dimension reduction and multivariate statistical techniques. From a probabilistic perspective, PCA seeks a low-dimensional representation of data in the presence of independent identical Gaussian noise. Probabilistic PCA (PPCA) and its variants have been extensively studied for decades. Most of them assume the underlying noise follows a certain independent identical distribution. However, the noise in the real world is usually complicated and structured. To address this challenge, some non-linear variants of PPCA have been proposed. But those methods are generally difficult to interpret. To this end, we propose a powerful and intuitive PCA method (MN-PCA) through modeling the graphical noise by the matrix normal distribution, which enables us to explore the structure of noise in both the feature space and the sample space. MN-PCA obtains a low-rank representation of data and the structure of noise simultaneously. And it can be explained as approximating data over the generalized Mahalanobis distance. We develop two algorithms to solve this model: one maximizes the regularized likelihood, the other exploits the Wasserstein distance, which is more robust. Extensive experiments on various data demonstrate their effectiveness.

matrix, noise, precision matrix, (16 more...)

arXiv.org Machine Learning

1911.10796

Country:

Asia > China > Beijing > Beijing (0.04)
North America > United States (0.04)
Europe > Belgium > Flanders > Flemish Brabant > Leuven (0.04)

Genre: Research Report (1.00)

Industry: Health & Medicine > Therapeutic Area (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.92)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.68)
(2 more...)

Add feedback

Machine Learning-based Signal Detection for PMH Signals in Load-modulated MIMO System

Zhu, Jinle, Li, Qiang, Hu, Li, Chen, Hongyang, Ansari, Nirwan

arXiv.org Machine LearningNov-24-2019

Phase Modulation on the Hypersphere (PMH) is a power efficient modulation scheme for the load-modulated multiple-input multiple-output (MIMO) transmitters with central power amplifiers (CP A). However, it is difficult to obtain the precise channel state information (CSI), and the traditional optimal maximum likelihood (ML) detection scheme incurs high complexity which increases exponentially with the number of antennas and the number of bits carried per antenna in the PMH modulation. To detect the PMH signals without knowing the prior CSI, we first propose a signal detection scheme, termed as the hypersphere clustering scheme based on the expectation maximization (EM) algorithm with maximum likelihood detection (HEM-ML). By leveraging machine learning, the proposed detection scheme can accurately obtain information of the channel from a few of the received symbols with little resource cost and achieve comparable detection results as that of the optimal ML detector. To further reduce the computational complexity in the ML detection in HEM-ML, we also propose the second signal detection scheme, termed as the hypersphere clustering scheme based on the EM algorithm with KD-tree detection (HEM-KD). The CSI obtained from the EM algorithm is used to build a spatial KD-tree receiver codebook and the signal detection problem can be transformed into a nearest neighbor search (NNS) problem. The detection complexity of HEM-KD is significantly reduced without any detection performance loss as compared to HEM-ML. Extensive simulation results verify the effectiveness of our proposed detection schemes. I NTRODUCTION The fifth generation (5G) wireless communication network is forecasted to provide over 1000 times higher capacity than the current system. In addition to dramatically expanding the available bandwidth, multiple-input multiple-output (MIMO) technology is playing a key role in improving the spectral efficiency (SE) and enhancing the throughput in the future wireless cellular communication systems [1]. This ambitious goal will however cause an inevitable energy consumption problem, thus limiting the number of the antennas at the base station (BS) and the user terminals in practice [2]. In the traditional design of the MIMO transceivers, each antenna is connected with one distinct radio frequency (RF) chain which includes a power amplifier (P A). This kind of structure enables the power consumption of the transmission to grow linearly with the number of the antennas. In addition, the use of Orthogonal Frequency Division Multiplexing (OFDM) signals in massive MIMO systems leads to a high peak-to-average power ratios (P APR) and exacerbates the costs of P As, thus reducing the power efficiency. On the other hand, to alleviate the effects of mutual coupling and correlated fading, the antennas should be set at least half of a wavelength apart from each other, which will inevitably cause the size problem [3].

algorithm, complexity, detection scheme, (15 more...)

arXiv.org Machine Learning

1911.13238

Country:

Europe > Germany (0.04)
North America > United States (0.04)
Europe > France > Île-de-France > Paris > Paris (0.04)
Asia > China > Beijing > Beijing (0.04)

Genre: Research Report (0.82)

Industry:

Telecommunications (0.34)
Energy (0.34)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.68)

Add feedback