Bayesian Learning
PG-TS: Improved Thompson Sampling for Logistic Contextual Bandits
Dumitrascu, Bianca, Feng, Karen, Engelhardt, Barbara E
We address the problem of regret minimization in logistic contextual bandits, where a learner decides among sequential actions or arms given their respective contexts to maximize binary rewards. Using a fast inference procedure with Polya-Gamma distributed augmentation variables, we propose an improved version of Thompson Sampling, a Bayesian formulation of contextual bandits with near-optimal performance. Our approach, Polya-Gamma augmented Thompson Sampling (PG-TS), achieves state-of-the-art performance on simulated and real data. PG-TS explores the action space efficiently and exploits high-reward arms, quickly converging to solutions of low regret. Its explicit estimation of the posterior distribution of the context feature covariance leads to substantial empirical gains over approximate approaches. PG-TS is the first approach to demonstrate the benefits of Polya-Gamma augmentation in bandits and to propose an efficient Gibbs sampler for approximating the analytically unsolvable integral of logistic contextual bandits.
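To illustrate the explore/exploit loop the abstract refers to, here is a minimal sketch of plain Thompson Sampling on a non-contextual Bernoulli bandit with Beta posteriors. It is not PG-TS: there are no contexts, no logistic model, and no Polya-Gamma augmentation. The function name `thompson_sampling` and its parameters are hypothetical, chosen only for this example.

```python
import random


def thompson_sampling(true_probs, n_rounds, seed=0):
    """Beta-Bernoulli Thompson Sampling (simplified; NOT the PG-TS algorithm).

    true_probs: hidden success probability of each arm (simulation only).
    Returns the number of pulls per arm and the total reward collected.
    """
    rng = random.Random(seed)
    n_arms = len(true_probs)
    # Beta(1, 1) prior on every arm, stored as pseudo-counts.
    successes = [1.0] * n_arms
    failures = [1.0] * n_arms
    pulls = [0] * n_arms
    total_reward = 0

    for _ in range(n_rounds):
        # Draw one plausible success rate per arm from its posterior.
        draws = [rng.betavariate(successes[a], failures[a]) for a in range(n_arms)]
        # Act greedily with respect to the sampled rates.
        arm = max(range(n_arms), key=draws.__getitem__)
        # Simulate a binary reward from the chosen arm.
        reward = 1 if rng.random() < true_probs[arm] else 0
        # Conjugate posterior update for the chosen arm.
        successes[arm] += reward
        failures[arm] += 1 - reward
        pulls[arm] += 1
        total_reward += reward

    return pulls, total_reward
```

In the logistic contextual setting the posterior is no longer conjugate, which is the gap the abstract says the Polya-Gamma Gibbs sampler is meant to fill.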
Accurate Kernel Learning for Linear Gaussian Markov Processes using a Scalable Likelihood Computation
We report an exact likelihood computation for Linear Gaussian Markov processes that is more scalable than existing algorithms for complex models and sparsely sampled signals. Better scaling is achieved through elimination of repeated computations in the Kalman likelihood, and by using the diagonalized form of the state transition equation. Using this efficient computation, we study the accuracy of kernel learning using maximum likelihood and the posterior mean in a simulation experiment. The posterior mean with a reference prior is more accurate for complex models and sparse sampling. Because of its lower computational load, the maximum likelihood estimator is an attractive option for more densely sampled signals and lower order models. We confirm the behavior of the estimators in experimental data by applying them to speleothem data.
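As a rough illustration of a Kalman likelihood for irregularly sampled data, the sketch below computes the exact log-likelihood of a one-dimensional Ornstein-Uhlenbeck process observed with Gaussian noise. It covers only the scalar case, where the state transition is trivially diagonal, and is not the paper's algorithm. The function name and the parameters `tau`, `sigma2` and `noise_var` are assumptions made for this example.

```python
import math


def ou_kalman_loglik(times, obs, tau, sigma2, noise_var):
    """Exact log-likelihood of a noisy, irregularly sampled scalar OU process.

    tau: correlation time, sigma2: stationary variance,
    noise_var: variance of the additive observation noise.
    """
    mean, var = 0.0, sigma2  # stationary prior on the first state
    loglik = 0.0
    prev_t = None
    for t, y in zip(times, obs):
        if prev_t is not None:
            # Predict: the transition depends only on the sampling gap.
            a = math.exp(-(t - prev_t) / tau)
            mean = a * mean
            var = a * a * var + sigma2 * (1.0 - a * a)
        # Prediction-error decomposition of the likelihood.
        innov_var = var + noise_var
        resid = y - mean
        loglik += -0.5 * (math.log(2.0 * math.pi * innov_var) + resid * resid / innov_var)
        # Update the state estimate with the new observation.
        gain = var / innov_var
        mean += gain * resid
        var *= 1.0 - gain
        prev_t = t
    return loglik
```

The cost is linear in the number of samples, whereas evaluating the same Gaussian likelihood through the full covariance matrix is cubic.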
Top 10 Machine Learning Algorithms to Know
Modern advancements in Artificial Intelligence (AI) are set to change our world for the better. These developments have largely been made possible by technologies such as cloud sharing, data analytics, blockchain, and improved computing power. These technologies have significantly improved machine learning, the main driver behind AI advancements. Machine learning is probably the most important component of developing Artificial Intelligence. The process of machine learning involves running repeated simulations on a computer, recording the results, and then running new tests based on the previous outcomes.
Reliable Uncertain Evidence Modeling in Bayesian Networks by Credal Networks
Marchetti, Sabina (La Sapienza University of Rome) | Antonucci, Alessandro (IDSIA)
A reliable modeling of uncertain evidence in Bayesian networks based on a set-valued quantification is proposed. Both soft and virtual evidences are considered. We show that evidence propagation in this setup can be reduced to standard updating in an augmented credal network, equivalent to a set of consistent Bayesian networks. A characterization of the computational complexity for this task is derived together with an efficient exact procedure for a subclass of instances. In the case of multiple uncertain evidences over the same variable, the proposed procedure can provide a set-valued version of the geometric approach to opinion pooling.
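To make the idea of set-valued evidence concrete, here is a small sketch for a single binary variable. Virtual evidence is expressed as a likelihood ratio, and an interval of ratios yields an interval of posteriors. This is only one simple reading of "set-valued quantification", not the paper's procedure for credal networks; both function names are hypothetical.

```python
def virtual_evidence_posterior(prior, likelihood_ratio):
    """Posterior P(X = true) after virtual evidence on a binary variable X.

    likelihood_ratio = P(evidence | X = true) / P(evidence | X = false).
    """
    weighted = prior * likelihood_ratio
    return weighted / (weighted + (1.0 - prior))


def interval_posterior(prior, lr_low, lr_high):
    """Posterior bounds when the likelihood ratio is only known to lie in
    [lr_low, lr_high]. The posterior is increasing in the ratio, so the
    bounds are attained at the endpoints of the interval."""
    return (
        virtual_evidence_posterior(prior, lr_low),
        virtual_evidence_posterior(prior, lr_high),
    )
```

For example, `interval_posterior(0.2, 1.0, 4.0)` brackets the posterior between the prior itself (uninformative evidence) and the value obtained from the strongest admissible ratio.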
The Blessings of Multiple Causes
Causal inference from observational data often assumes "strong ignorability," that all confounders are observed. This assumption is standard yet untestable. However, many scientific studies involve multiple causes, different variables whose effects are simultaneously of interest. We propose the deconfounder, an algorithm that combines unsupervised machine learning and predictive model checking to perform causal inference in multiple-cause settings. The deconfounder infers a latent variable as a substitute for unobserved confounders and then uses that substitute to perform causal inference. We develop theory for when the deconfounder leads to unbiased causal estimates, and show that it requires weaker assumptions than classical causal inference. We analyze its performance in three types of studies: semi-simulated data around smoking and lung cancer, semi-simulated data around genomewide association studies, and a real dataset about actors and movie revenue. The deconfounder provides a checkable approach to estimating close-to-truth causal effects.
Probabilistic Embedding of Knowledge Graphs with Box Lattice Measures
Vilnis, Luke, Li, Xiang, Murty, Shikhar, McCallum, Andrew
Embedding methods which enforce a partial order or lattice structure over the concept space, such as Order Embeddings (OE) (Vendrov et al., 2016), are a natural way to model transitive relational data (e.g. entailment graphs). However, OE learns a deterministic knowledge base, limiting expressiveness of queries and the ability to use uncertainty for both prediction and learning (e.g. learning from expectations). Probabilistic extensions of OE (Lai and Hockenmaier, 2017) have provided the ability to somewhat calibrate these denotational probabilities while retaining the consistency and inductive bias of ordered models, but lack the ability to model the negative correlations found in real-world knowledge. In this work we show that a broad class of models that assign probability measures to OE can never capture negative correlation, which motivates our construction of a novel box lattice and accompanying probability measure to capture anticorrelation and even disjoint concepts, while still providing the benefits of probabilistic modeling, such as the ability to perform rich joint and conditional queries over arbitrary sets of concepts, and both learning from and predicting calibrated uncertainty. We show improvements over previous approaches in modeling the Flickr and WordNet entailment graphs, and investigate the power of the model.
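The box idea can be sketched in a few lines, assuming each concept is an axis-aligned box inside the unit hypercube, its probability is the box volume, and joint probabilities come from box intersections. The representation of a box as a `(lower, upper)` pair of tuples and the function names are assumptions for this example; the learning procedure is omitted entirely.

```python
def volume(box):
    """Volume of an axis-aligned box given as (lower_corner, upper_corner)."""
    lower, upper = box
    vol = 1.0
    for lo, hi in zip(lower, upper):
        # An empty side makes the whole box empty (volume zero).
        vol *= max(hi - lo, 0.0)
    return vol


def meet(a, b):
    """Intersection of two boxes; it may be empty."""
    lower = tuple(max(x, y) for x, y in zip(a[0], b[0]))
    upper = tuple(min(x, y) for x, y in zip(a[1], b[1]))
    return lower, upper


def conditional(a, b):
    """P(a | b) as the volume of the intersection over the volume of b."""
    return volume(meet(a, b)) / volume(b)


# Hypothetical two-dimensional boxes, hand-picked rather than learned.
animal = ((0.0, 0.0), (0.5, 0.5))
dog = ((0.1, 0.1), (0.3, 0.3))    # nested inside animal
plant = ((0.6, 0.6), (0.9, 0.9))  # disjoint from animal
```

Nesting encodes entailment, since `conditional(animal, dog)` is one, while disjoint boxes give a joint probability of zero, which is the kind of anticorrelation the abstract says order embeddings cannot represent.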
Learning is Compiling: Experience Shapes Concept Learning by Combining Primitives in a Language of Thought
Tano, Pablo, Romano, Sergio, Sigman, Mariano, Salles, Alejo, Figueira, Santiago
Recent approaches to human concept learning have successfully combined the power of symbolic, infinitely productive rule systems and statistical learning. The aim of most of these studies is to reveal the underlying language structuring these representations and providing a general substrate for thought. Here, we ask about the plasticity of symbolic descriptive languages. We perform two concept learning experiments that consistently demonstrate that humans can very rapidly change the repertoire of symbols they use to identify concepts, by compiling frequently used expressions into new symbols of the language. The pattern of concept learning times is accurately described by a Bayesian agent that rationally updates the probability of compiling a new expression according to how useful it has been to compress concepts so far. By portraying the Language of Thought as a flexible system of rules, we also highlight the intrinsic difficulty of pinning it down empirically.
Keywords: Language of Thought, Concept Learning, Probabilistic Inference
Preprint submitted to Cognitive Psychology.
1. Introduction
How can children acquire a vast universe of concepts with seemingly very little exposure? Combinatorial languages can describe a vast set of concepts from a small set of primitives. This can be understood in a relatively simple example in the domain of shapes. A combinatorial and symbolic language similar to Logo [5] can combine operations such as "move", "pen up", "pen down" or "rotate" to generate an infinite set of expressions (or programs) which, when evaluated, can convey all sorts of shapes.
Bayesian Statistics Coursera
About this course: This course describes Bayesian statistics, in which one's inferences about parameters or hypotheses are updated as evidence accumulates. You will learn to use Bayes' rule to transform prior probabilities into posterior probabilities, and be introduced to the underlying theory and perspective of the Bayesian paradigm. The course will apply Bayesian methods to several practical problems, to show end-to-end Bayesian analyses that move from framing the question to building models to eliciting prior probabilities to implementing in R (free statistical software) the final posterior distribution. Additionally, the course will introduce credible regions, Bayesian comparisons of means and proportions, Bayesian regression and inference using multiple models, and discussion of Bayesian prediction. We assume learners in this course have background knowledge equivalent to what is covered in the earlier three courses in this specialization: "Introduction to Probability and Data," "Inferential Statistics," and "Linear Regression and Modeling."
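The prior-to-posterior step described above can be sketched for a discrete set of hypotheses. The course itself works in R; Python is used here only for illustration, and the function name and the diagnostic-test numbers are made up for the example.

```python
def bayes_update(prior, likelihood):
    """Apply Bayes' rule over a finite set of hypotheses.

    prior: dict mapping hypothesis -> prior probability.
    likelihood: dict mapping hypothesis -> P(observed data | hypothesis).
    """
    unnormalized = {h: prior[h] * likelihood[h] for h in prior}
    evidence = sum(unnormalized.values())  # marginal probability of the data
    return {h: weight / evidence for h, weight in unnormalized.items()}


# Illustrative numbers only: 1% prevalence, 95% sensitivity, 5% false positives.
prior = {"disease": 0.01, "healthy": 0.99}
positive_test = {"disease": 0.95, "healthy": 0.05}

after_one = bayes_update(prior, positive_test)
# The posterior becomes the prior for the next piece of evidence.
after_two = bayes_update(after_one, positive_test)
```

Feeding the posterior back in as the next prior is what "updated as evidence accumulates" amounts to, under the assumption that the two test results are conditionally independent given the hypothesis.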
Using IoT, AI and cloud to advance home-based integrated care
One of the largest growing demographics in the EU is individuals aged 65 and over, and two thirds of this group live with multimorbidity, i.e., they suffer from two or more chronic diseases. The ineffective treatment of multimorbidity has been pointed out as an urgent problem to address by the Academy of Medical Sciences in a recently released report. As part of an EU H2020 funded project called ProACT, our team at IBM Research – Ireland is working with partners in academia and industry to find new ways to use IoT, AI and cloud technologies to advance self-management capabilities and home-based integrated care for Persons with Multimorbidity (PwM). The ProACT project is investigating ways wearable and home sensors and tablet applications can be used to help persons with multimorbidity, as well as their support actors, which include informal caregivers (e.g. The project includes proof-of-concept trials in Ireland and Belgium, involving national health services, with a number of patients equipped with wearable and home sensors, and their support actors.