Bayesian Inference
WatsonPaths: Scenario-Based Question Answering and Inference over Unstructured Information
Lally, Adam (Information Technology and Services) | Bagchi, Sugato (IBM Research) | Barborak, Michael A. (IBM T. J. Watson Research Center) | Buchanan, David W. (IBM T. J. Watson Research Center) | Chu-Carroll, Jennifer (IBM Research) | Ferrucci, David A. (Bridgewater) | Glass, Michael R. (IBM Research) | Kalyanpur, Aditya (IBM T. J. Watson Research Center) | Mueller, Erik T. (Capital One) | Murdock, J. William (IBM T. J. Watson Research Center) | Patwardhan, Siddharth (IBM T. J. Watson Research Center) | Prager, John M. (IBM T. J. Watson Research Center)
We present WatsonPaths, a novel system that can answer scenario-based questions. These include medical questions that present a patient summary and ask for the most likely diagnosis or most appropriate treatment. WatsonPaths builds on the IBM Watson question answering system. WatsonPaths breaks down the input scenario into individual pieces of information, asks relevant subquestions of Watson to conclude new information, and represents these results in a graphical model. Probabilistic inference is performed over the graph to conclude the answer. On a set of medical test preparation questions, WatsonPaths shows a significant improvement in accuracy over multiple baselines.
AI - The Present in the Making
I attended the Huawei European Innovation Day recently, and was enthralled by how the new technology is giving rise to industrial revolutions. These revolutions are what will eventually unlock the development potential around the world. It is important to leverage the emerging technologies, since they are the resources which will lead us to innovation and progress. Huawei is innovative in its partnerships and collaboration to define the future, and the event was a huge success. For many people, the concept of Artificial Intelligence (AI) is a thing of the future. It is the technology that has yet to be introduced.
Probabilistic Active Learning of Functions in Structural Causal Models
Rubenstein, Paul K., Tolstikhin, Ilya, Hennig, Philipp, Schoelkopf, Bernhard
We consider the problem of learning the functions computing children from parents in a Structural Causal Model once the underlying causal graph has been identified. This is in some sense the second step after causal discovery. Taking a probabilistic approach to estimating these functions, we derive a natural myopic active learning scheme that identifies the intervention which is optimally informative about all of the unknown functions jointly, given previously observed data. We test the derived algorithms on simple examples to demonstrate that they produce a structured exploration policy that significantly improves on unstructured baselines.
Deriving Probability Density Functions from Probabilistic Functional Programs
Bhat, Sooraj, Borgström, Johannes, Gordon, Andrew D., Russo, Claudio
The probability density function of a probability distribution is a fundamental concept in probability theory and a key ingredient in various widely used machine learning methods. However, the necessary framework for compiling probabilistic functional programs to density functions has only recently been developed. In this work, we present a density compiler for a probabilistic language with failure and both discrete and continuous distributions, and provide a proof of its soundness. The compiler greatly reduces the development effort of domain experts, which we demonstrate by solving inference problems from various scientific applications, such as modelling the global carbon cycle, using a standard Markov chain Monte Carlo framework.
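As a small worked illustration of what "deriving a density from a program" means, consider the toy program `x ~ Normal(0, 1); y = exp(x)`. This is not the compiler described in the abstract; it is a hand-derived sketch that applies the change-of-variables rule to one hypothetical program, and all function names below are assumptions made for the example.

```python
import math


def normal_pdf(x, mean=0.0, std=1.0):
    """Density of a Normal(mean, std) random variable at x."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def density_of_exp_normal(y):
    """Density of y = exp(x) with x ~ Normal(0, 1).

    Change of variables: p_Y(y) = p_X(log y) * |d(log y)/dy| = p_X(log y) / y.
    """
    if y <= 0.0:
        return 0.0  # exp(x) is never non-positive
    return normal_pdf(math.log(y)) / y


def integrate(f, lo, hi, steps=100_000):
    """Midpoint-rule quadrature, used only as a sanity check."""
    h = (hi - lo) / steps
    return h * sum(f(lo + (i + 0.5) * h) for i in range(steps))


# A valid density should integrate to (approximately) one.
total_mass = integrate(density_of_exp_normal, 0.0, 100.0)
```

A density in this closed form can be evaluated pointwise, which is what a Markov chain Monte Carlo sampler needs in its acceptance step.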
Towards Bursting Filter Bubble via Contextual Risks and Uncertainties
Takahashi, Rikiya, Zhang, Shunan
A rising topic in computational journalism is how to enhance the diversity of news served to subscribers in order to foster exploration behavior in news reading. Despite the success of preference learning in personalized news recommendation, its over-exploitation causes a filter bubble that isolates readers from opposing viewpoints and hurts long-term user experience through a lack of serendipity. Since news providers can recommend neither opposing nor diversified opinions if the unpopularity of these articles is predicted with certainty, they can only bet on the articles whose click-through-rate forecasts involve high variability (risks) or high estimation errors (uncertainties). We propose a novel Bayesian model of uncertainty-aware scoring and ranking for news articles. The Bayesian binary classifier models the probability of success (defined as a news click) as a Beta-distributed random variable conditional on a context vector (user features, article features, and other contextual features). The posterior of the contextual coefficients can be computed efficiently using a low-rank version of Laplace's method via thin Singular Value Decomposition. Efficiencies in personalized targeting of exceptional articles, which are chosen by each subscriber in the test period, are evaluated on real-world news datasets. The proposed estimator slightly outperformed existing training and scoring algorithms in terms of efficiency in identifying successful outliers.
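The idea of betting on articles with uncertain click-through rates can be sketched with a much simpler, non-contextual model. The code below is not the proposed method: it drops the context vector and the low-rank Laplace approximation and instead assumes a conjugate Beta-Bernoulli posterior per article, with a hypothetical score of posterior mean plus `k` posterior standard deviations. The article names and counts are made up for illustration.

```python
import math


def beta_posterior(clicks, impressions, prior_a=1.0, prior_b=1.0):
    """Beta posterior parameters after observing click counts (uniform prior by default)."""
    a = prior_a + clicks
    b = prior_b + (impressions - clicks)
    return a, b


def beta_mean_std(a, b):
    """Mean and standard deviation of a Beta(a, b) distribution."""
    mean = a / (a + b)
    var = a * b / ((a + b) ** 2 * (a + b + 1.0))
    return mean, math.sqrt(var)


def optimistic_score(clicks, impressions, k=1.0):
    """Hypothetical uncertainty-aware score: posterior mean plus k standard deviations."""
    mean, std = beta_mean_std(*beta_posterior(clicks, impressions))
    return mean + k * std


# Illustrative (made-up) click logs: (clicks, impressions) per article.
articles = {"well_known": (50, 1000), "barely_seen": (1, 20)}
ranking = sorted(articles, key=lambda name: optimistic_score(*articles[name]), reverse=True)
```

With these counts the rarely shown article has a far wider posterior, so an uncertainty-aware score ranks it ahead of the well-observed one, which is the kind of exploration the abstract argues for.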
Bayesian Semisupervised Learning with Deep Generative Models
Gordon, Jonathan, Hernández-Lobato, José Miguel
Neural network based generative models with discriminative components are a powerful approach for semi-supervised learning. However, these techniques a) cannot account for model uncertainty in the estimation of the model's discriminative component and b) lack flexibility to capture complex stochastic patterns in the label generation process. To avoid these problems, we first propose to use a discriminative component with stochastic inputs for increased noise flexibility. We show how an efficient Gibbs sampling procedure can marginalize the stochastic inputs when inferring missing labels in this model. Following this, we extend the discriminative component to be fully Bayesian and produce estimates of uncertainty in its parameter values. This opens the door for semi-supervised Bayesian active learning.
Time Series Cluster Kernel for Learning Similarities between Multivariate Time Series with Missing Data
Mikalsen, Karl Øyvind, Bianchi, Filippo Maria, Soguero-Ruiz, Cristina, Jenssen, Robert
Similarity-based approaches represent a promising direction for time series analysis. However, many such methods rely on parameter tuning, and some have shortcomings if the time series are multivariate (MTS), due to dependencies between attributes, or if the time series contain missing data. In this paper, we address these challenges within the powerful context of kernel methods by proposing the robust time series cluster kernel (TCK). The approach leverages the missing data handling properties of Gaussian mixture models (GMMs) augmented with informative prior distributions. An ensemble learning approach is exploited to ensure robustness to parameters by combining the clustering results of many GMMs to form the final kernel. We evaluate the TCK on synthetic and real data and compare it to other state-of-the-art techniques. The experimental results demonstrate that the TCK is robust to parameter choices, provides competitive results for MTS without missing data, and provides outstanding results for MTS with missing data.
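The step of combining many clusterings into one kernel can be sketched as follows. This is a minimal illustration under an assumption: each ensemble member supplies, for every time series, a vector of posterior probabilities over its mixture components, and the kernel sums the inner products of those vectors across members. The GMM fitting itself is omitted, and the posterior values below are hand-written placeholders, not results.

```python
def ensemble_kernel(posteriors_per_member):
    """Build a kernel matrix from an ensemble of soft clusterings.

    posteriors_per_member[q][n] is the posterior over mixture components
    for series n under ensemble member q. Members may use different
    numbers of components.
    """
    n_series = len(posteriors_per_member[0])
    kernel = [[0.0] * n_series for _ in range(n_series)]
    for posteriors in posteriors_per_member:
        for i in range(n_series):
            for j in range(n_series):
                # Inner product of the two soft assignments under this member.
                kernel[i][j] += sum(p * q for p, q in zip(posteriors[i], posteriors[j]))
    return kernel


# Placeholder posteriors for three series under two ensemble members.
member_a = [[0.9, 0.1], [0.8, 0.2], [0.1, 0.9]]
member_b = [[0.7, 0.2, 0.1], [0.6, 0.3, 0.1], [0.0, 0.1, 0.9]]
K = ensemble_kernel([member_a, member_b])
```

Series that are repeatedly assigned to the same components (here the first two) accumulate a larger kernel value than series that are usually separated. Each term is a Gram matrix of real vectors, so the sum is symmetric and positive semidefinite.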
Approximation of probability density functions on the Euclidean group parametrized by dual quaternions
Perception is fundamental to many robot application areas, especially in service robotics. Our aim is to perceive and model an unprepared kitchen scenario with many objects. We start with the perception of a single target object. The modeling relies especially on fusing and merging weak information from the sensors of the robot in order to localize objects. This requires the representation of various probability distributions of pose in $S_3 \times \mathbb{R}^3$, as both orientation and position have to be localized. In this thesis I present a framework for probabilistic modeling of poses in $S_3 \times \mathbb{R}^3$ that represents a large class of probability distributions and provides, among others, the operations of fusing and merging estimates. Further, it offers the propagation of uncertain information. I work out why we choose to represent the orientation part of a pose by a unit quaternion. The translation part is described either by a 3-dimensional vector or by a purely imaginary quaternion. This depends on whether we define the probability density function or whether we want to represent a transformation, which consists of a rotation and a translation, by a dual quaternion. A basic probability density function over the poses is defined by a tangent point on the hypersphere and a 6-dimensional Gaussian distribution. The hypersphere is embedded in $\mathbb{R}^4$, which represents the unit quaternions, whereas the Gaussian is defined over the product of the tangent space of the sphere and the space of translations. The projection of this Gaussian onto the hypersphere induces a distribution over poses in $S_3 \times \mathbb{R}^3$. The set of mixtures of projected Gaussians can approximate the probability density functions that arise in our application. Moreover, it is closed under the operations introduced in this framework and allows for an efficient implementation.
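Sampling from a projected Gaussian over poses can be sketched as below. Several points are assumptions rather than details from the abstract: the projection is taken to be a central projection of the tangent space at the identity quaternion, the sample is then moved to the chosen tangent point by quaternion multiplication, and the 6-dimensional Gaussian is simplified to independent isotropic noise with hypothetical parameters `sigma_rot` and `sigma_trans`.

```python
import math
import random


def quat_multiply(p, q):
    """Hamilton product of two quaternions given as (w, x, y, z)."""
    pw, px, py, pz = p
    qw, qx, qy, qz = q
    return (
        pw * qw - px * qx - py * qy - pz * qz,
        pw * qx + px * qw + py * qz - pz * qy,
        pw * qy - px * qz + py * qw + pz * qx,
        pw * qz + px * qy - py * qx + pz * qw,
    )


def project_to_sphere(tangent_vec):
    """Centrally project the tangent-space point (1, v) onto the unit sphere."""
    x, y, z = tangent_vec
    norm = math.sqrt(1.0 + x * x + y * y + z * z)
    return (1.0 / norm, x / norm, y / norm, z / norm)


def sample_pose(tangent_point, mean_translation, sigma_rot, sigma_trans, rng):
    """Draw one pose: a unit quaternion near tangent_point and a noisy translation."""
    v = tuple(rng.gauss(0.0, sigma_rot) for _ in range(3))
    orientation = quat_multiply(tangent_point, project_to_sphere(v))
    translation = tuple(m + rng.gauss(0.0, sigma_trans) for m in mean_translation)
    return orientation, translation


rng = random.Random(0)
tangent_point = (0.0, 1.0, 0.0, 0.0)  # unit quaternion: half-turn about the x axis
orientation, translation = sample_pose(tangent_point, (1.0, 2.0, 3.0), 0.1, 0.05, rng)
```

Because the product of two unit quaternions is again a unit quaternion, every sampled orientation stays on the hypersphere without any further renormalization.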
A Geek's Guide to Machine Learning and Risk analytics and Decisioning Provenir
The greatest challenge when talking about artificial intelligence/machine learning is actually in understanding what data sets we are looking at, and what model/combination of models to apply. Amazon's Machine Learning offering is one example of an automated process which analyses the data and automatically selects the best model to use in the scenario. Other big players who have similar offerings are IBM Watson, Google and Microsoft. Provenir's clients are continually looking at new and innovative ways to improve their risk decisioning. Traditional banks offering consumer, SME and commercial loans and credit, auto lenders, payment providers and fintech companies are using Provenir technology to help them make faster and better decisions about potential fraud. Integrating artificial intelligence/machine learning capabilities into the risk decisioning process can increase the organization's ability to accurately assess the level of risk in order to detect and prevent fraud. Provenir provides model integration adaptors for machine learning models, including Amazon Machine Learning (AML) that can automatically listen for and label business-defined events, calculate attributes and update machine learning models. By combining Provenir technology with machine learning, organizations can increase both the efficiency and predictive accuracy of their risk decisioning.