Directed Networks
Reliable Discretization of Deterministic Equations in Bayesian Networks
Antonucci, Alessandro (Istituto Dalle Molle di Studi sull'Intelligenza Artificiale)
We focus on the problem of modeling deterministic equations over continuous variables in discrete Bayesian networks. This is typically achieved by discretizing both input and output variables and quantifying the corresponding conditional probability tables with degenerate (0/1) entries. This approach, based on classical probabilities, cannot properly model the information loss induced by the discretization. We show that such epistemic uncertainty can instead be modeled reliably by credal sets, i.e., convex sets of probability mass functions. This transforms the original Bayesian network into a credal network, possibly returning interval-valued inferences that are robust with respect to the information loss induced by the discretization. Algorithmic strategies for an optimal choice of the discretization bins are also provided.
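The information loss described above can be illustrated with a minimal sketch. It assumes a hypothetical helper, `reachable_output_bins`, that samples a deterministic function on the interior of each input bin and records which output bins its values fall into. An input bin that maps into a single output bin admits a degenerate table column; one that maps into several is where a set of mass functions over those bins would be needed. This is only an illustration of the idea, not the construction used in the paper.

```python
import numpy as np

def reachable_output_bins(f, x_edges, y_edges, samples=1001):
    """For each input bin, list the output-bin indices hit by f.

    f is sampled on interior points of each input bin (a numerical
    approximation, assumed here for simplicity).
    """
    result = []
    for lo, hi in zip(x_edges[:-1], x_edges[1:]):
        # Interior grid points only, to avoid boundary ambiguity.
        xs = np.linspace(lo, hi, samples + 2)[1:-1]
        ys = f(xs)
        # Index of the output bin containing each value.
        idx = np.searchsorted(y_edges, ys, side="right") - 1
        idx = np.clip(idx, 0, len(y_edges) - 2)
        result.append(sorted(set(idx.tolist())))
    return result

# Hypothetical example: y = x^2 with two input bins and three output bins.
bins = reachable_output_bins(lambda x: x ** 2, [0, 1, 2], [0, 1, 2, 4])
print(bins)
```

In this example the first input bin maps into one output bin, while the second spreads over two, so a single degenerate column would misrepresent it.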
Learning Behavioral Memory Representations from Observation
Wong, Josiah (University of Central Florida) | Gonzalez, Avelino J. (University of Central Florida)
Learning from Observation (LfO) is highly useful for modeling behaviors through nonintrusive observation of some actor's performance. However, an actor's performance is often influenced by unobservable internal influences, such as emotions, agendas, and memory of past events. Therefore, new techniques are needed to infer the structure of these influences and their effect on an actor's decisions. In this paper, we propose a novel approach called Memory Composition Learning (MCL) for capturing one internal influence: memory of past events. We hypothesize that memory influences on a behavior can be modeled through parameterized memory features that can be learned from observation of traces of an actor's behavior; these memory features can then be presented as additional input to a performance modeling application. We demonstrate the efficacy of our approach in a simulated vacuum cleaner domain and show that hidden memory influences can be detected, modeled, and then used to improve machine learning performance.
Influence-Based Independence
Özçep, Özgür L. (University of Lübeck) | Kuhr, Felix (University of Lübeck) | Möller, Ralf (University of Lübeck)
Conditional independence structures describe independencies of one set of variables from another set of variables conditioned upon a third set of variables. These structures are an invaluable means of representing knowledge compactly because independencies can be exploited for useful factorizations. Conditional independence structures appear in different guises in various areas of knowledge representation: as the conditional independence of sets of random variables in probabilistic graphical models such as Bayesian networks, as conditional functions related to belief revision, or as independencies induced by (embedded) multivalued dependencies in databases. This paper investigates conditional independencies for Boolean functions using Fourier analysis. We define three notions of independence based on the notion of influence of a variable on a function and draw connections to multivalued dependencies.
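As a rough illustration of the influence of a variable on a Boolean function, the sketch below uses the standard definition from the analysis of Boolean functions: the fraction of inputs on which flipping one coordinate changes the output. The function names and the examples are assumptions made for illustration, and the three independence notions of the paper are not reproduced here.

```python
from itertools import product

def influence(f, n, i):
    """Fraction of inputs in {0,1}^n where flipping coordinate i changes f."""
    changed = 0
    for x in product((0, 1), repeat=n):
        flipped = list(x)
        flipped[i] ^= 1  # flip the i-th bit
        if f(x) != f(tuple(flipped)):
            changed += 1
    return changed / 2 ** n

def majority3(x):
    """Majority vote of three bits."""
    return int(sum(x) >= 2)

def first_only(x):
    """Depends on the first bit only."""
    return x[0]

print(influence(majority3, 3, 0))   # 0.5
print(influence(first_only, 3, 0))  # 1.0
print(influence(first_only, 3, 1))  # 0.0
```

A variable with influence zero, such as the second bit of `first_only`, never affects the output, which is the intuition behind treating the function as independent of it.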
Hierarchical Classification With Bayesian Networks and Chained Classifiers
Serrano-Pérez, Jonathan (Instituto Nacional de Astrofísica Óptica y Electrónica) | Sucar, Luis Enrique (Instituto Nacional de Astrofísica Óptica y Electrónica)
This work proposes a method for hierarchical classification that takes advantage of the hierarchical structure to let the predictions of neighboring local classifiers influence one another. To achieve this, two strategies are combined. The first is to represent the hierarchical structure as a Bayesian network, and the second is to build chained classifiers that feed the Bayesian network as local classifiers. The proposed method was tested on several functional genomics datasets, which consist of tree-structured hierarchies. The results of several variants of the proposed method are compared to the standard Flat and Top-Down methods, as well as to a state-of-the-art technique, showing superior performance under several metrics.
Meta reinforcement learning as task inference
Humplik, Jan, Galashov, Alexandre, Hasenclever, Leonard, Ortega, Pedro A., Teh, Yee Whye, Heess, Nicolas
Humans achieve efficient learning by relying on prior knowledge about the structure of naturally occurring tasks. There has been considerable interest in designing reinforcement learning algorithms with similar properties. This includes several proposals to learn the learning algorithm itself, an idea also referred to as meta learning. One formal interpretation of this idea is in terms of a partially observable multi-task reinforcement learning problem in which information about the task is hidden from the agent. Although agents that solve partially observable environments can be trained from rewards alone, shaping an agent's memory with additional supervision has been shown to boost learning efficiency. It is thus natural to ask what kind of supervision, if any, facilitates meta-learning. Here we explore several choices and develop an architecture that separates learning of the belief about the unknown task from learning of the policy, and that can be used effectively with privileged information about the task during training. We show that this approach can be very effective at solving standard meta-RL environments, as well as a complex continuous control environment in which a simulated robot has to execute various movement sequences.
Output-Constrained Bayesian Neural Networks
Yang, Wanqian, Lorch, Lars, Graule, Moritz A., Srinivasan, Srivatsan, Suresh, Anirudh, Yao, Jiayu, Pradier, Melanie F., Doshi-Velez, Finale
Bayesian neural network (BNN) priors are defined in parameter space, making it hard to encode prior knowledge expressed in function space. We formulate a prior that incorporates functional constraints about what the output can or cannot be in regions of the input space. Output-Constrained BNNs (OC-BNNs) represent an interpretable approach to enforcing a range of constraints, fully consistent with the Bayesian framework and amenable to black-box inference. We demonstrate how OC-BNNs improve model robustness and prevent the prediction of infeasible outputs in two real-world applications in healthcare and robotics.
Distribution Calibration for Regression
Song, Hao, Diethe, Tom, Kull, Meelis, Flach, Peter
We are concerned with obtaining well-calibrated output distributions from regression models. Such distributions allow us to quantify the uncertainty that the model has regarding the predicted target value. We introduce the novel concept of distribution calibration, and demonstrate its advantages over the existing definition of quantile calibration. We further propose a post-hoc approach to improving the predictions from previously trained regression models, using multi-output Gaussian Processes with a novel Beta link function. The proposed method is experimentally verified on a set of common regression models and shows improvements for both distribution-level and quantile-level calibration.
Information criteria for non-normalized models
Matsuda, Takeru, Uehara, Masatoshi, Hyvarinen, Aapo
Many statistical models are given in the form of non-normalized densities with an intractable normalization constant. Since maximum likelihood estimation is computationally intensive for these models, several estimation methods have been developed which do not require explicit computation of the normalization constant, such as noise contrastive estimation (NCE) and score matching. However, model selection methods for general non-normalized models have not been proposed so far. In this study, we develop information criteria for non-normalized models estimated by NCE or score matching. They are derived as approximately unbiased estimators of discrepancy measures for non-normalized models. Experimental results demonstrate that the proposed criteria enable selection of the appropriate non-normalized model in a data-driven manner. Extension to a finite mixture of non-normalized models is also discussed.
Evaluation of Machine Learning Algorithms for Intrusion Detection System
We gauge the accuracy of the machine learning models with three metrics: Average Accuracy, False Positive Rate, and False Negative Rate. K-Means is excluded from this evaluation because it is an unsupervised algorithm. Average Accuracy is defined as the ratio of correctly classified data points to the total number of data points. False positives are benign cases that are incorrectly flagged as threats. False negatives are the opposite: actual threats that are not flagged.
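The three metrics can be sketched as follows. The sketch assumes binary labels where 1 means threat and 0 means benign, and it uses the conventional rate definitions (false positives over all benign cases, false negatives over all threats), which the text does not spell out. The function name `intrusion_metrics` is hypothetical.

```python
def intrusion_metrics(y_true, y_pred):
    """Return (accuracy, false positive rate, false negative rate).

    Assumes labels are 1 for threat and 0 for benign.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # benign flagged as threat
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # threat missed
    accuracy = (tp + tn) / len(pairs)
    fpr = fp / (fp + tn) if (fp + tn) else 0.0
    fnr = fn / (fn + tp) if (fn + tp) else 0.0
    return accuracy, fpr, fnr

# Hypothetical labels: two threats, three benign samples.
acc, fpr, fnr = intrusion_metrics([1, 1, 0, 0, 0], [1, 0, 0, 1, 0])
print(acc, fpr, fnr)
```

Reporting the two error rates separately matters for intrusion detection because a missed threat and a false alarm usually carry very different costs.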
Seismic Bayesian evidential learning: Estimation and uncertainty quantification of sub-resolution reservoir properties
Pradhan, Anshuman, Mukerji, Tapan
We present a framework that enables estimation of low-dimensional sub-resolution reservoir properties directly from seismic data, without requiring the solution of a high-dimensional seismic inverse problem. Our workflow is based on the Bayesian evidential learning approach and learns the direct relation between seismic data and reservoir properties in order to estimate those properties efficiently. The theoretical framework we develop allows incorporation of non-linear statistical models for seismic estimation problems. Uncertainty quantification is performed with Approximate Bayesian Computation. Using a synthetic example that estimates reservoir net-to-gross and average fluid saturations in a sub-resolution thin-sand reservoir, we highlight several nuances regarding the applicability of unsupervised and supervised learning methods to seismic estimation problems. Finally, we demonstrate the efficacy of our approach by estimating the posterior uncertainty of reservoir net-to-gross in a sub-resolution thin-sand reservoir from an offshore delta dataset using 3D pre-stack seismic data.