Bayesian Inference
Destination Prediction by Trajectory Distribution Based Model
Besse, Philippe C., Guillouet, Brendan, Loubes, Jean-Michel, Royer, Francois
Monitoring and predicting road traffic is of great importance for traffic managers. With the increase of mobile sensors, such as GPS devices and smartphones, much information is at hand to understand urban traffic. In the last few years, a large amount of research has been conducted in order to use this data to model and analyze road traffic conditions. The aim of this paper is to tackle the issue of predicting the destination of vehicles given a prefix of their trajectory. This problem has been the subject of a Kaggle challenge entitled "ECML/PKDD 15: Taxi Trajectory Prediction (I)" [1]. The observations are time-stamped locations that correspond to the different positions of vehicles moving within a city monitored at different observation times. When dealing with a dataset composed of trajectories, the difficulty lies in the fact that the data convey both spatial information (locations of the vehicles on the map of the city) and temporal information (for each vehicle, the locations are indexed by time, which creates a sequence of locations that compose a full trajectory). Hence the data have a spatiotemporal structure that must be taken into account in order to model their evolution while the trajectories of the destination points to be predicted are unknown. Vehicle trajectories are also constrained to a road network, which makes their time progression very irregular.
A Bayesian approach to constrained single- and multi-objective optimization
Feliot, Paul, Bect, Julien, Vazquez, Emmanuel
This article addresses the problem of derivative-free (single- or multi-objective) optimization subject to multiple inequality constraints. Both the objective and constraint functions are assumed to be smooth, non-linear and expensive to evaluate. As a consequence, the number of evaluations that can be used to carry out the optimization is very limited, as in complex industrial design optimization problems. The method we propose to overcome this difficulty has its roots in both the Bayesian and the multi-objective optimization literatures. More specifically, an extended domination rule is used to handle objectives and constraints in a unified way, and a corresponding expected hyper-volume improvement sampling criterion is proposed. This new criterion is naturally adapted to the search for a feasible point when none is available, and reduces to existing Bayesian sampling criteria, namely the classical Expected Improvement (EI) criterion and some of its constrained/multi-objective extensions, as soon as at least one feasible point is available. The calculation and optimization of the criterion are performed using Sequential Monte Carlo techniques. In particular, an algorithm similar to the subset simulation method, which is well known in the field of structural reliability, is used to estimate the criterion. The method, which we call BMOO (for Bayesian Multi-Objective Optimization), is compared to state-of-the-art algorithms for single- and multi-objective constrained optimization.
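For readers unfamiliar with the classical Expected Improvement criterion mentioned in this abstract, the following is a minimal sketch of its closed form for a single unconstrained objective, assuming minimization and a Gaussian posterior at the candidate point. It is not the BMOO criterion itself; the function names and the arguments `mean`, `std` and `best` are illustrative assumptions.

```python
import math


def normal_pdf(z):
    """Density of the standard normal distribution."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def normal_cdf(z):
    """Cumulative distribution function of the standard normal."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def expected_improvement(mean, std, best):
    """Classical EI for minimization.

    mean, std: posterior mean and standard deviation of the objective
               at a candidate point (assumed Gaussian).
    best:      lowest objective value observed so far.
    """
    if std <= 0.0:
        # No posterior uncertainty: improvement is deterministic.
        return max(best - mean, 0.0)
    z = (best - mean) / std
    return (best - mean) * normal_cdf(z) + std * normal_pdf(z)
```

A candidate whose posterior mean equals the current best still has positive EI as long as `std > 0`, which is what drives exploration.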
Bayesian Network-Based Extension for PGP — Estimating Petition Support
Silaghi, Marius (Florida Institute of Technology) | Qin, Song (Florida Institute of Technology) | Matsui, Toshihiro (Nagoya Institute of Technology) | Yokoo, Makoto (Kyushu University)
Consider the problem of estimating the expected number of distinct eligible voters among the authors of a set of electronic signatures gathered for a petition (or citizen initiative) that has to pass legally required thresholds. We formalize this problem and propose an extension to the Pretty Good Privacy Web Of Trust, a mechanism for reciprocally certifying identities between peers. The extension (a) enables agents to certify additional relevant statements about others, and (b) gives agents opportunities for negative authentication statements (e.g., on ineligibility of an identity). A Bayesian Network model enables inferences on the data provided by the proposed PGP extension. Simulations and an agent-based platform are used to validate the concepts.
A Noisy-OR Model for Continuous Time Bayesian Networks
Perreault, Logan (Montana State University) | Strasser, Shane (Montana State University) | Thornton, Monica (Montana State University) | Sheppard, John (Montana State University)
A continuous time Bayesian network is a graphical model capable of describing discrete state systems that evolve in continuous time. Unfortunately, the number of parameters required for each node in the graph is exponential in the number of parents of the node, which can be prohibitively large for many real-world systems. To mitigate this problem, we propose a Noisy-OR model for continuous time Bayesian networks, which can reduce the number of required parameters from exponential to linear. We describe the model, as well as the process required to compute the remaining unspecified parameters. Finally, we experimentally validate the correctness of the proposed Noisy-OR formulation.
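The exponential-to-linear reduction can be illustrated with the classical Noisy-OR model for ordinary discrete Bayesian networks. This is only a sketch of the standard binary case, not the continuous time formulation proposed in the paper; the names `inhibit` and `leak` are illustrative assumptions.

```python
from itertools import product


def noisy_or_probability(parent_states, inhibit, leak=0.0):
    """P(child = 1 | parents) under the classical binary Noisy-OR.

    parent_states: tuple of 0/1 values, one per parent.
    inhibit:       inhibit[i] is the probability that parent i, when
                   active, fails to turn the child on.
    leak:          probability that the child turns on with no active parent.
    """
    p_off = 1.0 - leak
    for active, q in zip(parent_states, inhibit):
        if active:
            p_off *= q
    return 1.0 - p_off


def full_cpt(inhibit, leak=0.0):
    """Expand the linear parameter list into the full conditional table."""
    n = len(inhibit)
    return {
        states: noisy_or_probability(states, inhibit, leak)
        for states in product((0, 1), repeat=n)
    }
```

With n parents, `inhibit` holds n numbers while the table returned by `full_cpt` has 2**n rows, which is the gap the Noisy-OR assumption closes.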
Bayesian Networks with Conditional Truncated Densities
Cortijo, Santiago (LIP6 - UPMC) | Gonzales, Christophe (LIP6 - UPMC)
The majority of Bayesian network learning and inference algorithms rely on the assumption that all random variables are discrete, which is not necessarily the case in real-world problems. In situations where some variables are continuous, a trade-off between the expressive power of the model and the computational complexity of inference has to be made: on one hand, conditional Gaussian models are computationally efficient but they lack expressive power; on the other hand, mixtures of exponentials (MTE), bases or polynomials are expressive but this comes at the expense of tractability. In this paper, we propose an alternative model that lies in between. It is composed of a "discrete" Bayesian network (BN) combined with a set of monodimensional conditional truncated densities modeling the uncertainty over the continuous random variables given their discrete counterpart resulting from a discretization process. We show that inference computation times in this new model are close to those in discrete BNs. Experiments confirm the tractability of the model and highlight its expressive power by comparing it with MTE.
Testing Independencies in Bayesian Networks with i-Separation
Butz, Cory J. (University of Regina) | Santos, André E. dos (University of Regina) | Oliveira, Jhonatan S. (University of Regina) | Gonzales, Christophe (Université Pierre et Marie Curie)
Testing independencies in Bayesian networks (BNs) is a fundamental task in probabilistic reasoning. In this paper, we propose inaugural-separation (i-separation) as a new method for testing independencies in BNs. We establish the correctness of i-separation. Our method has several theoretical and practical advantages. There are at least five ways in which i-separation is simpler than d-separation, the classical method for testing independencies in BNs, of which the most important is that "blocking" works in an intuitive fashion. In practice, our empirical evaluation shows that i-separation tends to be faster than d-separation in large BNs.
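As a point of reference for the classical baseline, here is a minimal sketch of a d-separation test using the well-known ancestral moral graph construction. It does not implement i-separation, whose details are not given in this abstract; the `parents` dictionary representation is an illustrative assumption, and the query sets are assumed to be disjoint.

```python
def d_separated(parents, xs, ys, zs):
    """Return True if xs and ys are d-separated given zs in a DAG.

    parents: dict mapping each node to a list of its parents
             (nodes without parents may be omitted).
    xs, ys, zs: disjoint iterables of nodes.
    """
    xs, ys, zs = set(xs), set(ys), set(zs)

    # Step 1: keep only the query nodes and their ancestors.
    relevant = set()
    stack = list(xs | ys | zs)
    while stack:
        node = stack.pop()
        if node not in relevant:
            relevant.add(node)
            stack.extend(parents.get(node, ()))

    # Step 2: moralize, i.e. connect co-parents and drop edge directions.
    neighbours = {node: set() for node in relevant}
    for child in relevant:
        ps = list(parents.get(child, ()))
        for p in ps:
            neighbours[child].add(p)
            neighbours[p].add(child)
        for i, a in enumerate(ps):
            for b in ps[i + 1:]:
                neighbours[a].add(b)
                neighbours[b].add(a)

    # Step 3: search from xs while never entering a node in zs.
    seen = set(xs)
    stack = list(xs)
    while stack:
        node = stack.pop()
        if node in ys:
            return False
        for nxt in neighbours[node]:
            if nxt not in zs and nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return True
```

For the collider A -> C <- B, the test reports A and B as independent when nothing is observed and as dependent once C is observed, which is the behaviour that makes "blocking" unintuitive in d-separation.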
Bayesian Network Inference with Simple Propagation
Butz, Cory J. (University of Regina) | Oliveira, Jhonatan S. (University of Regina) | Santos, André E. dos (University of Regina) | Madsen, Anders L. (HUGIN Expert A/S and Aalborg University)
We propose Simple Propagation (SP) as a new join tree propagation algorithm for exact inference in discrete Bayesian networks. We establish the correctness of SP. The striking feature of SP is that its message construction exploits the factorization of potentials at a sending node, but without the overhead of building and examining graphs as done in Lazy Propagation (LP). Experimental results on numerous benchmark Bayesian networks show that SP is often faster than LP.
A summary on Maximum likelihood Estimator
A general method of building a predictive model starts with least squares estimation (LSE). We then need to work on the residuals: find confidence intervals for the parameters and test how well the model fits the data, both of which rely on the assumption that the residuals (or noise) are normally distributed. Unfortunately, this assumption is not guaranteed. Most of the time, the graph of the residuals looks like some distribution other than the normal. At that point you could add one more factor term to your model to filter out the non-normally distributed noise, and then compute the LSE again.
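The first two steps of this workflow can be sketched for the simplest case of a straight-line model with one predictor. The sample skewness used here is only one crude, assumed indicator of departure from normality (a normal distribution has skewness 0); the function names are illustrative.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def residuals(xs, ys, slope, intercept):
    """Observed value minus fitted value for each data point."""
    return [y - (slope * x + intercept) for x, y in zip(xs, ys)]


def sample_skewness(values):
    """Third standardized moment; returns 0.0 for constant input."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    if m2 == 0.0:
        return 0.0
    return m3 / m2 ** 1.5
```

A skewness far from 0 in the residuals is a hint that the normality assumption behind the usual confidence intervals may not hold.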
Distributed Learning with Infinitely Many Hypotheses
Nedić, Angelia, Olshevsky, Alex, Uribe, César
We consider a distributed learning setup where a network of agents sequentially access realizations of a set of random variables with unknown distributions. The network objective is to find a parametrized distribution that best describes their joint observations in the sense of the Kullback-Leibler divergence. Unlike recent efforts in the literature, we analyze the case of countably many hypotheses and the case of a continuum of hypotheses. We provide non-asymptotic bounds for the concentration rate of the agents' beliefs around the correct hypothesis in terms of the number of agents, the network parameters, and the learning abilities of the agents. Additionally, we provide a novel motivation for a general set of distributed Non-Bayesian update rules as instances of the distributed stochastic mirror descent algorithm.
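To make the Kullback-Leibler objective concrete, the sketch below selects, from a finite set of candidate distributions, the one closest to an empirical distribution. It is a single-agent, finite-hypothesis illustration only, not the distributed update rule or the infinite-hypothesis setting studied in the paper; the hypothesis names and the dictionary layout are illustrative assumptions.

```python
import math


def kl_divergence(p, q):
    """KL(p || q) for two discrete distributions over the same outcomes."""
    total = 0.0
    for pi, qi in zip(p, q):
        if pi > 0.0:
            if qi == 0.0:
                # p puts mass where q has none: divergence is infinite.
                return math.inf
            total += pi * math.log(pi / qi)
    return total


def closest_hypothesis(empirical, hypotheses):
    """Name of the hypothesis minimizing KL(empirical || hypothesis)."""
    return min(
        hypotheses,
        key=lambda name: kl_divergence(empirical, hypotheses[name]),
    )
```

For example, with observed frequencies `[0.7, 0.3]` and candidates `{"fair": [0.5, 0.5], "biased": [0.75, 0.25]}`, the second candidate has the smaller divergence and is selected.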
Provable Bayesian Inference via Particle Mirror Descent
Dai, Bo, He, Niao, Dai, Hanjun, Song, Le
Bayesian methods are appealing for their flexibility in modeling complex data and their ability to capture uncertainty in parameters. However, when Bayes' rule does not result in a tractable closed form, most approximate inference algorithms lack either scalability or rigorous guarantees. To tackle this challenge, we propose a simple yet provable algorithm, Particle Mirror Descent (PMD), to iteratively approximate the posterior density. PMD is inspired by stochastic functional mirror descent, where one descends in the density space using a small batch of data points at each iteration, and by particle filtering, where one uses samples to approximate a function. We prove a result of the first kind: with $m$ particles, PMD provides a posterior density estimator that converges in terms of KL-divergence to the true posterior at rate $O(1/\sqrt{m})$. We demonstrate the competitive empirical performance of PMD compared to several approximate inference algorithms in mixture models, logistic regression, sparse Gaussian processes and latent Dirichlet allocation on large scale datasets.