Directed Networks
Fast Approximate Bayesian Computation for Estimating Parameters in Differential Equations
Ghosh, Sanmitra, Dasmahapatra, Srinandan, Maharatna, Koushik
Approximate Bayesian computation (ABC) using a sequential Monte Carlo method provides a comprehensive platform for parameter estimation, model selection and sensitivity analysis in differential equations. However, this method, like other Monte Carlo methods, incurs a significant computational cost as it requires explicit numerical integration of differential equations to carry out inference. In this paper we propose a novel method for circumventing the requirement of explicit integration by using derivatives of Gaussian processes to smooth the observations from which parameters are estimated. We evaluate our methods using synthetic data generated from model biological systems described by ordinary and delay differential equations. Upon comparing the performance of our method to existing ABC techniques, we demonstrate that it produces comparably reliable parameter estimates at a significantly reduced execution time.
Scalable Gaussian Process Classification via Expectation Propagation
Hernรกndez-Lobato, Daniel, Hernรกndez-Lobato, Josรฉ Miguel
Variational methods have been recently considered for scaling the training process of Gaussian process classifiers to large datasets. As an alternative, we describe here how to train these classifiers efficiently using expectation propagation. The proposed method allows for handling datasets with millions of data instances. More precisely, it can be used for (i) training in a distributed fashion where the data instances are sent to different nodes in which the required computations are carried out, and for (ii) maximizing an estimate of the marginal likelihood using a stochastic approximation of the gradient. Several experiments indicate that the method described is competitive with the variational approach.
On the Convergence of Stochastic Variational Inference in Bayesian Networks
We highlight a pitfall when applying stochastic variational inference to general Bayesian networks. For global random variables approximated by an exponential family distribution, natural gradient steps, commonly starting from a unit length step size, are averaged to convergence. This useful insight into the scaling of initial step sizes is lost when the approximation factorizes across a general Bayesian network, and care must be taken to ensure practical convergence. We experimentally investigate how much of the baby (well-scaled steps) is thrown out with the bath water (exact gradients).
How to Center Binary Deep Boltzmann Machines
Melchior, Jan, Fischer, Asja, Wiskott, Laurenz
This work analyzes centered binary Restricted Boltzmann Machines (RBMs) and binary Deep Boltzmann Machines (DBMs), where centering is done by subtracting offset values from visible and hidden variables. We show analytically that (i) centering results in a different but equivalent parameterization for artificial neural networks in general, (ii) the expected performance of centered binary RBMs/DBMs is invariant under simultaneous flip of data and offsets, for any offset value in the range of zero to one, (iii) centering can be reformulated as a different update rule for normal binary RBMs/DBMs, and (iv) using the enhanced gradient is equivalent to setting the offset values to the average over model and data mean. Furthermore, numerical simulations suggest that (i) optimal generative performance is achieved by subtracting mean values from visible as well as hidden variables, (ii) centered RBMs/DBMs reach significantly higher log-likelihood values than normal binary RBMs/DBMs, (iii) centering variants whose offsets depend on the model mean, like the enhanced gradient, suffer from severe divergence problems, (iv) learning is stabilized if an exponentially moving average over the batch means is used for the offset values instead of the current batch mean, which also prevents the enhanced gradient from diverging, (v) centered RBMs/DBMs reach higher LL values than normal RBMs/DBMs while having a smaller norm of the weight matrix, (vi) centering leads to an update direction that is closer to the natural gradient and that the natural gradient is extremly efficient for training RBMs, (vii) centering dispense the need for greedy layer-wise pre-training of DBMs, (viii) furthermore we show that pre-training often even worsen the results independently whether centering is used or not, and (ix) centering is also beneficial for auto encoders.
Probabilistic Backpropagation for Scalable Learning of Bayesian Neural Networks
Hernรกndez-Lobato, Josรฉ Miguel, Adams, Ryan P.
Large multilayer neural networks trained with backpropagation have recently achieved state-of-the-art results in a wide range of problems. However, using backprop for neural net learning still has some disadvantages, e.g., having to tune a large number of hyperparameters to the data, lack of calibrated probabilistic predictions, and a tendency to overfit the training data. In principle, the Bayesian approach to learning neural networks does not have these problems. However, existing Bayesian techniques lack scalability to large dataset and network sizes. In this work we present a novel scalable method for learning Bayesian neural networks, called probabilistic backpropagation (PBP). Similar to classical backpropagation, PBP works by computing a forward propagation of probabilities through the network and then doing a backward computation of gradients. A series of experiments on ten real-world datasets show that PBP is significantly faster than other techniques, while offering competitive predictive abilities. Our experiments also show that PBP provides accurate estimates of the posterior variance on the network weights.
Learning Behaviors in Agents Systems with Interactive Dynamic Influence Diagrams
Conroy, Ross (Teesside University) | Zeng, Yifeng (Teesside University) | Cavazza, Marc (Teesside University) | Chen, Yingke (University of Georgia)
Interactive dynamic influence diagrams(I-DIDs) are a well recognized decision model that explicitly considers how multiagent interaction affects individual decision making. To predict behavior of other agents, I-DIDs require models of the other agents to be known ahead of time and manually encoded. This becomes a barrier to I-DID applications in a human-agent interaction setting, such as development of intelligent non-player characters(NPCs) in real-time strategy(RTS) games, where models of other agents or human players are often inaccessible to domain experts. In this paper, we use automatic techniques for learning behavior of other agents from replay data in RTS games. We propose a learning algorithm with improvement over existing work by building a full profile of agent behavior. This is the first time that data-driven learning techniques are embedded into the I-DID decision making framework. We evaluate the performance of our approach on two test cases.
Adaptive Dropout Rates for Learning with Corrupted Features
Zhuo, Jingwei (Tsinghua University) | Zhu, Jun (Tsinghua University) | Zhang, Bo (Tsinghua University)
Feature noising is an effective mechanism on reducing the risk of overfitting. To avoid an explosive searching space, existing work typically assumes that all features share a single noise level, which is often cross-validated. In this paper, we present a Bayesian feature noising model that flexibly allows for dimension-specific or group-specific noise levels, and we derive a learning algorithm that adaptively updates these noise levels. Our adaptive rule is simple and interpretable, by drawing a direct connection to the fitness of each individual feature or feature group. Empirical results on various datasets demonstrate the effectiveness on avoiding extensive tuning and sometimes improving the performance due to its flexibility.
CoBots: Robust Symbiotic Autonomous Mobile Service Robots
Veloso, Manuela (Carnegie Mellon University) | Biswas, Joydeep (Carnegie Mellon University) | Coltin, Brian (Carnegie Mellon University) | Rosenthal, Stephanie (Carnegie Mellon University)
We research and develop autonomous mobile service robots as Collaborative Robots, i.e., CoBots. For the last three years, our four CoBots have autonomously navigated in our multi-floor office buildings for more than 1,000km, as the result of the integration of multiple perceptual, cognitive, and actuations representations and algorithms. In this paper, we identify a few core aspects of our CoBots underlying their robust functionality. The reliable mobility in the varying indoor environments comes from a novel episodic non-Markov localization. Service tasks requested by users are the input to a scheduler that can consider different types of constraints, including transfers among multiple robots. With symbiotic autonomy, the CoBots proactively seek external sources of help to fill-in for their inevitable occasional limitations. We present sampled results from a deployment and conclude with a brief review of other features of our service robots.
Exploiting Trust Information to Cope with Malicious Entities in Multi-Agent Systems
Irissappane, Athirai A. (Nanyang Technological University)
Our research is within the area of artificial intelligence and multi-agent systems. More specifically, we focus on evaluating trust relationships between the agents in multi-agent e-marketplaces and sensor networks and aim to address the following problems: 1) how to identify a trustworthy (good quality) agent; 2) how to cope with dishonest advisors i.e., agents who provide misleading opinions about others.
Inapproximability of Treewidth and Related Problems (Extended Abstract)
Wu, Yu (Facebook AI Research Lab) | Austrin, Per (KTH Royal Insititute of Technology) | Pitassi, Toniann (University of Toronto) | Liu, David (University of Toronto)
Graphical models, such as Bayesian Networks and Markov networks play an important role in artificial intelligence and machine learning. Inference is a central problem to be solved on these networks. This, and other problems on these graph models are often known to be hard to solve in general, but tractable on graphs with bounded Treewidth. Therefore, finding or approximating the Treewidth of a graph is a fundamental problem related to inference in graphical models. In this paper, we study the approximability of a number of graph problems: Treewidth and Pathwidth of graphs, Minimum Fill-In, and a variety of different graph layout problems such as Minimum Cut Linear Arrangement. We show that, assuming Small Set Expansion Conjecture, all of these problems are NP-hard to approx- imate to within any constant factor in polynomial time.