Undirected Networks
Solving Non-parametric Inverse Problem in Continuous Markov Random Field using Loopy Belief Propagation
In this paper, we address the inverse problem, or the statistical machine learning problem, in Markov random fields with a non-parametric pair-wise energy function with continuous variables. The inverse problem is formulated by maximum likelihood estimation. The exact treatment of maximum likelihood estimation is intractable because of two problems: (1) it includes the evaluation of the partition function and (2) it is formulated in the form of functional optimization. We avoid Problem (1) by using Bethe approximation. Bethe approximation is an approximation technique equivalent to the loopy belief propagation. Problem (2) can be solved by using orthonormal function expansion. Orthonormal function expansion can reduce a functional optimization problem to a function optimization problem. Our method can provide an analytic form of the solution of the inverse problem within the framework of Bethe approximation.
Fast Optimization of Wildfire Suppression Policies with SMAC
McGregor, Sean, Houtman, Rachel, Montgomery, Claire, Metoyer, Ronald, Dietterich, Thomas G.
Managers of US National Forests must decide what policy to apply for dealing with lightning-caused wildfires. Conflicts among stakeholders (e.g., timber companies, home owners, and wildlife biologists) have often led to spirited political debates and even violent eco-terrorism. One way to transform these conflicts into multi-stakeholder negotiations is to provide a high-fidelity simulation environment in which stakeholders can explore the space of alternative policies and understand the tradeoffs therein. Such an environment needs to support fast optimization of MDP policies so that users can adjust reward functions and analyze the resulting optimal policies. This paper assesses the suitability of SMAC---a black-box empirical function optimization algorithm---for rapid optimization of MDP policies. The paper describes five reward function components and four stakeholder constituencies. It then introduces a parameterized class of policies that can be easily understood by the stakeholders. SMAC is applied to find the optimal policy in this class for the reward functions of each of the stakeholder constituencies. The results confirm that SMAC is able to rapidly find good policies that make sense from the domain perspective. Because the full-fidelity forest fire simulator is far too expensive to support interactive optimization, SMAC is applied to a surrogate model constructed from a modest number of runs of the full-fidelity simulator. To check the quality of the SMAC-optimized policies, the policies are evaluated on the full-fidelity simulator. The results confirm that the surrogate values estimates are valid. This is the first successful optimization of wildfire management policies using a full-fidelity simulation. The same methodology should be applicable to other contentious natural resource management problems where high-fidelity simulation is extremely expensive.
Blankets Joint Posterior score for learning Markov network structures
Schlüter, Federico, Strappa, Yanela, Milone, Diego H., Bromberg, Facundo
Markov networks are extensively used to model complex sequential, spatial, and relational interactions in a wide range of fields. By learning the structure of independences of a domain, more accurate joint probability distributions can be obtained for inference tasks or, more directly, for interpreting the most significant relations among the variables. Recently, several researchers have investigated techniques for automatically learning the structure from data by obtaining the probabilistic maximum-a-posteriori structure given the available data. However, all the approximations proposed decompose the posterior of the whole structure into local sub-problems, by assuming that the posteriors of the Markov blankets of all the variables are mutually independent. In this work, we propose a scoring function for relaxing such assumption. The Blankets Joint Posterior score computes the joint posterior of structures as a joint distribution of the collection of its Markov blankets. Essentially, the whole posterior is obtained by computing the posterior of the blanket of each variable as a conditional distribution that takes into account information from other blankets in the network. We show in our experimental results that the proposed approximation can improve the sample complexity of state-of-the-art scores when learning complex networks, where the independence assumption between blanket variables is clearly incorrect.
Rejection-free Ensemble MCMC with applications to Factorial Hidden Markov Models
Märtens, Kaspar, Titsias, Michalis K, Yau, Christopher
Bayesian inference for complex models is challenging due to the need to explore high-dimensional spaces and multimodality and standard Monte Carlo samplers can have difficulties effectively exploring the posterior. We introduce a general purpose rejection-free ensemble Markov Chain Monte Carlo (MCMC) technique to improve on existing poorly mixing samplers. This is achieved by combining parallel tempering and an auxiliary variable move to exchange information between the chains. We demonstrate this ensemble MCMC scheme on Bayesian inference in Factorial Hidden Markov Models. This high-dimensional inference problem is difficult due to the exponentially sized latent variable space. Existing sampling approaches mix slowly and can get trapped in local modes. We show that the performance of these samplers is improved by our rejection-free ensemble technique and that the method is attractive and "easy-to-use" since no parameter tuning is required.
Inverse Reinforcement Learning in Swarm Systems
Šošić, Adrian, KhudaBukhsh, Wasiur R., Zoubir, Abdelhak M., Koeppl, Heinz
Inverse reinforcement learning (IRL) has become a useful tool for learning behavioral models from demonstration data. However, IRL remains mostly unexplored for multi-agent systems. In this paper, we show how the principle of IRL can be extended to homogeneous large-scale problems, inspired by the collective swarming behavior of natural systems. In particular, we make the following contributions to the field: 1) We introduce the swarMDP framework, a sub-class of decentralized partially observable Markov decision processes endowed with a swarm characterization. 2) Exploiting the inherent homogeneity of this framework, we reduce the resulting multi-agent IRL problem to a single-agent one by proving that the agent-specific value functions in this model coincide. 3) To solve the corresponding control problem, we propose a novel heterogeneous learning scheme that is particularly tailored to the swarm setting. Results on two example systems demonstrate that our framework is able to produce meaningful local reward models from which we can replicate the observed global system dynamics.
A Survey of Available Corpora for Building Data-Driven Dialogue Systems
Serban, Iulian Vlad, Lowe, Ryan, Henderson, Peter, Charlin, Laurent, Pineau, Joelle
During the past decade, several areas of speech and language understanding have witnessed substantial breakthroughs from the use of data-driven models. In the area of dialogue systems, the trend is less obvious, and most practical systems are still built through significant engineering and expert knowledge. Nevertheless, several recent results suggest that data-driven approaches are feasible and quite promising. To facilitate research in this area, we have carried out a wide survey of publicly available datasets suitable for data-driven learning of dialogue systems. We discuss important characteristics of these datasets, how they can be used to learn diverse dialogue strategies, and their other potential uses. We also examine methods for transfer learning between datasets and the use of external knowledge. Finally, we discuss appropriate choice of evaluation metrics for the learning objective.
Learning to Generate Samples from Noise through Infusion Training
Bordes, Florian, Honari, Sina, Vincent, Pascal
In this work, we investigate a novel training procedure to learn a generative model as the transition operator of a Markov chain, such that, when applied repeatedly on an unstructured random noise sample, it will denoise it into a sample that matches the target distribution from the training set. The novel training procedure to learn this progressive denoising operation involves sampling from a slightly different chain than the model chain used for generation in the absence of a denoising target. In the training chain we infuse information from the training target example that we would like the chains to reach with a high probability. The thus learned transition operator is able to produce quality and varied samples in a small number of steps. Experiments show competitive results compared to the samples generated with a basic Generative Adversarial Net
An Automated Auto-encoder Correlation-based Health-Monitoring and Prognostic Method for Machine Bearings
Hasani, Ramin M., Wang, Guodong, Grosu, Radu
This paper studies an intelligent ultimate technique for health-monitoring and prognostic of common rotary machine components, particularly bearings. During a run-to-failure experiment, rich unsupervised features from vibration sensory data are extracted by a trained sparse auto-encoder. Then, the correlation of the extracted attributes of the initial samples (presumably healthy at the beginning of the test) with the succeeding samples is calculated and passed through a moving-average filter. The normalized output is named auto-encoder correlation-based (AEC) rate which stands for an informative attribute of the system depicting its health status and precisely identifying the degradation starting point. We show that AEC technique well-generalizes in several run-to-failure tests. AEC collects rich unsupervised features form the vibration data fully autonomous. We demonstrate the superiority of the AEC over many other state-of-the-art approaches for the health monitoring and prognostic of machine bearings.
Robot Knows the Right Question to Ask When It's Confused
Last week, we (and most of the rest of the internet) covered some research from MIT that uses a brain interface to help robots correct themselves when they're about to make a mistake. This is very cool, very futuristic stuff, but it only works if you wear a very, very silly hat that can classify your brain waves in 10 milliseconds flat. At Brown University, researchers in Stefanie Tellex's lab are working on a more social approach to helping robots more accurately interact with humans. By enabling a robot to model its own confusion in an interactive object-fetching task, the robot can ask relevant clarifying questions when necessary to help understand exactly what humans want. Whether you ask a human or a robot to fetch you an object, it's a simple task to perform if the object is unique in some way, and a more complicated task to perform if it involves several similar objects.
Sequential Local Learning for Latent Graphical Models
Park, Sejun, Yang, Eunho, Shin, Jinwoo
Sejun Park Eunho Y ang † Jinwoo Shin November 4, 2017 Abstract Learning parameters of latent graphical models (GM) is inherently much harder than that of no-latent ones since the latent variables make the corresponding log-likelihood non-concave. Nevertheless, expectation-maximization schemes are popularly used in practice, but they are typically stuck in local optima. In the recent years, the method of moments have provided a refreshing angle for resolving the non-convex issue, but it is applicable to a quite limited class of latent GMs. In this paper, we aim for enhancing its power via enlarging such a class of latent GMs. To this end, we introduce two novel concepts, coined marginalization and conditioning, which can reduce the problem of learning a larger GM to that of a smaller one. More importantly, they lead to a sequential learning framework that repeatedly increases the learning portion of given latent GM, and thus covers a significantly broader and more complicated class of loopy latent GMs which include convolutional and random regular models. 1 Introduction Graphical models (GM) are succinct representation of a joint distribution on a graph where each node corresponds to a random variable and each edge represents the conditional independence between random variables. GM have been successfully applied for various fields including information theory [12, 19], physics [24] and machine learning [18, 11]. Introducing latent variables to GM has been popular approaches for enhancing their representation powers in recent deep models, e.g., convolutional/restricted/deep Boltzmann machines [20, 27]. Furthermore, they are inevitable in certain scenarios when a part of samples is missing, e.g., see [10]. However, learning parameters of latent GMs is significantly harder than that of no-latent ones since the latent variables make the corresponding negative log-likelihood non-convex.