Markov Models
Learning Restricted Boltzmann Machines with Arbitrary External Fields
We study the problem of learning graphical models with latent variables. We give the first algorithm for learning locally consistent (ferromagnetic or antiferromagnetic) Restricted Boltzmann Machines (or RBMs) with {\em arbitrary} external fields. Our algorithm has optimal dependence on dimension in the sample complexity and run time however it suffers from a sub-optimal dependency on the underlying parameters of the RBM. Prior results have been established only for {\em ferromagnetic} RBMs with {\em consistent} external fields (signs must be same)\cite{bresler2018learning}. The proposed algorithm strongly relies on the concavity of magnetization which does not hold in our setting. We show the following key structural property: even in the presence of arbitrary external field, for any two observed nodes that share a common latent neighbor, the covariance is high. This enables us to design a simple greedy algorithm that maximizes covariance to iteratively build the neighborhood of each vertex.
Amortized Bethe Free Energy Minimization for Learning MRFs
We propose to learn deep undirected graphical models (i.e., MRFs), with a non-ELBO objective for which we can calculate exact gradients. In particular, we optimize a saddle-point objective deriving from the Bethe free energy approximation to the partition function. Unlike much recent work in approximate inference, the derived objective requires no sampling, and can be efficiently computed even for very expressive MRFs. We furthermore amortize this optimization with trained inference networks. Experimentally, we find that the proposed approach compares favorably with loopy belief propagation, but is faster, and it allows for attaining better held out log likelihood than other recent approximate inference schemes.
Enhanced Input Modeling for Construction Simulation using Bayesian Deep Neural Networks
ABSTRACT This paper aims to propose a novel deep learning-integrated framework for deriving reliable simulation input models through incorporating multi-source information. The framework sources and extracts multisource data generated from construction operations, which provides rich information for input modeling. The framework implements Bayesian deep neural networks to facilitate the purpose of incorporating richer information in input modeling. A case study on road paving operation is performed to test the feasibility and applicability of the proposed framework. Overall, this research enhances input modeling by deriving detailed input models, thereby, augmenting the decision-making processes in construction operations.
Improving Prediction Accuracy in Building Performance Models Using Generative Adversarial Networks (GANs)
Chokwitthaya, Chanachok, Collier, Edward, Zhu, Yimin, Mukhopadhyay, Supratik
Building performance discrepancies between building design and operation are one of the causes that lead many new designs fail to achieve their goals and objectives. One of main factors contributing to the discrepancy is occupant behaviors. Occupants responding to a new design are influenced by several factors. Existing building performance models (BPMs) ignore or partially address those factors (called contextual factors) while developing BPMs. To potentially reduce the discrepancies and improve the prediction accuracy of BPMs, this paper proposes a computational framework for learning mixture models by using Generative Adversarial Networks (GANs) that appropriately combining existing BPMs with knowledge on occupant behaviors to contextual factors in new designs. Immersive virtual environments (IVEs) experiments are used to acquire data on such behaviors. Performance targets are used to guide appropriate combination of existing BPMs with knowledge on occupant behaviors. The resulting model obtained is called an augmented BPM. Two different experiments related to occupant lighting behaviors are shown as case study. The results reveal that augmented BPMs significantly outperformed existing BPMs with respect to achieving specified performance targets. The case study confirms the potential of the computational framework for improving prediction accuracy of BPMs during design.
Deep Recurrent Adversarial Learning for Privacy-Preserving Smart Meter Data Release
Shateri, Mohammadhadi, Messina, Francisco, Piantanida, Pablo, Labeau, Fabrice
Smart Meters (SMs) are an important component of smart electrical grids, but they have also generated serious concerns about privacy data of consumers. In this paper, we present a general formulation of the privacy-preserving problem in SMs from an information-theoretic perspective. In order to capture the casual time series structure of the power measurements, we employ Directed Information (DI) as an adequate measure of privacy. On the other hand, to cope with a variety of potential applications of SMs data, we study different distortion measures along with the standard squared-error distortion. This formulation leads to a quite general training objective (or loss) which is optimized under a deep learning adversarial framework where two Recurrent Neural Networks (RNNs), referred to as the releaser and the attacker, are trained with opposite goals. An exhaustive empirical study is then performed to validate the proposed approach for different privacy problems in three actual data sets. Finally, we study the impact of the data mismatch problem, which occurs when the releaser and the attacker have different training data sets and show that privacy may not require a large level of distortion in real-world scenarios.
Augmenting Neural Networks with First-order Logic
Today, the dominant paradigm for training neural networks involves minimizing task loss on a large dataset. Using world knowledge to inform a model, and yet retain the ability to perform end-to-end training remains an open question. In this paper, we present a novel framework for introducing declarative knowledge to neural network architectures in order to guide training and prediction. Our framework systematically compiles logical statements into computation graphs that augment a neural network without extra learnable parameters or manual redesign. We evaluate our modeling strategy on three tasks: machine comprehension, natural language inference, and text chunking. Our experiments show that knowledge-augmented networks can strongly improve over baselines, especially in low-data regimes.
Anomaly Detection with HMM Gauge Likelihood Analysis
Lorbeer, Boris, Deutsch, Tanja, Ruppel, Peter, Kรผpper, Axel
This paper describes a new method, HMM gauge likelihood analysis, or GLA, of detecting anomalies in discrete time series using Hidden Markov Models and clustering. At the center of the method lies the comparison of subsequences. To achieve this, they first get assigned to their Hidden Markov Models using the Baum-Welch algorithm. Next, those models are described by an approximating representation of the probability distributions they define. Finally, this representation is then analyzed with the help of some clustering technique or other outlier detection tool and anomalies are detected. Clearly, HMMs could be substituted by some other appropriate model, e.g. some other dynamic Bayesian network. Our learning algorithm is unsupervised, so it does not require the labeling of large amounts of data. The usability of this method is demonstrated by applying it to synthetic and real-world syslog data.
Regret Minimization for Reinforcement Learning by Evaluating the Optimal Bias Function
We present an algorithm based on the Optimism in the Face of Uncertainty (OFU) principle which is able to learn Reinforcement Learning (RL) modeled by Markov decision process (MDP) with finite state-action space efficiently. By evaluating the state-pair difference of the optimal bias function $h^{*}$, the proposed algorithm achieves a regret bound of $\tilde{O}(\sqrt{SAHT})$for MDP with $S$ states and $A$ actions, in the case that an upper bound $H$ on the span of $h^{*}$, i.e., $sp(h^{*})$ is known. This result outperforms the best previous regret bounds $\tilde{O}(HS\sqrt{AT})$ [Bartlett and Tewari, 2009] by a factor of $\sqrt{SH}$. Furthermore, this regret bound matches the lower bound of $\Omega(\sqrt{SAHT})$ [Jaksch et al., 2010] up to a logarithmic factor. As a consequence, we show that there is a near optimal regret bound of $\tilde{O}(\sqrt{SADT})$ for MDPs with finite diameter $D$ compared to the lower bound of $\Omega(\sqrt{SADT})$ [Jaksch et al., 2010].
Reinforcement Learning in Feature Space: Matrix Bandit, Kernels, and Regret Bound
Exploration in reinforcement learning (RL) suffers from the curse of dimensionality when the state-action space is large. A common practice is to parameterize the high-dimensional value and policy functions using given features. However existing methods either have no theoretical guarantee or suffer a regret that is exponential in the planning horizon $H$. In this paper, we propose an online RL algorithm, namely the MatrixRL, that leverages ideas from linear bandit to learn a low-dimensional representation of the probability transition model while carefully balancing the exploitation-exploration tradeoff. We show that MatrixRL achieves a regret bound ${O}\big(H^2d\log T\sqrt{T}\big)$ where $d$ is the number of features. MatrixRL has an equivalent kernelized version, which is able to work with an arbitrary kernel Hilbert space without using explicit features. In this case, the kernelized MatrixRL satisfies a regret bound ${O}\big(H^2\widetilde{d}\log T\sqrt{T}\big)$, where $\widetilde{d}$ is the effective dimension of the kernel space. To our best knowledge, for RL using features or kernels, our results are the first regret bounds that are near-optimal in time $T$ and dimension $d$ (or $\widetilde{d}$) and polynomial in the planning horizon $H$.
Empirical Bayes Method for Boltzmann Machines
Yasuda, Muneki, Obuchi, Tomoyuki
In this study, we consider an empirical Bayes method for Boltzmann machines and propose an algorithm for it. The empirical Bayes method allows estimation of the values of the hyperparameters of the Boltzmann machine by maximizing a specific likelihood function referred to as the empirical Bayes likelihood function in this study. However, the maximization is computationally hard because the empirical Bayes likelihood function involves intractable integrations of the partition function. The proposed algorithm avoids this computational problem by using the replica method and the Plefka expansion. Our method does not require any iterative procedures and is quite simple and fast, though it introduces a bias to the estimate, which exhibits an unnatural behavior with respect to the size of the dataset. This peculiar behavior is supposed to be due to the approximate treatment by the Plefka expansion. A possible extension to overcome this behavior is also discussed.