 Undirected Networks


Stochastic Learning for Sparse Discrete Markov Random Fields with Controlled Gradient Approximation Error

arXiv.org Machine Learning

We study the $L_1$-regularized maximum likelihood estimation (MLE) problem for discrete Markov random fields (MRFs), where efficient and scalable learning requires both sparse regularization and approximate inference. To address these challenges, we consider a stochastic learning framework called stochastic proximal gradient (SPG; Honorio 2012a, Atchade et al. 2014, Miasojedow and Rejchel 2016). SPG is an inexact proximal gradient algorithm [Schmidt et al., 2011] whose inexactness stems from the stochastic oracle (Gibbs sampling) used for gradient approximation; exact gradient evaluation is infeasible in general because inference for discrete MRFs is NP-hard [Koller and Friedman, 2009]. Theoretically, we provide novel verifiable bounds to inspect and control the quality of gradient approximation. Empirically, we propose the tighten asymptotically (TAY) learning strategy, based on the verifiable bounds, to boost the performance of SPG.
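
As a rough illustration of the proximal gradient idea mentioned in this abstract, the following minimal sketch shows a single update for an $L_1$-regularized objective. It is not the authors' implementation: the names `soft_threshold` and `proximal_gradient_step` are hypothetical, and the gradient estimate is simply passed in as a list, standing in for whatever a sampler such as Gibbs sampling would produce.

```python
def soft_threshold(x, t):
    """Proximal operator of t * |x|: shrink x toward zero by t."""
    if x > t:
        return x - t
    if x < -t:
        return x + t
    return 0.0


def proximal_gradient_step(theta, grad_estimate, step_size, l1_weight):
    """One inexact proximal gradient update on a parameter vector.

    theta and grad_estimate are lists of floats. grad_estimate is assumed
    to be an approximation of the smooth part's gradient (in SPG it would
    come from sampling); here it is just an input, not computed.
    """
    return [
        soft_threshold(p - step_size * g, step_size * l1_weight)
        for p, g in zip(theta, grad_estimate)
    ]


if __name__ == "__main__":
    theta = [1.0, -0.05, 0.3]
    noisy_grad = [0.5, 0.0, -1.0]  # placeholder values, not real MRF gradients
    print(proximal_gradient_step(theta, noisy_grad, step_size=0.1, l1_weight=1.0))
```

The soft-thresholding step is what sets small parameters exactly to zero, which is how the $L_1$ penalty yields a sparse model even when the gradient is only approximate.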


System-Level Predictive Maintenance: Review of Research Literature and Gap Analysis

arXiv.org Artificial Intelligence

This paper reviews current literature in the field of predictive maintenance from the system point of view. We differentiate the existing capabilities of condition estimation and failure risk forecasting, as currently applied to simple components, from the capabilities needed to solve the same tasks for complex assets. System-level analysis faces more complex latent degradation states. It has to comprehensively account for active maintenance programs at each component level and consider coupling between different maintenance actions, while reflecting increased monetary and safety costs for system failures. As a result, methods that are effective for forecasting risk and informing maintenance decisions regarding individual components do not readily scale to provide reliable sub-system or system-level insights. A novel holistic modeling approach is needed to incorporate available structural and physical knowledge and naturally handle the complexities of actively fielded and maintained assets.


Prior choice affects ability of Bayesian neural networks to identify unknowns

arXiv.org Artificial Intelligence

Deep Bayesian neural networks (BNNs) are a powerful, though computationally demanding, tool for performing parameter estimation while jointly estimating uncertainty around predictions. BNNs are typically implemented using arbitrary normal-distributed prior distributions on the model parameters. Here, we explore the effects of different prior distributions on classification tasks in BNNs and evaluate the evidence supporting the predictions based on posterior probabilities approximated by Markov Chain Monte Carlo sampling and by computing Bayes factors. We show that the choice of priors has a substantial impact on the ability of the model to confidently assign data to the correct class (true positive rates). Prior choice also significantly affects the ability of a BNN to identify out-of-distribution instances as unknown (false positive rates). When comparing our results against neural networks (NNs) with Monte Carlo dropout, we found that BNNs generally outperform NNs. Finally, in our tests we did not find a single best choice of prior distribution. Instead, each dataset yielded the best results under a different prior, indicating that testing alternative options can improve the performance of BNNs.


Delay-Aware Model-Based Reinforcement Learning for Continuous Control

arXiv.org Artificial Intelligence

Action delays degrade the performance of reinforcement learning in many real-world systems. This paper proposes a formal definition of a delay-aware Markov Decision Process (MDP) and proves that it can be transformed into a standard MDP with augmented states using the Markov reward process. We develop a delay-aware model-based reinforcement learning framework that can incorporate the multi-step delay into the learned system models without learning effort. Experiments with the Gym and MuJoCo platforms show that the proposed delay-aware model-based algorithm is more efficient in training and more transferable between systems with various durations of delay than off-policy model-free reinforcement learning methods. Code is available at: https://github.com/baimingc/dambrl.
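
To make the notion of an augmented state concrete, here is a minimal sketch of one common construction: the environment state is paired with the queue of actions that have been issued but not yet executed. This is an illustrative assumption, not code from the linked repository; `DelayedActionWrapper`, `step_fn`, and `default_action` are hypothetical names, and the toy dynamics in the usage lines are invented.

```python
from collections import deque


class DelayedActionWrapper:
    """Wraps a transition function so each action takes effect `delay` steps later."""

    def __init__(self, step_fn, initial_state, delay, default_action):
        self.step_fn = step_fn          # hypothetical: maps (state, action) -> next state
        self.state = initial_state
        # Actions already issued but not yet executed, oldest first.
        self.pending = deque([default_action] * delay)

    def augmented_state(self):
        """Underlying state plus pending actions; this pair is Markov again."""
        return (self.state, tuple(self.pending))

    def step(self, action):
        """Queue the new action, execute the oldest pending one."""
        self.pending.append(action)
        executed = self.pending.popleft()
        self.state = self.step_fn(self.state, executed)
        return self.augmented_state()


if __name__ == "__main__":
    # Toy dynamics (an assumption for illustration): the state accumulates actions.
    env = DelayedActionWrapper(lambda s, a: s + a, initial_state=0, delay=2, default_action=0)
    for a in (5, 7, 1):
        print(env.step(a))
```

Because the pending-action queue is part of the returned state, an agent that observes it has everything needed to predict the next transition, which is the intuition behind reducing the delayed problem to a standard MDP.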


Maximal Algorithmic Caliber and Algorithmic Causal Network Inference: General Principles of Real-World General Intelligence?

arXiv.org Artificial Intelligence

Ideas and formalisms from far-from-equilibrium thermodynamics are ported to the context of stochastic computational processes by following and extending Tadaki's algorithmic thermodynamics. A Principle of Maximum Algorithmic Caliber is proposed, providing guidance as to what computational processes one should hypothesize if one is provided constraints to work within. It is conjectured that, under suitable assumptions, computational processes obeying algorithmic Markov conditions will maximize algorithmic caliber. It is proposed that, in accordance with this, real-world cognitive systems may operate in substantial part by modeling their environments and choosing their actions to be (approximate and compactly represented) algorithmic Markov networks. These ideas are suggested as potential early steps toward a general theory of the operation of pragmatic generally intelligent systems.


Posterior Control of Blackbox Generation

arXiv.org Artificial Intelligence

Text generation often requires high-precision output that obeys task-specific rules. This fine-grained control is difficult to enforce with off-the-shelf deep learning models. In this work, we consider augmenting neural generation models with discrete control states learned through a structured latent-variable approach. Under this formulation, task-specific knowledge can be encoded through a range of rich, posterior constraints that are effectively trained into the model. This approach allows users to ground internal model decisions based on prior knowledge, without sacrificing the representational power of neural generative models. Experiments consider applications of this approach for text generation. We find that this method improves over standard benchmarks, while also providing fine-grained control.


Dual-track Music Generation using Deep Learning

arXiv.org Machine Learning

Music generation is always interesting in the sense that there is no formalized recipe. In this work, we propose a novel dual-track architecture for generating classical piano music, which is able to model the inter-dependency of left-hand and right-hand piano music. In particular, we experimented with many different neural network models as well as different representations of music, and the results show that our proposed model outperforms all other tested methods. In addition, we deployed some special policies for model training and generation, which contributed remarkably to model performance. Finally, under two evaluation methods, we compared our models with the MuseGAN project and true music.


Training and Classification using a Restricted Boltzmann Machine on the D-Wave 2000Q

arXiv.org Machine Learning

The Restricted Boltzmann Machine (RBM) is an energy-based, undirected graphical model. It is commonly used for unsupervised and supervised machine learning. Typically, an RBM is trained using contrastive divergence (CD). However, training with CD is slow and does not estimate the exact gradient of the log-likelihood cost function. In this work, the model expectation of gradient learning for the RBM has been calculated using a quantum annealer (D-Wave 2000Q), which is much faster than the Markov chain Monte Carlo (MCMC) used in CD. Training and classification results are compared with CD. The classification accuracy results indicate similar performance of both methods. Image reconstruction as well as log-likelihood calculations are used to compare the performance of quantum and classical algorithms for RBM training. It is shown that the samples obtained from the quantum annealer can be used to train an RBM on a 64-bit 'bars and stripes' data set with classification performance similar to an RBM trained with CD. Though training based on CD showed improved learning performance, training using a quantum annealer eliminates the computationally expensive MCMC steps of CD.
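
For readers unfamiliar with the "model expectation" this abstract refers to: in a binary RBM with energy E(v, h) = -a·v - b·h - v^T W h, the log-likelihood gradient with respect to a weight W_ij is the data expectation of v_i h_j minus its expectation under the model distribution. The sketch below computes that model expectation exactly by brute-force enumeration, which is only feasible for tiny models; samplers such as MCMC or an annealer are used to approximate it at realistic sizes. The function names are hypothetical and this is not the paper's code.

```python
import itertools
import math


def rbm_energy(v, h, W, a, b):
    """E(v, h) = -a.v - b.h - v^T W h for binary tuples v and h."""
    visible_term = sum(ai * vi for ai, vi in zip(a, v))
    hidden_term = sum(bj * hj for bj, hj in zip(b, h))
    coupling = sum(
        v[i] * W[i][j] * h[j] for i in range(len(v)) for j in range(len(h))
    )
    return -(visible_term + hidden_term + coupling)


def exact_model_expectation(W, a, b):
    """Return the matrix of E_model[v_i h_j] by enumerating every (v, h) state.

    Cost grows as 2 ** (len(a) + len(b)), so this is for illustration only.
    """
    n_v, n_h = len(a), len(b)
    partition = 0.0
    expect = [[0.0] * n_h for _ in range(n_v)]
    for v in itertools.product((0, 1), repeat=n_v):
        for h in itertools.product((0, 1), repeat=n_h):
            weight = math.exp(-rbm_energy(v, h, W, a, b))  # unnormalized probability
            partition += weight
            for i in range(n_v):
                for j in range(n_h):
                    expect[i][j] += weight * v[i] * h[j]
    return [[e / partition for e in row] for row in expect]


if __name__ == "__main__":
    # One visible and one hidden unit, all parameters zero (toy values).
    print(exact_model_expectation([[0.0]], [0.0], [0.0]))
```

The exponential cost of the enumeration is the reason approximate samplers are needed in the first place.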


Efficient Reconstruction of Stochastic Pedigrees

arXiv.org Machine Learning

We introduce a new algorithm called Rec-Gen for reconstructing the genealogy or pedigree of an extant population purely from its genetic data. We justify our approach by giving a mathematical proof of the effectiveness of Rec-Gen when applied to pedigrees from an idealized generative model that replicates some of the features of real-world pedigrees. Our algorithm is iterative and provides an accurate reconstruction of a large fraction of the pedigree while having relatively low sample complexity, measured in terms of the length of the genetic sequences of the population. We propose our approach as a prototype for further investigation of the pedigree reconstruction problem toward the goal of applications to real-world examples. As such, our results have some conceptual bearing on the increasingly important issue of genomic privacy.


Inference, Prediction, and Entropy-Rate Estimation of Continuous-time, Discrete-event Processes

arXiv.org Machine Learning

Inferring models, predicting the future, and estimating the entropy rate of discrete-time, discrete-event processes are well-worn ground. However, a much broader class of discrete-event processes operates in continuous time. Here, we provide new methods for inferring, predicting, and estimating them. The methods rely on an extension of Bayesian structural inference that takes advantage of neural networks' universal approximation power. Based on experiments with complex synthetic data, the methods are competitive with the state of the art for prediction and entropy-rate estimation.