 Learning Graphical Models


Nonparametric Relational Topic Models through Dependent Gamma Processes

arXiv.org Machine Learning

Traditional relational topic models provide a way to discover the hidden topics in a document network. Many theoretical and practical tasks, such as dimensionality reduction, document clustering, and link prediction, benefit from this revealed knowledge. However, existing relational topic models assume that the number of hidden topics is known in advance, which is impractical in many real-world applications. To relax this assumption, we propose a nonparametric relational topic model. Instead of using fixed-dimensional probability distributions in its generative model, we use stochastic processes. Specifically, a gamma process is assigned to each document and represents the topic interest of that document. Although this method provides an elegant solution, it brings additional challenges when mathematically modeling the inherent network structure of a typical document network, i.e., two spatially closer documents tend to have more similar topics. Furthermore, we require that the topics be shared by all the documents. To resolve these challenges, we use a subsampling strategy to assign each document a different gamma process derived from the global gamma process, and the subsampling probabilities of the documents are given a Markov random field constraint that inherits the document network structure. Through the designed posterior inference algorithm, we can discover the hidden topics and their number simultaneously. Experimental results on both synthetic and real-world network datasets demonstrate the capability of learning the hidden topics and, more importantly, the number of topics.
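The subsampling idea in the abstract above can be illustrated with a minimal sketch. It is not the authors' model: it assumes a finite approximation of the global gamma process (a fixed number of atoms with independent gamma weights) and uses fixed keep-probabilities per document in place of the Markov random field constraint. The names `global_gamma_weights` and `thin` are hypothetical.

```python
import random

def global_gamma_weights(n_atoms, total_mass, rng):
    # Finite approximation (an assumption): n_atoms independent weights
    # Gamma(total_mass / n_atoms, 1) stand in for the global gamma process.
    return [rng.gammavariate(total_mass / n_atoms, 1.0) for _ in range(n_atoms)]

def thin(weights, keep_probs, rng):
    # Each document keeps atom k with probability keep_probs[k];
    # dropped atoms get weight 0, so topics stay shared across documents.
    return [w if rng.random() < q else 0.0 for w, q in zip(weights, keep_probs)]

rng = random.Random(0)
shared = global_gamma_weights(50, 5.0, rng)

# Two hypothetical documents with different keep-probabilities.
doc_a = thin(shared, [0.8] * 50, rng)
doc_b = thin(shared, [0.2] * 50, rng)
topics_a = sum(1 for w in doc_a if w > 0.0)
topics_b = sum(1 for w in doc_b if w > 0.0)
```

Every nonzero weight in `doc_a` or `doc_b` equals the corresponding weight in `shared`, which is how the documents end up with different topic subsets drawn from one global set of topics.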


Robust Bayesian compressive sensing with data loss recovery for structural health monitoring signals

arXiv.org Machine Learning

The application of compressive sensing (CS) to structural health monitoring is an emerging research topic. The basic idea in CS is to use a specially-designed wireless sensor to sample signals that are sparse in some basis (e.g. wavelet basis) directly in a compressed form, and then to reconstruct (decompress) these signals accurately using some inversion algorithm after transmission to a central processing unit. However, most signals in structural health monitoring are only approximately sparse, i.e. only a relatively small number of the signal coefficients in some basis are significant, but the other coefficients are usually not exactly zero. In this case, perfect reconstruction from compressed measurements is not expected. A new Bayesian CS algorithm is proposed in which robust treatment of the uncertain parameters is explored, including integration over the prediction-error precision parameter to remove it as a "nuisance" parameter. The performance of the new CS algorithm is investigated using compressed data from accelerometers installed on a space-frame structure and on a cable-stayed bridge. Compared with other state-of-the-art CS methods including our previously-published Bayesian method which uses MAP (maximum a posteriori) estimation of the prediction-error precision parameter, the new algorithm shows superior performance in reconstruction robustness and posterior uncertainty quantification. Furthermore, our method can be utilized for recovery of lost data during wireless transmission, regardless of the level of sparseness in the signal.


Inferring Team Task Plans from Human Meetings: A Generative Modeling Approach with Logic-Based Prior

Journal of Artificial Intelligence Research

We aim to reduce the burden of programming and deploying autonomous systems to work in concert with people in time-critical domains such as military field operations and disaster response. Deployment plans for these operations are frequently negotiated on-the-fly by teams of human planners. A human operator then translates the agreed-upon plan into machine instructions for the robots. We present an algorithm that reduces this translation burden by inferring the final plan from a processed form of the human team's planning conversation. Our hybrid approach combines probabilistic generative modeling with logical plan validation used to compute a highly structured prior over possible plans, enabling us to overcome the challenge of performing inference over a large solution space with only a small amount of noisy data from the team planning session. We validate the algorithm through human-subject experiments and show that it is able to infer a human team's final plan with 86% accuracy on average. We also describe a robot demonstration in which two people plan and execute a first-response collaborative task with a PR2 robot. To the best of our knowledge, this is the first work to integrate a logical planning technique within a generative model to perform plan inference.


Bayesian Cross Validation and WAIC for Predictive Prior Design in Regular Asymptotic Theory

arXiv.org Machine Learning

Prior design is one of the most important problems in both statistics and machine learning. Cross validation (CV) and the widely applicable information criterion (WAIC) are predictive measures of Bayesian estimation; however, it has been difficult to apply them to find the optimal prior because their mathematical properties in prior evaluation have been unknown and the region of the hyperparameters is too wide to be examined. In this paper, we derive a new formula by which the theoretical relation among CV, WAIC, and the generalization loss is clarified and the optimal hyperparameter can be directly found. The formula clarifies three facts about predictive prior design. First, CV and WAIC have the same second-order asymptotic expansion; hence they are asymptotically equivalent to each other as optimizers of the hyperparameter. Second, the hyperparameter that minimizes CV or WAIC asymptotically minimizes the average generalization loss but not the random generalization loss. Third, by using the mathematical relation between priors, the variances of the hyperparameters optimized by CV and WAIC are made smaller at small computational cost. We also show that the hyperparameter optimized by DIC or the marginal likelihood does not, in general, minimize the average or random generalization loss.
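For readers unfamiliar with the two criteria, the following is a minimal sketch of how WAIC and the importance-sampling form of leave-one-out CV are commonly computed from posterior draws. It does not reproduce the paper's new formula. The function name `waic_and_cv` and the input layout (`log_lik[s][i]` is the log-likelihood of data point `i` under posterior draw `s`) are assumptions made for illustration.

```python
import math

def _log_mean_exp(values):
    # Numerically stable log of the average of exp(values).
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values) / len(values))

def waic_and_cv(log_lik):
    """Return (WAIC, importance-sampling LOO-CV), both as per-sample losses.

    log_lik[s][i] = log p(x_i | w_s) for posterior draw s and data point i.
    """
    n_draws = len(log_lik)
    n_data = len(log_lik[0])
    waic = 0.0
    cv = 0.0
    for i in range(n_data):
        column = [log_lik[s][i] for s in range(n_draws)]
        mean = sum(column) / n_draws
        variance = sum((v - mean) ** 2 for v in column) / n_draws
        # WAIC: training loss plus the posterior variance of the log-likelihood.
        waic += -_log_mean_exp(column) + variance
        # LOO-CV: log of the posterior average of 1 / p(x_i | w).
        cv += _log_mean_exp([-v for v in column])
    return waic / n_data, cv / n_data

# Toy input: 3 posterior draws, 2 data points.
waic_value, cv_value = waic_and_cv([[-1.0, -2.0], [-1.1, -1.9], [-0.9, -2.1]])
```

When the log-likelihoods vary little across posterior draws, the two returned values are close, which is consistent with the asymptotic equivalence stated in the abstract.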


Variational Optimization of Annealing Schedules

arXiv.org Machine Learning

Annealed importance sampling (AIS) is a common algorithm to estimate partition functions of useful stochastic models. One important problem for obtaining accurate AIS estimates is the selection of an annealing schedule. Conventionally, an annealing schedule is often determined heuristically or is simply set as a linearly increasing sequence. In this paper, we propose an algorithm for the optimal schedule by deriving a functional that dominates the AIS estimation error and by numerically minimizing this functional. We experimentally demonstrate that the proposed algorithm mostly outperforms conventional scheduling schemes with large quantization numbers.
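To make the role of the annealing schedule concrete, here is a minimal AIS sketch on a toy problem where the answer is known. It uses the conventional linearly increasing schedule mentioned above, not the optimized schedule the paper proposes. The toy densities, the geometric path between them, the single Metropolis step per temperature, and all function names are assumptions chosen for illustration.

```python
import math
import random

SIGMA = 2.0  # target standard deviation; the true ratio Z1/Z0 equals SIGMA

def log_f0(x):
    # Unnormalized log-density of the initial distribution N(0, 1).
    return -0.5 * x * x

def log_f1(x):
    # Unnormalized log-density of the target N(0, SIGMA^2).
    return -0.5 * x * x / (SIGMA * SIGMA)

def ais_ratio(schedule, n_chains=2000, step=1.0, seed=0):
    """Estimate Z1/Z0 along the path f_beta = f0^(1 - beta) * f1^beta."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_chains):
        x = rng.gauss(0.0, 1.0)  # exact draw from the initial distribution
        log_w = 0.0
        for beta_prev, beta in zip(schedule, schedule[1:]):
            # Importance-weight update for moving from beta_prev to beta.
            log_w += (beta - beta_prev) * (log_f1(x) - log_f0(x))
            # One Metropolis step that leaves f_beta invariant.
            proposal = x + rng.gauss(0.0, step)
            log_accept = ((1.0 - beta) * (log_f0(proposal) - log_f0(x))
                          + beta * (log_f1(proposal) - log_f1(x)))
            if math.log(1.0 - rng.random()) < log_accept:
                x = proposal
        total += math.exp(log_w)
    return total / n_chains

# Conventional baseline: 200 equally spaced steps from beta = 0 to beta = 1.
linear_schedule = [k / 200 for k in range(201)]
estimate = ais_ratio(linear_schedule)
```

Because `ais_ratio` takes the schedule as an argument, any other increasing sequence from 0 to 1 can be passed in and compared against the linear baseline on the same toy problem.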


Bayesian Reconstruction of Missing Observations

arXiv.org Machine Learning

We focus on an interpolation method referred to as Bayesian reconstruction. Whereas standard interpolation methods interpolate missing data deterministically, Bayesian reconstruction interpolates missing data probabilistically using a Bayesian treatment. We address the framework of Bayesian reconstruction and its application to the traffic data reconstruction problem in the field of traffic engineering. In the latter part of the paper, we evaluate the statistical performance of our Bayesian traffic reconstruction model using a statistical mechanical approach and clarify its statistical behavior.


On Gridless Sparse Methods for Line Spectral Estimation From Complete and Incomplete Data

arXiv.org Machine Learning

Abstract--This paper is concerned with sparse, continuous frequency estimation in line spectral estimation, and focuses on developing gridless sparse methods which overcome grid mismatches and correspond to limiting scenarios of existing grid-based approaches, e.g., We generalize AST (atomic-norm soft thresholding) to the case of nonconsecutively sampled data (incomplete data) inspired by recent atomic norm based techniques. We present a gridless version of SPICE (gridless SPICE, or GLS), which is applicable to both complete and incomplete data without knowledge of the noise level. We further prove the equivalence between GLS and atomic norm-based techniques under different assumptions of noise. Moreover, we extend GLS to a systematic framework consisting of model order selection and robust frequency estimation, and present feasible algorithms for AST and GLS. Numerical simulations are provided to validate our theoretical analysis and demonstrate the performance of our methods compared to existing ones.

Spectral analysis of signals [1] is a major problem in statistical signal processing. In this paper we are concerned with the line spectral estimation problem, which has wide applications in communications, radar, sonar, seismology, astronomy and so on. C is the measurement noise. The number of sinusoids K, usually referred to as the model order, is typically unknown in practice. Following [2], the case when the signal is observed on [M] is referred to as the complete data case, while the other case, when only samples on Ω ⊂ [M] are available, is called the incomplete data case (or missing data case), in which the samples on the complementary set of Ω, namely [M] \ Ω, are called missing data.

Manuscript November 2013; accepted by IEEE Transactions on Signal Processing March 2015. The authors are with the School of Electrical and Electronic Engineering, Nanyang Technological University, 639798, Singapore (email: {yangzai, elhxie}@ntu.edu.sg).

Frequency estimation and model order selection are two important topics in line spectral estimation. 's can be obtained by a simple least-squares method according to (1). This paper is mainly focused on frequency estimation, but we also incorporate existing model order selection tools in our methods. Many methods have been proposed for frequency estimation. Common classical methods include the periodogram (or beamforming), nonlinear least squares (NLS) and MUSIC, but these often have limitations (see the review in [1]). For example, the periodogram suffers from leakage problems and has difficulty resolving closely separated frequencies [1]. It is worth noting that the recent iterative adaptive approach (IAA) [4], [5] reduces the leakage of the periodogram.
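The leakage problem of the periodogram noted above can be seen in a small numerical sketch. It assumes a single noise-free complex sinusoid and evaluates the periodogram on the M-point DFT grid; the helper names `periodogram` and `sinusoid` and the chosen frequencies are hypothetical, and nothing here implements AST, SPICE or GLS.

```python
import cmath

def sinusoid(freq, m):
    # Unit-amplitude complex sinusoid sampled at j = 0, ..., m - 1.
    return [cmath.exp(2j * cmath.pi * freq * j) for j in range(m)]

def periodogram(samples):
    # Squared DFT magnitude divided by the number of samples,
    # evaluated on the grid k / m for k = 0, ..., m - 1.
    m = len(samples)
    return [
        abs(sum(y * cmath.exp(-2j * cmath.pi * k * j / m)
                for j, y in enumerate(samples))) ** 2 / m
        for k in range(m)
    ]

M = 16
on_grid = periodogram(sinusoid(4 / M, M))   # frequency equal to a grid point
off_grid = periodogram(sinusoid(0.28, M))   # frequency between grid points

# Power that falls outside the strongest bin.
leak_on = sum(on_grid) - max(on_grid)
leak_off = sum(off_grid) - max(off_grid)
```

For the on-grid frequency all power lands in one bin, so `leak_on` is zero up to rounding error. For the off-grid frequency more than half of the total power spreads over the other bins, which is the grid mismatch that gridless methods are designed to avoid.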


Accuracy of Latent-Variable Estimation in Bayesian Semi-Supervised Learning

arXiv.org Machine Learning

Hierarchical probabilistic models, such as Gaussian mixture models, are widely used for unsupervised learning tasks. These models consist of observable and latent variables, which represent the observable data and the underlying data-generation process, respectively. Unsupervised learning tasks, such as cluster analysis, are regarded as estimations of latent variables based on the observable ones. The estimation of latent variables in semi-supervised learning, where some labels are observed, will be more precise than in unsupervised learning, and one concern is to clarify the effect of the labeled data. However, there has not been sufficient theoretical analysis of the accuracy of the estimation of latent variables. In a previous study, a distribution-based error function was formulated, and its asymptotic form was calculated for unsupervised learning with generative models. It has been shown that, for the estimation of latent variables, the Bayes method is more accurate than the maximum-likelihood method. The present paper reveals the asymptotic forms of the error function in Bayesian semi-supervised learning for both discriminative and generative models. The results show that the generative model, which uses all of the given data, performs better when the model is well specified.


Asymmetric Distributions from Constrained Mixtures

arXiv.org Machine Learning

This paper introduces constrained mixtures for continuous distributions, defined as mixtures in which each component has a shape similar to the base distribution and the components have disjoint domains. This new concept is used to create generalized asymmetric versions of the Laplace and normal distributions, which are shown to define exponential families with known conjugate priors and to have maximum likelihood estimates for the original parameters with known closed-form expressions. The asymmetric and symmetric normal distributions are compared in a linear regression example, showing that the asymmetric version performs at least as well as the symmetric one, and in a real-world time-series problem, where a hidden Markov model is used to fit a stock index, indicating that the asymmetric version provides higher likelihood and may learn distribution models over states and transition distributions with considerably less entropy.
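One simple instance of a mixture whose components share the base shape but live on disjoint domains is the two-piece normal, which joins two half-normal curves with different scales at the mode. The sketch below uses that standard construction as an assumption; the paper's own parameterization of the asymmetric normal may differ, and the function names are hypothetical.

```python
import math

def two_piece_normal_pdf(x, mode, sigma_left, sigma_right):
    # Common normalizer so the two halves meet continuously at the mode
    # and the total area is 1.
    norm = 2.0 / (math.sqrt(2.0 * math.pi) * (sigma_left + sigma_right))
    sigma = sigma_left if x < mode else sigma_right
    return norm * math.exp(-0.5 * ((x - mode) / sigma) ** 2)

def integrate(f, lo, hi, n=40000):
    # Midpoint rule, accurate enough for a sanity check.
    h = (hi - lo) / n
    return h * sum(f(lo + (i + 0.5) * h) for i in range(n))

def density(x):
    return two_piece_normal_pdf(x, 0.0, 1.0, 3.0)

total_mass = integrate(density, -40.0, 40.0)
left_mass = integrate(density, -40.0, 0.0)
```

With these scales the left half carries a quarter of the probability mass, sigma_left / (sigma_left + sigma_right), so the asymmetry is controlled directly by the ratio of the two scale parameters.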


Hierarchical sparse Bayesian learning: theory and application for inferring structural damage from incomplete modal data

arXiv.org Machine Learning

Structural damage due to excessive loading or environmental degradation typically occurs in localized areas in the absence of collapse. This prior information about the spatial sparseness of structural damage is exploited here by a hierarchical sparse Bayesian learning framework with the goal of reducing the source of ill-conditioning in the stiffness loss inversion problem for damage detection. Sparse Bayesian learning methodologies automatically prune away irrelevant or inactive features from a set of potential candidates, and so they are effective probabilistic tools for producing sparse explanatory subsets. We have previously proposed such an approach to establish the probability of localized stiffness reductions that serve as a proxy for damage by using noisy incomplete modal data from before and after possible damage. The core idea centers on a specific hierarchical Bayesian model that promotes spatial sparseness in the inferred stiffness reductions in a way that is consistent with the Bayesian Ockham's razor. In this paper, we improve the theory of our previously proposed sparse Bayesian learning approach by eliminating an approximation and, more importantly, incorporating a constraint on stiffness increases. Our approach has many appealing features that are summarized at the end of the paper. We validate the approach by applying it to the Phase II simulated and experimental benchmark studies sponsored by the IASC-ASCE Task Group on Structural Health Monitoring. The results show that it can reliably detect, locate and assess damage by inferring substructure stiffness losses from the identified modal parameters. The occurrence of missed and false damage alerts is effectively suppressed.