Bayesian Inference
Approximate Bayesian Inference for Reconstructing Velocities of Migrating Birds from Weather Radar
Sheldon, Daniel (University of Massachusetts Amherst) | Farnsworth, Andrew (Cornell Lab of Ornithology) | Irvine, Jed W. (Oregon State University) | Doren, Benjamin Van (Cornell University) | Webb, Kevin F. (Cornell Lab of Ornithology) | Dietterich, Thomas G. (Oregon State University) | Kelling, Steve (Cornell Lab of Ornithology)
Archived data from the WSR-88D network of weather radars in the US hold detailed information about the continent-scale migratory movements of birds over the last 20 years. However, significant technical challenges must be overcome to understand this information and harness its potential for science and conservation. We present an approximate Bayesian inference algorithm to reconstruct the velocity fields of birds migrating in the vicinity of a radar station. This is part of a larger project to quantify bird migration at large scales using weather radar data.
GiSS: Combining Gibbs Sampling and SampleSearch for Inference in Mixed Probabilistic and Deterministic Graphical Models
Venugopal, Deepak (The University of Texas at Dallas) | Gogate, Vibhav (The University of Texas at Dallas)
Mixed probabilistic and deterministic graphical models are ubiquitous in real-world applications. Unfortunately, Gibbs sampling, a popular MCMC technique, does not converge to the correct answers in presence of determinism and therefore cannot be used for inference in such models. In this paper, we propose to remedy this problem by combining Gibbs sampling with SampleSearch, an advanced importance sampling technique which leverages complete SAT/CSP solvers to generate high quality samples from hard deterministic spaces. We call the resulting algorithm, GiSS. Unlike Gibbs sampling which yields unweighted samples, GiSS yields weighted samples. Computing these weights exactly can be computationally expensive and therefore we propose several approximations. We show that our approximate weighting schemes yield consistent estimates and demonstrate experimentally that GiSS is competitive in terms of accuracy with state-of-the-art algorithms such as SampleSearch, MC-SAT and Belief propagation.
A Hierarchical Aspect-Sentiment Model for Online Reviews
Kim, Suin (KAIST) | Zhang, Jianwen (Microsoft Research Asia) | Chen, Zheng (Microsoft Research Asia) | Oh, Alice (KAIST) | Liu, Shixia (Microsoft Research Asia)
To help users quickly understand the major opinions from massive online reviews, it is important to automatically reveal the latent structure of the aspects, sentiment polarities, and the association between them. However, there is little work available to do this effectively. In this paper, we propose a hierarchical aspect sentiment model (HASM) to discover a hierarchical structure of aspect-based sentiments from unlabeled online reviews. In HASM, the whole structure is a tree. Each node itself is a two-level tree, whose root represents an aspect and the children represent the sentiment polarities associated with it. Each aspect or sentiment polarity is modeled as a distribution of words. To automatically extract both the structure and parameters of the tree, we use a Bayesian nonparametric model, recursive Chinese Restaurant Process (rCRP), as the prior and jointly infer the aspect-sentiment tree from the review texts. Experiments on two real datasets show that our model is comparable to two other hierarchical topic models in terms of quantitative measures of topic trees. It is also shown that our model achieves better sentence-level classification accuracy than previously proposed aspect-sentiment joint models.
Complexity of Inferences in Polytree-shaped Semi-Qualitative Probabilistic Networks
Campos, Cassio Polpo de (Dalle Molle Institute for Artificial Intelligence) | Cozman, Fabio Gagliardi (University of Sao Paulo)
Semi-qualitative probabilistic networks (SQPNs) merge two important graphical model formalisms: Bayesian networks and qualitative probabilistic networks. They provide a very general modeling framework by allowing the combination of numeric and qualitative assessments over a discrete domain, and can be compactly encoded by exploiting the same factorization of joint probability distributions that are behind the Bayesian networks.ย This paper explores the computational complexity of semi-qualitative probabilistic networks, and takes the polytree-shaped networks as its main target. We show that the inference problem is coNP-Complete for binary polytrees with multiple observed nodes. We also show that inferences can be performed in time linear in the number of nodes if there is a single observed node. Because our proof is constructive, we obtain an efficient linear time algorithm for SQPNs under such assumptions. To the best of our knowledge, this is the first exact polynomial-time algorithm for SQPNs. Together these results provide a clear picture of the inferential complexity in polytree-shaped SQPNs.
From Interest to Function: Location Estimation in Social Media
Chen, Yan (Beihang University) | Zhao, Jichang (Beihang University) | Hu, Xia (Arizona State University) | Zhang, Xiaoming (Beihang University) | Li, Zhoujun (Beihang University) | Chua, Tat-Seng (National University of Singapore)
Recent years have witnessed the tremendous development of social media, which attracts a vast number of Internet users. The high-dimension content generated by these users provides an unique opportunity to understand their behavior deeply. As one of the most fundamental topics, location estimation attracts more and more research efforts. Different from the previous literature, we find that user's location is strongly related to user interest. Based on this, we first build a detection model to mine user interest from short text. We then establish the mapping between location function and user interest before presenting an efficient framework to predict the user's location with convincing fidelity. Thorough evaluations and comparisons on an authentic data set show that our proposed model significantly outperforms the state-of-the-arts approaches. Moreover, the high efficiency of our model also guarantees its applicability in real-world scenarios.
A Kernel Density Estimate-Based Approach to Component Goodness Modeling
Cardoso, Nuno (University of Porto /ย HASLab - INESC Tec) | Abreu, Rui (University of Portoย /ย HASLab - INESC Tec)
Intermittent fault localization approaches account for the fact that faulty components may fail intermittently by considering a parameter (known as goodness) that quantifies the probability that faulty components may still exhibit correct behavior. Current, state-of-the-art approaches (1) assume that this goodness probability is context independent and (2) do not provide means for integrating past diagnosis experience in the diagnostic mechanism. In this paper, we present a novel approach, coined Non-linear Feedback-based Goodness Estimate (NFGE), that uses kernel density estimations (KDE) to address such limitations. We evaluated the approach with both synthetic and real data, yielding lower estimation errors, thus increasing the diagnosis performance.
Teamwork with Limited Knowledge of Teammates
Barrett, Samuel (The University of Texas at Austin) | Stone, Peter (The University of Texas at Austin) | Kraus, Sarit (Bar-Ilan University and The University of Maryland) | Rosenfeld, Avi (Jerusalem College of Technology)
While great strides have been made in multiagent teamwork, existing approaches typically assume extensive information exists about teammates and how to coordinate actions. This paper addresses how robust teamwork can still be created even if limited or no information exists about a specific group of teammates, as in the ad hoc teamwork scenario. The main contribution of this paper is the first empirical evaluation of an agent cooperating with teammates not created by the authors, where the agent is not provided expert knowledge of its teammates. For this purpose, we develop a general-purpose teammate modeling method and test the resulting ad hoc team agent's ability to collaborate with more than 40 unknown teams of agents to accomplish a benchmark task. These agents were designed by people other than the authors without these designers planning for the ad hoc teamwork setting. A secondary contribution of the paper is a new transfer learning algorithm, TwoStageTransfer, that can improve results when the ad hoc team agent does have some limited observations of its current teammates.
Bayesian Discovery of Multiple Bayesian Networks via Transfer Learning
Bayesian network structure learning algorithms with limited data are being used in domains such as systems biology and neuroscience to gain insight into the underlying processes that produce observed data. Learning reliable networks from limited data is difficult, therefore transfer learning can improve the robustness of learned networks by leveraging data from related tasks. Existing transfer learning algorithms for Bayesian network structure learning give a single maximum a posteriori estimate of network models. Yet, many other models may be equally likely, and so a more informative result is provided by Bayesian structure discovery. Bayesian structure discovery algorithms estimate posterior probabilities of structural features, such as edges. We present transfer learning for Bayesian structure discovery which allows us to explore the shared and unique structural features among related tasks. Efficient computation requires that our transfer learning objective factors into local calculations, which we prove is given by a broad class of transfer biases. Theoretically, we show the efficiency of our approach. Empirically, we show that compared to single task learning, transfer learning is better able to positively identify true edges. We apply the method to whole-brain neuroimaging data.
Bridging Information Criteria and Parameter Shrinkage for Model Selection
Zhang, Kun, Peng, Heng, Chan, Laiwan, Hyvarinen, Aapo
Model selection based on classical information criteria, such as BIC, is generally computationally demanding, but its properties are well studied. On the other hand, model selection based on parameter shrinkage by $\ell_1$-type penalties is computationally efficient. In this paper we make an attempt to combine their strengths, and propose a simple approach that penalizes the likelihood with data-dependent $\ell_1$ penalties as in adaptive Lasso and exploits a fixed penalization parameter. Even for finite samples, its model selection results approximately coincide with those based on information criteria; in particular, we show that in some special cases, this approach and the corresponding information criterion produce exactly the same model. One can also consider this approach as a way to directly determine the penalization parameter in adaptive Lasso to achieve information criteria-like model selection. As extensions, we apply this idea to complex models including Gaussian mixture model and mixture of factor analyzers, whose model selection is traditionally difficult to do; by adopting suitable penalties, we provide continuous approximators to the corresponding information criteria, which are easy to optimize and enable efficient model selection.
An Efficient Model Selection for Gaussian Mixture Model in a Bayesian Framework
In order to cluster or partition data, we often use Expectation-and-Maximization (EM) or Variational approximation with a Gaussian Mixture Model (GMM), which is a parametric probability density function represented as a weighted sum of $\hat{K}$ Gaussian component densities. However, model selection to find underlying $\hat{K}$ is one of the key concerns in GMM clustering, since we can obtain the desired clusters only when $\hat{K}$ is known. In this paper, we propose a new model selection algorithm to explore $\hat{K}$ in a Bayesian framework. The proposed algorithm builds the density of the model order which any information criterions such as AIC and BIC basically fail to reconstruct. In addition, this algorithm reconstructs the density quickly as compared to the time-consuming Monte Carlo simulation.