Genre
Gibbs Max-margin Topic Models with Data Augmentation
Zhu, Jun, Chen, Ning, Perkins, Hugh, Zhang, Bo
Max-margin learning is a powerful approach to building classifiers and structured output predictors. Recent work on max-margin supervised topic models has successfully integrated it with Bayesian topic models to discover discriminative latent semantic structures and make accurate predictions for unseen testing data. However, the resulting learning problems are usually hard to solve because of the non-smoothness of the margin loss. Existing approaches to building max-margin supervised topic models rely on an iterative procedure to solve multiple latent SVM subproblems with additional mean-field assumptions on the desired posterior distributions. This paper presents an alternative approach by defining a new max-margin loss. Namely, we present Gibbs max-margin supervised topic models, a latent variable Gibbs classifier to discover hidden topic representations for various tasks, including classification, regression and multi-task learning. Gibbs max-margin supervised topic models minimize an expected margin loss, which is an upper bound of the existing margin loss derived from an expected prediction rule. By introducing augmented variables and integrating out the Dirichlet variables analytically by conjugacy, we develop simple Gibbs sampling algorithms with no restricting assumptions and no need to solve SVM subproblems. Furthermore, each step of the "augment-and-collapse" Gibbs sampling algorithms has an analytical conditional distribution, from which samples can be easily drawn. Experimental results demonstrate significant improvements on time efficiency. The classification performance is also significantly improved over competitors on binary, multi-class and multi-label classification tasks.
Kronecker Sum Decompositions of Space-Time Data
Greenewald, Kristjan, Tsiligkaridis, Theodoros, Hero, Alfred O III
Abstract--In this paper we consider the use of the space vs. time Kronecker product decomposition in the estimation of covariance matrices for spatiotemporal data. This decomposition imposes lower dimensional structure on the estimated covariance matrix, thus reducing the number of samples required for estimation. T o allow a smooth tradeoff between the reduction in the number of parameters (to reduce estimation variance) and the accuracy of the covariance approximation (affecting estimation bias), we introduce a diagonally loaded modification of the sum-of-kronecker products representation in [1]. We derive an asymptotic Cram er-Rao bound (CRB) on the minimum attainable mean squared predictor coefficient estimation error for unbiased estimators of Kronecker structured covariance matrices. We illustrate the accuracy of the diagonally loaded Kronecker sum decomposition by applying it to the prediction of human activity video. In this paper, we develop a method for estimation of spatiotemporal covariance and apply it to video modeling and prediction. The covariance for spatiotemporal processes manifests itself as multiframe covariance, i.e. the covariance not only between pixels or features in a single frame, but also between pixels or features in a set of nearby frames.
Privacy Aware Learning
Duchi, John C., Jordan, Michael I., Wainwright, Martin J.
Natural tensions between learning and privacy arise whenever a learner must aggregate data across multiple individuals. The learner wishes to make optimal use of each data point, whereas the providers of the data may wish to limit detailed exposure, either to the learner or to other individuals. A characterization of such tensions in the form of quantitative tradeoffs is of great utility: it can inform public discourse surrounding the design of systems that learn from data, and the tradeoffs can be exploited as controllable degrees of freedom whenever such a system is deployed. In this paper, we approach this problem from the point of view of statistical decision theory. The decision-theoretic perspective offers a number of advantages.
Robust Dequantized Compressive Sensing
We consider the reconstruction problem in compressed sensing in which the observations are recorded in a finite number of bits. They may thus contain quantization errors (from being rounded to the nearest representable value) and saturation errors (from being outside the range of representable values). Our formulation has an objective of weighted $\ell_2$-$\ell_1$ type, along with constraints that account explicitly for quantization and saturation errors, and is solved with an augmented Lagrangian method. We prove a consistency result for the recovered solution, stronger than those that have appeared to date in the literature, showing in particular that asymptotic consistency can be obtained without oversampling. We present extensive computational comparisons with formulations proposed previously, and variants thereof.
Spontaneous Analogy by Piggybacking on a Perceptual System
Most computational models of analogy assume they are given a delineated source domain and often a specified target domain. These systems do not address how analogs can be isolated from large domains and spontaneously retrieved from long-term memory, a process we call spontaneous analogy. We present a system that represents relational structures as feature bags. Using this representation, our system leverages perceptual algorithms to automatically create an ontology of relational structures and to efficiently retrieve analogs for new relational structures from long-term memory. We provide a demonstration of our approach that takes a set of unsegmented stories, constructs an ontology of analogical schemas (corresponding to plot devices), and uses this ontology to efficiently find analogs within new stories, yielding significant time-savings over linear analog retrieval at a small accuracy cost.
MizAR 40 for Mizar 40
Kaliszyk, Cezary, Urban, Josef
As a present to Mizar on its 40th anniversary, we develop an AI/ATP system that in 30 seconds of real time on a 14-CPU machine automatically proves 40% of the theorems in the latest official version of the Mizar Mathematical Library (MML). This is a considerable improvement over previous performance of large- theory AI/ATP methods measured on the whole MML. To achieve that, a large suite of AI/ATP methods is employed and further developed. We implement the most useful methods efficiently, to scale them to the 150000 formulas in MML. This reduces the training times over the corpus to 1-3 seconds, allowing a simple practical deployment of the methods in the online automated reasoning service for the Mizar users (MizAR).
Duality in Graphical Models
Malouche, Dhafer, Rajaratnam, Bala, Rolfs, Benjamin T.
Graphical models have proven to be powerful tools for representing high-dimensional systems of random variables. One example of such a model is the undirected graph, in which lack of an edge represents conditional independence between two random variables given the rest. Another example is the bidirected graph, in which absence of edges encodes pairwise marginal independence. Both of these classes of graphical models have been extensively studied, and while they are considered to be dual to one another, except in a few instances this duality has not been thoroughly investigated. In this paper, we demonstrate how duality between undirected and bidirected models can be used to transport results for one class of graphical models to the dual model in a transparent manner. We proceed to apply this technique to extend previously existing results as well as to prove new ones, in three important domains. First, we discuss the pairwise and global Markov properties for undirected and bidirected models, using the pseudographoid and reverse-pseudographoid rules which are weaker conditions than the typically used intersection and composition rules. Second, we investigate these pseudographoid and reverse pseudographoid rules in the context of probability distributions, using the concept of duality in the process. Duality allows us to quickly relate them to the more familiar intersection and composition properties. Third and finally, we apply the dualization method to understand the implications of faithfulness, which in turn leads to a more general form of an existing result.
Discriminative Relational Topic Models
Chen, Ning, Zhu, Jun, Xia, Fei, Zhang, Bo
Many scientific and engineering fields involve analyzing network data. For document networks, relational topic models (RTMs) provide a probabilistic generative process to describe both the link structure and document contents, and they have shown promise on predicting network structures and discovering latent topic representations. However, existing RTMs have limitations in both the restricted model expressiveness and incapability of dealing with imbalanced network data. To expand the scope and improve the inference accuracy of RTMs, this paper presents three extensions: 1) unlike the common link likelihood with a diagonal weight matrix that allows the-same-topic interactions only, we generalize it to use a full weight matrix that captures all pairwise topic interactions and is applicable to asymmetric networks; 2) instead of doing standard Bayesian inference, we perform regularized Bayesian inference (RegBayes) with a regularization parameter to deal with the imbalanced link structure issue in common real networks and improve the discriminative ability of learned latent representations; and 3) instead of doing variational approximation with strict mean-field assumptions, we present collapsed Gibbs sampling algorithms for the generalized relational topic models by exploring data augmentation without making restricting assumptions. Under the generic RegBayes framework, we carefully investigate two popular discriminative loss functions, namely, the logistic log-loss and the max-margin hinge loss. Experimental results on several real network datasets demonstrate the significance of these extensions on improving the prediction performance, and the time efficiency can be dramatically improved with a simple fast approximation method.
Improved Bayesian Logistic Supervised Topic Models with Data Augmentation
Zhu, Jun, Zheng, Xun, Zhang, Bo
Supervised topic models with a logistic likelihood have two issues that potentially limit their practical use: 1) response variables are usually over-weighted by document word counts; and 2) existing variational inference methods make strict mean-field assumptions. We address these issues by: 1) introducing a regularization constant to better balance the two parts based on an optimization formulation of Bayesian inference; and 2) developing a simple Gibbs sampling algorithm by introducing auxiliary Polya-Gamma variables and collapsing out Dirichlet variables. Our augment-and-collapse sampling algorithm has analytical forms of each conditional distribution without making any restricting assumptions and can be easily parallelized. Empirical results demonstrate significant improvements on prediction performance and time efficiency.
Confidence-constrained joint sparsity recovery under the Poisson noise model
Chunikhina, E., Raich, R., Nguyen, T.
Our work is focused on the joint sparsity recovery problem where the common sparsity pattern is corrupted by Poisson noise. We formulate the confidence-constrained optimization problem in both least squares (LS) and maximum likelihood (ML) frameworks and study the conditions for perfect reconstruction of the original row sparsity and row sparsity pattern. However, the confidence-constrained optimization problem is non-convex. Using convex relaxation, an alternative convex reformulation of the problem is proposed. We evaluate the performance of the proposed approach using simulation results on synthetic data and show the effectiveness of proposed row sparsity and row sparsity pattern recovery framework.