Gaussian-binary Restricted Boltzmann Machines on Modeling Natural Image Statistics

arXiv.org Machine Learning

We present a theoretical analysis of Gaussian-binary restricted Boltzmann machines (GRBMs) from the perspective of density models. The key aspect of this analysis is to show that GRBMs can be formulated as a constrained mixture of Gaussians, which gives a much better insight into the model's capabilities and limitations. We show that GRBMs are capable of learning meaningful features both in a two-dimensional blind source separation task and in modeling natural images. Further, we show that reported difficulties in training GRBMs are due to the failure of the training algorithm rather than the model itself. Based on our analysis we are able to propose several training recipes, which allowed successful and fast training in our experiments. Finally, we discuss the relationship of GRBMs to several modifications that have been proposed to improve the model.


A Generalization of the Chow-Liu Algorithm and its Application to Statistical Learning

arXiv.org Artificial Intelligence

We extend the Chow-Liu algorithm for general random variables while the previous versions only considered finite cases. In particular, this paper applies the generalization to Suzuki's learning algorithm that generates from data forests rather than trees based on the minimum description length by balancing the fitness of the data to the forest and the simplicity of the forest. As a result, we successfully obtain an algorithm when both of the Gaussian and finite random variables are present.



Performance Impact Caused by Hidden Bias of Training Data for Recognizing Textual Entailment

arXiv.org Artificial Intelligence

The quality of training data is one of the crucial problems when a learning-centered approach is employed. This paper proposes a new method to investigate the quality of a large corpus designed for the recognizing textual entailment (RTE) task. The proposed method, which is inspired by a statistical hypothesis test, consists of two phases: the first phase is to introduce the predictability of textual entailment labels as a null hypothesis which is extremely unacceptable if a target corpus has no hidden bias, and the second phase is to test the null hypothesis using a Naive Bayes model. The experimental result of the Stanford Natural Language Inference (SNLI) corpus does not reject the null hypothesis. Therefore, it indicates that the SNLI corpus has a hidden bias which allows prediction of textual entailment labels from hypothesis sentences even if no context information is given by a premise sentence. This paper also presents the performance impact of NN models for RTE caused by this hidden bias.


Abnormal Activity Recognition based on HDP-HMM Models

AAAI Conferences

Detecting abnormal activities from sensor readings is an important research problem in activity recognition. A number of different algorithms have been proposed in the past to tackle this problem. Many of the previous state-based approaches suffer from the problem of failing to decide the appropriate number of states, which are difficult to find through a trial and-error approach, in real-world applications. In this paper, we propose an accurate and flexible framework for abnormal activity recognition from sensor readings that involves less human tuning of model parameters. Our approach first applies a Hierarchical Dirichlet Process Hidden Markov Model (HDP-HMM), which supports an infinite number of states, to automatically find an appropriate number of states. We incorporate a Fisher Kernel into the One-Class Support Vector Machine (OCSVM) model to filter out the activities that are likely to be normal. Finally, we derive an abnormal activity model from the normal activity models to reduce false positive rate in an unsupervised manner. Our main contribution is that our proposed HDP-HMM models can decide the appropriate number of states automatically, and that by incorporating a Fisher Kernel into the OCSVM model, we can combine the advantages from generative model and discriminative model. We demonstrate the effectiveness of our approach by using several real-world datasets to test our algorithm’s performance.