Goto

Collaborating Authors

 Undirected Networks


Boltzmann Machines and Denoising Autoencoders for Image Denoising

arXiv.org Machine Learning

Image denoising based on a probabilistic model of local image patches has been employed by various researchers, and recently a deep (denoising) autoencoder has been proposed by Burger et al. [2012] and Xie et al. [2012] as a good model for this. In this paper, we propose that another popular family of models in the field of deep learning, called Boltzmann machines, can perform image denoising as well as, or in certain cases of high level of noise, better than denoising autoencoders. We empirically evaluate the two models on three different sets of images with different types and levels of noise. Throughout the experiments we also examine the effect of the depth of the models. The experiments confirmed our claim and revealed that the performance can be improved by adding more hidden layers, especially when the level of noise is high.


Continuous-time Infinite Dynamic Topic Models

arXiv.org Machine Learning

Topic models are probabilistic models for discovering topical themes in collections of documents. In real world applications, these models provide us with the means of organizing what would otherwise be unstructured collections. They can help us cluster a huge collection into different topics or find a subset of the collection that resembles the topical theme found in an article at hand. The first wave of topic models developed were able to discover the prevailing topics in a big collection of documents spanning a period of time. It was later realized that these time-invariant models were not capable of modeling 1) the time varying number of topics they discover and 2) the time changing structure of these topics. Few models were developed to address this two deficiencies. The online-hierarchical Dirichlet process models the documents with a time varying number of topics. It varies the structure of the topics over time as well. However, it relies on document order, not timestamps to evolve the model over time. The continuous-time dynamic topic model evolves topic structure in continuous-time. However, it uses a fixed number of topics over time. In this dissertation, I present a model, the continuous-time infinite dynamic topic model, that combines the advantages of these two models 1) the online-hierarchical Dirichlet process, and 2) the continuous-time dynamic topic model. More specifically, the model I present is a probabilistic topic model that does the following: 1) it changes the number of topics over continuous time, and 2) it changes the topic structure over continuous-time. I compared the model I developed with the two other models with different setting values. The results obtained were favorable to my model and showed the need for having a model that has a continuous-time varying number of topics and topic structure.


On learning parametric-output HMMs

arXiv.org Machine Learning

Hidden Markov Models (HMM) are a standard tool in the modeling and analysis of time series with a wide variety of applications. When the number of hidden states is known, the standard method for estimating the HMM parameters from given observed data is the Baum-Welch algorithm [Baum et al., 1970]. The latter is known to suffer from two serious drawbacks: it 1 tends to converge (i) very slowly and (ii) only to a local maximum. Indeed, the problem of recovering the parameters of a general HMM is provably hard, in several distinct senses [Abe and Warmuth, 1992, Lyngsรธ and Pedersen, 2001, Terwijn, 2002]. In this paper we consider learning parametric-output HMMs with a finite and known number of hidden states, where the output from each hidden state follows a parametric distribution from a given family.


Density Ratio Hidden Markov Models

arXiv.org Machine Learning

Masashi Sugiyama Department of Computer Science Tokyo Institute of Technology Tokyo 152-8552, Japan sugi@cs.titech.ac.jp Abstract Hidden Markov models and their variants are the predominant sequential classification method in such domains as speech recognition, bioinformatics and natural language processing. Being generative rather than discriminative models, however, their classification performance is a drawback. In this paper we apply ideas from the field of density ratio estimation to bypass the difficult step of learning likelihood functions in HMMs. By reformulating inference and model fitting in terms of density ratios and applying a fast kernel-based estimation method, we show that it is possible to obtain a striking increase in discriminative performance while retaining the probabilistic qualities of the HMM. We demonstrate experimentally that this formulation makes more efficient use of training data than alternative approaches. 1 Introduction Inference of a sequence of estimated classes from a sequence of noisy observations is fundamental in many applications. The hidden Markov model (HMM) and its variants are the usual methods employed to do this, and have been used with conspicuous success in such domains as speech recognition, bioinformatics and natural language processing. As well as being computationally efficient, they are a popular choice due to their intuitive probabilistic interpretation.


Toric grammars: a new statistical approach to natural language modeling

arXiv.org Machine Learning

We propose a new statistical model for computational linguistics. Rather than trying to estimate directly the probability distribution of a random sentence of the language, we define a Markov chain on finite sets of sentences with many finite recurrent communicating classes and define our language model as the invariant probability measures of the chain on each recurrent communicating class. This Markov chain, that we call a communication model, recombines at each step randomly the set of sentences forming its current state, using some grammar rules. When the grammar rules are fixed and known in advance instead of being estimated on the fly, we can prove supplementary mathematical properties. In particular, we can prove in this case that all states are recurrent states, so that the chain defines a partition of its state space into finite recurrent communicating classes. We show that our approach is a decisive departure from Markov models at the sentence level and discuss its relationships with Context Free Grammars. Although the toric grammars we use are closely related to Context Free Grammars, the way we generate the language from the grammar is qualitatively different. Our communication model has two purposes. On the one hand, it is used to define indirectly the probability distribution of a random sentence of the language. On the other hand it can serve as a (crude) model of language transmission from one speaker to another speaker through the communication of a (large) set of sentences.


Fast Value Iteration for Goal-Directed Markov Decision Processes

arXiv.org Artificial Intelligence

Planning problems where effects of actions are non-deterministic can be modeled a8 Markov decision processes. Planning problems are usually goal-directed. This paper proposes several techniques for exploiting the goal-directedness to accelerate value itera tion, a standard algorithm for solving Markov decision processes. Empirical studies have shown that the techniques can bring about significant speedups.


Region-Based Approximations for Planning in Stochastic Domains

arXiv.org Artificial Intelligence

This paper is concerned with planning in stochastic domains by means of partially observable Markov decision processes (POMDPs). POMDPs are difficult to solve. This paper identifies a subclass of POMDPs called region observable POMDPs, which are easier to solve and can be used to approximate general POMDPs to arbitrary accuracy. Keywords: planning under uncertainty, partially observable Markov decision processes, problem characteristics.


Time-Critical Reasoning: Representations and Application

arXiv.org Artificial Intelligence

We review the problem of time-critical action and discuss a reformulation that shifts knowledge acquisition from the assessment of complex temporal probabilistic dependencies to the direct assessment of time-dependent utilities over key outcomes of interest. We dwell on a class of decision problems characterized by the centrality of diagnosing and reacting in a timely manner to pathological processes. We motivate key ideas in the context of trauma-care triage and transportation decisions.


Incremental Pruning: A Simple, Fast, Exact Method for Partially Observable Markov Decision Processes

arXiv.org Artificial Intelligence

Most exact algorithms for general partially observable Markov decision processes (POMDPs) use a form of dynamic programming in which a piecewise-linear and convex representation of one value function is transformed into another. We examine variations of the "incremental pruning" method for solving this problem and compare them to earlier algorithms from theoretical and empirical perspectives. We find that incremental pruning is presently the most efficient exact method for solving POMDPs.


Hierarchical Solution of Markov Decision Processes using Macro-actions

arXiv.org Artificial Intelligence

We investigate the use of temporally abstract actions, or macro-actions, in the solution of Markov decision processes. Unlike current models that combine both primitive actions and macro-actions and leave the state space unchanged, we propose a hierarchical model (using an abstract MDP) that works with macro-actions only, and that significantly reduces the size of the state space. This is achieved by treating macroactions as local policies that act in certain regions of state space, and by restricting states in the abstract MDP to those at the boundaries of regions. The abstract MDP approximates the original and can be solved more efficiently. We discuss several ways in which macro-actions can be generated to ensure good solution quality. Finally, we consider ways in which macro-actions can be reused to solve multiple, related MDPs; and we show that this can justify the computational overhead of macro-action generation.