Undirected Networks
Convex Approaches to Model Wavelet Sparsity Patterns
Rao, Nikhil S, Nowak, Robert D., Wright, Stephen J., Kingsbury, Nick G.
Statistical dependencies among wavelet coefficients are commonly represented by graphical models such as hidden Markov trees(HMTs). However, in linear inverse problems such as deconvolution, tomography, and compressed sensing, the presence of a sensing or observation matrix produces a linear mixing of the simple Markovian dependency structure. This leads to reconstruction problems that are non-convex optimizations. Past work has dealt with this issue by resorting to greedy or suboptimal iterative reconstruction methods. In this paper, we propose new modeling approaches based on group-sparsity penalties that leads to convex optimizations that can be solved exactly and efficiently. We show that the methods we develop perform significantly better in deconvolution and compressed sensing applications, while being as computationally efficient as standard coefficient-wise approaches such as lasso.
High-Dimensional Inference with the generalized Hopfield Model: Principal Component Analysis and Corrections
Cocco, Simona, Monasson, Remi, Sessak, Vitor
Understanding the patterns of correlations between the components of complex systems is a fundamental issue in various scientific fields, ranging from neurobiology to genomic, from finance to sociology,... A recurrent problem is to distinguish between direct correlations, produced by physiological or functional interactions between the components, and network correlations, which are mediated by other, third-party components. Various approaches have been proposed to infer interactions from correlations, exploiting concepts related to statistical dimensional reduction [1], causality [2], the maximum entropy principle [3], Markov random fields [4]... A major practical and theoretical difficulty in doing so is the paucity and the quality of data: reliable analysis should be able to unveil real patterns of interactions, even if measures are affected by under-or noisy sampling. The size of the interaction network can be comparable to or larger than the number of data, a situation referred to as highdimensional inference. The purpose of the present work is to establish a quantitative correspondence between two of those approaches, namely the inference of Boltzmann Machines (also called Ising model in statistical physics and undirected graphical models for discrete variables in statistical inference [4]) and Principal Component Analysis (PCA) [1]. Inverse Boltzmann Machines (BM) are a mathematically well-founded but computationally challenging approach to infer interactions from correlations.
Deep Transfer: A Markov Logic Approach
Davis, Jesse (Katholieke Universiteit Leuven) | Domingos, Pedro (University of Washington)
This article argues that currently the largest gap between human and machine learning is learning algorithms' inability to perform deep transfer, that is, generalize from one domain to another domain containing different objects, classes, properties and relations. We argue that second-order Markov logic is ideally suited for this purpose, and propose an approach based on it. Our algorithm discovers structural regularities in the source domain in the form of Markov logic formulas with predicate variables, and instantiates these formulas with predicates from the target domain. Our approach has successfully transferred learned knowledge among molecular biology, Web and social network domains.
Unified Treatment of Hidden Markov Switching Models
Several problems encountered in application areas such as finance, biology, speech analysis, control engineering, robotics, etc. require the modeling of time-series containing switching among different dynamics regimes (see Ephraim (2002) for a review). For example, system fault diagnosis deals with detecting behavioural deviations from normality originated by failures in the system. Such a modeling is often achieved by employing probabilistic approaches in which regime switching is described by a set of discrete hidden random variables, related by a first-order Markovian dependence. All such models, that we call hidden Markov switching models (HMSMs), can be viewed as extensions of the popular hidden Markov model Rabiner (1989). The wide interdisciplinary attention to this research area has produced many different HMSMs as well as different approaches and implementations of HMSMs of fundamentally similar structure, resulting in a dense literature from which extracting differences and commonalities among models is often challenging. In this paper we provide a simple unified treatment of existing HMSMs, highlighting properties and connections that were not observed in previous review papers Ephraim (2002); Gales and Young (1993); Murphy (2002); Ostendorf et al. (1996); Rabiner (1989); Yu (2010), and introduce novel extensions. Our exposition enables a deep understanding of the fundamental structure and relations of different approaches. This is achieved by using the framework of graphical models, which allows to easily define complex models by using a graphical representation and to derive efficient inference routines by visual inspection of the graph, avoiding complex algebraic manipulations.
Classification of Sets using Restricted Boltzmann Machines
Louradour, Jérôme, Larochelle, Hugo
We consider the problem of classification when inputs correspond to sets of vectors. This setting occurs in many problems such as the classification of pieces of mail containing several pages, of web sites with several sections or of images that have been pre-segmented into smaller regions. We propose generalizations of the restricted Boltzmann machine (RBM) that are appropriate in this context and explore how to incorporate different assumptions about the relationship between the input sets and the target class within the RBM. In experiments on standard multiple-instance learning datasets, we demonstrate the competitiveness of approaches based on RBMs and apply the proposed variants to the problem of incoming mail classification.
Recognition of Physiological Data for a Motivational Agent
Atrash, Amin Hani (University of Southern California) | Mower, Emily (University of Southern California) | Shams, Khawaja ( University of Southern California ) | Mataric, Maja ( University of Southern California )
Developments in sophisticated mobile physiological sensors have presented many novel opportunities for monitoring coaching of individuals. In this work, we investigate the ability to utilize physiological data to recognize the state ofa user while exercising. We discuss recognition of user state using data suchas heart rate, respiration rate, and activity level. We also discuss the development of a motivational agent which utilizes the physiological data to help encourage a user during an exercise routine.
Efficient Planning under Uncertainty with Macro-actions
He, R., Brunskill, E., Roy, N.
Deciding how to act in partially observable environments remains an active area of research. Identifying good sequences of decisions is particularly challenging when good control performance requires planning multiple steps into the future in domains with many states. Towards addressing this challenge, we present an online, forward-search algorithm called the Posterior Belief Distribution (PBD). PBD leverages a novel method for calculating the posterior distribution over beliefs that result after a sequence of actions is taken, given the set of observation sequences that could be received during this process. This method allows us to efficiently evaluate the expected reward of a sequence of primitive actions, which we refer to as macro-actions. We present a formal analysis of our approach, and examine its performance on two very large simulation experiments: scientific exploration and a target monitoring domain. We also demonstrate our algorithm being used to control a real robotic helicopter in a target monitoring experiment, which suggests that our approach has practical potential for planning in real-world, large partially observable domains where a multi-step lookahead is required to achieve good performance.
Decision Making Agent Searching for Markov Models in Near-Deterministic World
Reinforcement learning has solid foundations, but becomes inefficient in partially observed (non-Markovian) environments. Thus, a learning agent -born with a representation and a policy- might wish to investigate to what extent the Markov property holds. We propose a learning architecture that utilizes combinatorial policy optimization to overcome non-Markovity and to develop efficient behaviors, which are easy to inherit, tests the Markov property of the behavioral states, and corrects against non-Markovity by running a deterministic factored Finite State Model, which can be learned. We illustrate the properties of architecture in the near deterministic Ms. Pac-Man game. We analyze the architecture from the point of view of evolutionary, individual, and social learning.
Online EM Algorithm for Hidden Markov Models
Online (also called "recursive" or "adaptive") estimation of fixed model parameters in hidden Markov models is a topic of much interest in times series modelling. In this work, we propose an online parameter estimation algorithm that combines two key ideas. The first one, which is deeply rooted in the Expectation-Maximization (EM) methodology consists in reparameterizing the problem using complete-data sufficient statistics. The second ingredient consists in exploiting a purely recursive form of smoothing in HMMs based on an auxiliary recursion. Although the proposed online EM algorithm resembles a classical stochastic approximation (or Robbins-Monro) algorithm, it is sufficiently different to resist conventional analysis of convergence. We thus provide limited results which identify the potential limiting points of the recursion as well as the large-sample behavior of the quantities involved in the algorithm. The performance of the proposed algorithm is numerically evaluated through simulations in the case of a noisily observed Markov chain. In this case, the algorithm reaches estimation results that are comparable to that of the maximum likelihood estimator for large sample sizes.
Evidence Feed Forward Hidden Markov Model: A New Type of Hidden Markov Model
DelRose, Michael, Wagner, Christian, Frederick, Philip
The ability to predict the intentions of people based solely on their visual actions is a skill only performed by humans and animals. The intelligence of current computer algorithms has not reached this level of complexity, but there are several research efforts that are working towards it. With the number of classification algorithms available, it is hard to determine which algorithm works best for a particular situation. In classification of visual human intent data, Hidden Markov Models (HMM), and their variants, are leading candidates. The inability of HMMs to provide a probability in the observation to observation linkages is a big downfall in this classification technique. If a person is visually identifying an action of another person, they monitor patterns in the observations. By estimating the next observation, people have the ability to summarize the actions, and thus determine, with pretty good accuracy, the intention of the person performing the action. These visual cues and linkages are important in creating intelligent algorithms for determining human actions based on visual observations. The Evidence Feed Forward Hidden Markov Model is a newly developed algorithm which provides observation to observation linkages. The following research addresses the theory behind Evidence Feed Forward HMMs, provides mathematical proofs of their learning of these parameters to optimize the likelihood of observations with a Evidence Feed Forwards HMM, which is important in all computational intelligence algorithm, and gives comparative examples with standard HMMs in classification of both visual action data and measurement data; thus providing a strong base for Evidence Feed Forward HMMs in classification of many types of problems.