Goto

Collaborating Authors

 Markov Models


Interactive algorithms: from pool to stream

arXiv.org Machine Learning

We consider interactive algorithms in the pool-based setting, and in the stream-based setting. Interactive algorithms observe suggested elements (representing actions or queries), and interactively select some of them and receive responses. Pool-based algorithms can select elements at any order, while stream-based algorithms observe elements in sequence, and can only select elements immediately after observing them. We assume that the suggested elements are generated independently from some source distribution, and ask what is the stream size required for emulating a pool algorithm with a given pool size. We provide algorithms and matching lower bounds for general pool algorithms, and for utility-based pool algorithms. We further show that a maximal gap between the two settings exists also in the special case of active learning for binary classification.


Inferring Sparsity: Compressed Sensing using Generalized Restricted Boltzmann Machines

arXiv.org Machine Learning

In this work, we consider compressed sensing reconstruction from $M$ measurements of $K$-sparse structured signals which do not possess a writable correlation model. Assuming that a generative statistical model, such as a Boltzmann machine, can be trained in an unsupervised manner on example signals, we demonstrate how this signal model can be used within a Bayesian framework of signal reconstruction. By deriving a message-passing inference for general distribution restricted Boltzmann machines, we are able to integrate these inferred signal models into approximate message passing for compressed sensing reconstruction. Finally, we show for the MNIST dataset that this approach can be very effective, even for $M < K$.


[Q] Temporal Difference Learning in POMDP's โ€ข /r/MachineLearning

@machinelearnbot

The environment is partially observable and will never be fully observable, due to a lack of information. Does anyone know of any models suitable for learning such a value function?


Scan Order in Gibbs Sampling: Models in Which it Matters and Bounds on How Much

arXiv.org Machine Learning

Gibbs sampling is a Markov Chain Monte Carlo sampling technique that iteratively samples variables from their conditional distributions. There are two common scan orders for the variables: random scan and systematic scan. Due to the benefits of locality in hardware, systematic scan is commonly used, even though most statistical guarantees are only for random scan. While it has been conjectured that the mixing times of random scan and systematic scan do not differ by more than a logarithmic factor, we show by counterexample that this is not the case, and we prove that that the mixing times do not differ by more than a polynomial factor under mild conditions. To prove these relative bounds, we introduce a method of augmenting the state space to study systematic scan using conductance.


Conditional Generation and Snapshot Learning in Neural Dialogue Systems

arXiv.org Machine Learning

Recently a variety of LSTM-based conditional language models (LM) have been applied across a range of language generation tasks. In this work we study various model architectures and different ways to represent and aggregate the source information in an end-to-end neural dialogue system framework. A method called snapshot learning is also proposed to facilitate learning from supervised sequential signals by applying a companion cross-entropy objective function to the conditioning vector. The experimental and analytical results demonstrate firstly that competition occurs between the conditioning vector and the LM, and the differing architectures provide different trade-offs between the two. Secondly, the discriminative power and transparency of the conditioning vector is key to providing both model interpretability and better performance. Thirdly, snapshot learning leads to consistent performance improvements independent of which architecture is used.


Google RankBrain Algorithm in Digital Marketing

#artificialintelligence

One is going to give a historical overview about GoogleBrain and analyse the pattern, then we will conclude our finding about the current situation and future changes in search engine algorithm. Back in 2006 there were some interests in implementing artificial intelligence in Google search engine algorithm. A few years later in 2014, GoogleBrain was established after acquisition of DeepMind, a British artificial intelligence company which was founded in 2010. They worked on how to play video games based on machine learning and artificial neural networks (ANNs). The smart artificial intelligence revolution can recognize patterns in digital representations of sounds, images and data.


Nonparametric Modeling of Dynamic Functional Connectivity in fMRI Data

arXiv.org Machine Learning

Dynamic functional connectivity (FC) has in recent years become a topic of interest in the neuroimaging community. Several models and methods exist for both functional magnetic resonance imaging (fMRI) and electroencephalography (EEG), and the results point towards the conclusion that FC exhibits dynamic changes. The existing approaches modeling dynamic connectivity have primarily been based on time-windowing the data and k-means clustering. We propose a non-parametric generative model for dynamic FC in fMRI that does not rely on specifying window lengths and number of dynamic states. Rooted in Bayesian statistical modeling we use the predictive likelihood to investigate if the model can discriminate between a motor task and rest both within and across subjects. We further investigate what drives dynamic states using the model on the entire data collated across subjects and task/rest. We find that the number of states extracted are driven by subject variability and preprocessing differences while the individual states are almost purely defined by either task or rest. This questions how we in general interpret dynamic FC and points to the need for more research on what drives dynamic FC.


Iterative Hierarchical Optimization for Misspecified Problems (IHOMP)

arXiv.org Artificial Intelligence

For complex, high-dimensional Markov Decision Processes (MDPs), it may be necessary to represent the policy with function approximation. A problem is misspecified whenever, the representation cannot express any policy with acceptable performance. We introduce IHOMP : an approach for solving misspecified problems. IHOMP iteratively learns a set of context specialized options and combines these options to solve an otherwise misspecified problem. Our main contribution is proving that IHOMP enjoys theoretical convergence guarantees. In addition, we extend IHOMP to exploit Option Interruption (OI) enabling it to decide where the learned options can be reused. Our experiments demonstrate that IHOMP can find near-optimal solutions to otherwise misspecified problems and that OI can further improve the solutions.


Structure Learning in Graphical Modeling

arXiv.org Machine Learning

A graphical model is a statistical model that is associated to a graph whose nodes correspond to variables of interest. The edges of the graph reflect allowed conditional dependencies among the variables. Graphical models admit computationally convenient factorization properties and have long been a valuable tool for tractable modeling of multivariate distributions. More recently, applications such as reconstructing gene regulatory networks from gene expression data have driven major advances in structure learning, that is, estimating the graph underlying a model. We review some of these advances and discuss methods such as the graphical lasso and neighborhood selection for undirected graphical models (or Markov random fields), and the PC algorithm and score-based search methods for directed graphical models (or Bayesian networks).


A Latent-Variable Lattice Model

arXiv.org Machine Learning

Markov random field (MRF) learning is intractable, and its approximation algorithms are computationally expensive. We target a small subset of MRF that is used frequently in computer vision. We characterize this subset with three concepts: Lattice, Homogeneity, and Inertia; and design a non-markov model as an alternative. Our goal is robust learning from small datasets. Our learning algorithm uses vector quantization and, at time complexity O(U log U) for a dataset of U pixels, is much faster than that of general-purpose MRF.