 Markov Models


Artificial Intelligence and the Economy: Tackling hearing loss

#artificialintelligence

These models are computer algorithms, or smart apps, that seek to give computers the ability to learn like children for a variety of tasks. Here, we highlight how an author's work may solve a particular set of real-world tasks or problems. By doing this, we aim to encourage more machine-learning work by more Jamaican people. Today, we highlight a machine-learning paper and algorithm, 'Modelling Sensorineural Hearing-impaired Listeners' Perception of Speaker Intelligibility in Noise', by UWI lecturers Dr Lindon W. Falconer and Dr André Coy and their overseas colleague, Professor Jon Barker. Jordan: How would you describe your work? Dr Coy, et al: Disabling hearing loss is a major challenge faced by many individuals in societies throughout the world. The World Health Organization (WHO) has reported that approximately 6.1 per cent of the world's population has disabling hearing loss, and about 93 per cent of these people are adults.


Learning to Speed Up Structured Output Prediction

arXiv.org Machine Learning

Predicting structured outputs can be computationally onerous due to the combinatorially large output spaces. In this paper, we focus on reducing the prediction time of a trained black-box structured classifier without losing accuracy. To do so, we train a speedup classifier that learns to mimic a black-box classifier under the learning-to-search approach. As the structured classifier predicts more examples, the speedup classifier will operate as a learned heuristic to guide search to favorable regions of the output space. We present a mistake bound for the speedup classifier and identify inference situations where it can independently make correct judgments without input features. We evaluate our method on the task of entity and relation extraction and show that the speedup classifier outperforms even greedy search in terms of speed without loss of accuracy.


An Efficient, Generalized Bellman Update For Cooperative Inverse Reinforcement Learning

arXiv.org Artificial Intelligence

Our goal is for AI systems to correctly identify and act according to their human user's objectives. Cooperative Inverse Reinforcement Learning (CIRL) formalizes this value alignment problem as a two-player game between a human and robot, in which only the human knows the parameters of the reward function: the robot needs to learn them as the interaction unfolds. Previous work showed that CIRL can be solved as a POMDP, but with an action space size exponential in the size of the reward parameter space. In this work, we exploit a specific property of CIRL (the human is a full-information agent) to derive an optimality-preserving modification to the standard Bellman update; this reduces the complexity of the problem by an exponential factor and allows us to relax CIRL's assumption of human rationality. We apply this update to a variety of POMDP solvers and find that it enables us to scale CIRL to non-trivial problems, with larger reward parameter spaces, and larger action spaces for both robot and human. In solutions to these larger problems, the human exhibits pedagogic (teaching) behavior, while the robot interprets it as such and attains higher value for the human.


Stationary Geometric Graphical Model Selection

arXiv.org Machine Learning

We consider the problem of model selection in Gaussian Markov fields in the sample-deficient scenario. In many cases, the underlying networks are embedded into Euclidean spaces, which induces significant structure on them. Using this natural spatial structure, we introduce the notion of spatially stationary distributions over geometric graphs, directly generalizing the notion of stationary time series to the multidimensional setup lacking a time axis. We show that the idea of spatial stationarity leads to a dramatic decrease in the sample complexity of the model selection compared to abstract graphs with the same level of sparsity. For geometric graphs on randomly spread vertices and edges of bounded length, we develop tight information-theoretic bounds on the sample complexity and show that a finite number of independent samples is sufficient for a consistent recovery. Finally, we develop an efficient technique capable of reliably and consistently reconstructing graphs with a bounded number of measurements. Markov random fields, or undirected probabilistic graphical models, provide a structured representation of the joint distributions of families of random variables.


Generalized Earley Parser: Bridging Symbolic Grammars and Sequence Data for Future Prediction

arXiv.org Artificial Intelligence

Future predictions on sequence data (e.g., video or audio) require the algorithms to capture non-Markovian and compositional properties of high-level semantics. Context-free grammars are natural choices to capture such properties, but traditional grammar parsers (e.g., the Earley parser) only take symbolic sentences as inputs. In this paper, we generalize the Earley parser to parse sequence data that is neither segmented nor labeled. This generalized Earley parser integrates a grammar parser with a classifier to find the optimal segmentation and labels, and makes top-down future predictions. Experiments show that our method significantly outperforms other approaches for future human activity prediction.


Learning in Integer Latent Variable Models with Nested Automatic Differentiation

arXiv.org Machine Learning

We develop nested automatic differentiation (AD) algorithms for exact inference and learning in integer latent variable models. Recently, Winner, Sujono, and Sheldon showed how to reduce marginalization in a class of integer latent variable models to evaluating a probability generating function which contains many levels of nested high-order derivatives. We contribute faster and more stable AD algorithms for this challenging problem and a novel algorithm to compute exact gradients for learning. These contributions lead to significantly faster and more accurate learning algorithms, and are the first AD algorithms whose running time is polynomial in the number of levels of nesting.


Localized Structured Prediction

arXiv.org Machine Learning

Key to structured prediction is exploiting the problem structure to simplify the learning process. A major challenge arises when data exhibit a local structure (e.g., are made of "parts") that can be leveraged to better approximate the relation between (parts of) the input and (parts of) the output. Recent literature on signal processing, and in particular computer vision, has shown that capturing these aspects is indeed essential to achieve state-of-the-art performance. While such algorithms are typically derived on a case-by-case basis, in this work we propose the first theoretical framework to deal with part-based data from a general perspective. We derive a novel approach to deal with these problems and study its generalization properties within the setting of statistical learning theory. Our analysis is novel in that it explicitly quantifies the benefits of leveraging the part-based structure of the problem with respect to the learning rates of the proposed estimator.


How would you explain Markov Chain Monte Carlo (MCMC) to a layperson?

#artificialintelligence

First, we need to understand what a Markov chain is. Consider the following weather example from Wikipedia. Suppose that the weather on any given day can be classified into only two states: sunny and rainy. Since the next day's weather is either sunny or rainy, it follows that: Q1: If the weather is sunny today, what is the weather likely to be tomorrow? A1: Since we do not know for sure what is going to happen, the best we can say is that there is a $90\%$ chance that it will be sunny and a $10\%$ chance that it will be rainy.
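
The weather example can be sketched in a few lines of Python. This is only an illustration, not code from the original answer: the sunny row (90% sunny, 10% rainy) is taken from the text, while the rainy row (50/50) is an assumed value added so that the chain is complete. The names `P`, `step`, and `simulate` are hypothetical.

```python
import random

# Transition probabilities, read as P[today][tomorrow].
# The "sunny" row (0.9 / 0.1) comes from the example in the text.
# The "rainy" row is an ASSUMPTION made only to complete the chain.
P = {
    "sunny": {"sunny": 0.9, "rainy": 0.1},
    "rainy": {"sunny": 0.5, "rainy": 0.5},  # assumed values
}


def step(dist):
    """Advance a probability distribution over the weather by one day."""
    return {
        tomorrow: sum(dist[today] * P[today][tomorrow] for today in P)
        for tomorrow in P
    }


def simulate(start, days, seed=0):
    """Sample one possible weather sequence of `days` transitions."""
    rng = random.Random(seed)
    state = start
    path = [state]
    for _ in range(days):
        states = list(P[state])
        weights = [P[state][s] for s in states]
        # The next state depends only on the current state (Markov property).
        state = rng.choices(states, weights=weights)[0]
        path.append(state)
    return path


# Start from a day that is known to be sunny.
today = {"sunny": 1.0, "rainy": 0.0}
tomorrow = step(today)
day_after = step(tomorrow)

print("tomorrow: ", tomorrow)
print("day after:", day_after)
print("one sampled week:", simulate("sunny", 7))
```

Starting from a sunny day, `step` reproduces the 90%/10% split for tomorrow. Under the assumed rainy row, the chance of sun two days out is $0.9 \times 0.9 + 0.1 \times 0.5 = 0.86$. Sampling many long sequences with `simulate` and counting how often each state appears is the basic idea that MCMC builds on.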


Semi-Supervised Learning via Compact Latent Space Clustering

arXiv.org Machine Learning

We present a novel cost function for semi-supervised learning of neural networks that encourages compact clustering of the latent space to facilitate separation. The key idea is to dynamically create a graph over embeddings of labeled and unlabeled samples of a training batch to capture underlying structure in feature space, and use label propagation to estimate its high- and low-density regions. We then devise a cost function based on Markov chains on the graph that regularizes the latent space to form a single compact cluster per class, without disturbing existing clusters during optimization. We evaluate our approach on three benchmarks and compare to the state of the art with promising results. Our approach combines the benefits of graph-based regularization with efficient, inductive inference, does not require modifications to a network architecture, and can thus be easily applied to existing networks to enable an effective use of unlabeled data.


Program Synthesis Through Reinforcement Learning Guided Tree Search

arXiv.org Artificial Intelligence

Program synthesis is the task of generating a program from a provided specification. Traditionally, this has been treated as a search problem by the programming languages (PL) community and more recently as a supervised learning problem by the machine learning community. Here, we propose a third approach, representing the task of synthesizing a given program as a Markov decision process solvable via reinforcement learning (RL). From observations about the states of partial programs, we attempt to find a program that is optimal over a provided reward metric on pairs of programs and states. We instantiate this approach on a subset of the RISC-V assembly language operating on floating-point numbers, and as an optimization inspired by search-based techniques from the PL community, we combine RL with a priority search tree. We evaluate this instantiation and demonstrate the effectiveness of our combined method compared to a variety of baselines, including a pure RL ablation and a state-of-the-art Markov chain Monte Carlo search method on this task.