Goto

Collaborating Authors

 Undirected Networks


Global SNR Estimation of Speech Signals using Entropy and Uncertainty Estimates from Dropout Networks

arXiv.org Artificial Intelligence

This paper demonstrates two novel methods to estimate the global SNR of speech signals. In both methods, Deep Neural Network-Hidden Markov Model (DNN-HMM) acoustic model used in speech recognition systems is leveraged for the additional task of SNR estimation. In the first method, the entropy of the DNN-HMM output is computed. Recent work on bayesian deep learning has shown that a DNN-HMM trained with dropout can be used to estimate model uncertainty by approximating it as a deep Gaussian process. In the second method, this approximation is used to obtain model uncertainty estimates. Noise specific regressors are used to predict the SNR from the entropy and model uncertainty. The DNN-HMM is trained on GRID corpus and tested on different noise profiles from the DEMAND noise database at SNR levels ranging from -10 dB to 30 dB.


[D] Simple intro to Markov Decision Process via Game of Thorns โ€ข r/MachineLearning

@machinelearnbot

Ok, I just skimmed through the video but why on earth did you get the names of key places and cities right, but messed up the title? Like those fancy chairs kings sit on. Not the spiky things that grow in the wilderness. You don't want to sit on those.


Symbol Grounding Association in Multimodal Sequences with Missing Elements

Journal of Artificial Intelligence Research

In this paper, we extend a symbolic association framework for being able to handle missing elements in multimodal sequences. The general scope of the work is the symbolic associations of object-word mappings as it happens in language development in infants. In other words, two different representations of the same abstract concepts can associate in both directions. This scenario has been long interested in Artificial Intelligence, Psychology, and Neuroscience. In this work, we extend a recent approach for multimodal sequences (visual and audio) to also cope with missing elements in one or both modalities. Our method uses two parallel Long Short-Term Memories (LSTMs) with a learning rule based on EM-algorithm. It aligns both LSTM outputs via Dynamic Time Warping (DTW). We propose to include an extra step for the combination with the max operation for exploiting the common elements between both sequences. The motivation behind is that the combination acts as a condition selector for choosing the best representation from both LSTMs. We evaluated the proposed extension in the following scenarios: missing elements in one modality (visual or audio) and missing elements in both modalities (visual and sound). The performance of our extension reaches better results than the original model and similar results to individual LSTM trained in each modality.


Clustering Based Unsupervised Learning โ€“ Towards Data Science

#artificialintelligence

Unsupervised machine learning is the machine learning task of inferring a function to describe hidden structure from "unlabeled" data (a classification or categorization is not included in the observations). While there is an exhaustive list of clustering algorithms available (whether you use R or Python's Scikit-Learn), I will attempt to cover the basic concepts. The most common and simplest clustering algorithm out there is the K-Means clustering. This algorithms involve you telling the algorithms how many possible cluster (or K) there are in the dataset. The algorithm then iteratively moves the k-centers and selects the datapoints that are closest to that centroid in the cluster.


Path to Stochastic Stability: Comparative Analysis of Stochastic Learning Dynamics in Games

arXiv.org Machine Learning

Stochastic stability is a popular solution concept for stochastic learning dynamics in games. However, a critical limitation of this solution concept is its inability to distinguish between different learning rules that lead to the same steady-state behavior. We address this limitation for the first time and develop a framework for the comparative analysis of stochastic learning dynamics with different update rules but same steady-state behavior. We present the framework in the context of two learning dynamics: Log-Linear Learning (LLL) and Metropolis Learning (ML). Although both of these dynamics have the same stochastically stable states, LLL and ML correspond to different behavioral models for decision making. Moreover, we demonstrate through an example setup of sensor coverage game that for each of these dynamics, the paths to stochastically stable states exhibit distinctive behaviors. Therefore, we propose multiple criteria to analyze and quantify the differences in the short and medium run behavior of stochastic learning dynamics. We derive and compare upper bounds on the expected hitting time to the set of Nash equilibria for both LLL and ML. For the medium to long-run behavior, we identify a set of tools from the theory of perturbed Markov chains that result in a hierarchical decomposition of the state space into collections of states called cycles. We compare LLL and ML based on the proposed criteria and develop invaluable insights into the comparative behavior of the two dynamics.


Discontinuous Hamiltonian Monte Carlo for Probabilistic Programs

arXiv.org Machine Learning

Hamiltonian Monte Carlo (HMC) is the dominant statistical inference algorithm used in most popular "first-order differentiable" probabilisticprogramming languages. HMC requires that the joint density be differentiable with respect to all latent variables. This complicates expressing some models in such languages and prohibits others. A recently proposed new integrator for HMC yielded a new Discontinuous HMC (DHMC) algorithm that can be used for inference in models with joint densities that have discontinuities. In this paper we show how to use DHMC for inference in probabilistic programs. To do this we introduce a sufficient set of language restrictions, a corresponding mathematical formalism that ensures that any joint density denoted in such a language has a suitably low measure of discontinuous points, and a recipe for how to apply DHMC in the more general probabilistic-programming context. Our experimental findings demonstrate the correctness of this approach.


Generalized Dual Decomposition for Bounding Maximum Expected Utility of Influence Diagrams with Perfect Recall

AAAI Conferences

We introduce a generalized dual decomposition bound for computing the maximum expected utility of influence diagrams based on the dual decomposition method generalized to $L^p$ space.ย  The main goal is to devise an approximation scheme free from translations required by existing variational approaches while exploiting the local structure of sum of utility functions as well as the conditional independence of probability functions.ย  In this work, the generalized dual decomposition method is applied to the algebraic framework called valuation algebra for influence diagrams which handles probability and expected utility as a pair. The proposed approach allows a sequential decision problem to be decomposed as a collection of sub-decision problems of bounded complexity and the upper bound of maximum expected utility to be computed by combining the local expected utilities. Thus, it has a flexible control of space and time complexity for computing the bound.ย  In addition, the upper bounds can be further minimized by reparameterizing the utility functions. Since the global objective function for the minimization is nonconvex, we present a gradient-based local search algorithm in which the outer loop controls the randomization of the initial configurations and the inner loop tightens the upper-bound based on block coordinate descent with gradients perturbed by a random noise. The experimental evaluation demonstrates highlights of the proposed approach on finite horizon MDP/POMDP instances.


Learning Others' Intentional Models in Multi-Agent Settings Using Interactive POMDPs

AAAI Conferences

Interactive partially observable Markov decision processes (I-POMDPs) provide a principled framework for planning and acting in a partially observable, stochastic and multi-agent environment, extending POMDPs to multi-agent settings by including models of other agents in the state space and forming a hierarchical belief structure. In order to predict other agents' actions using I-POMDP, we propose an approach that effectively uses Bayesian inference and sequential Monte Carlo (SMC) sampling to learn others' intentional models which ascribe to them beliefs, preferences and rationality in action selection. Empirical results show that our algorithm accurately learns models of other agents and has superior performance when compared to other methods. Our approach serves as a generalized reinforcement learning algorithm that learns other agents' beliefs, and transition, observation and reward functions. It also effectively mitigates the belief space complexity due to the nested belief hierarchy.


Cross-type Biomedical Named Entity Recognition with Deep Multi-Task Learning

arXiv.org Machine Learning

Motivation: Biomedical named entity recognition (BioNER) is the most fundamental task in biomedical text mining. State-of-the-art BioNER systems often require handcrafted features specifically designed for each type of biomedical entities. This feature generation process requires intensive labors from biomedical and linguistic experts, and makes it difficult to adapt these systems to new biomedical entity types. Although recent studies explored using neural network models for BioNER to free experts from manual feature generation, these models still require substantial human efforts to annotate massive training data. Results: We propose a multi-task learning framework for BioNER that is based on neural network models to save human efforts. We build a global model by collectively training multiple models that share parameters, each model capturing the characteristics of a different biomedical entity type. In experiments on five BioNER benchmark datasets covering four major biomedical entity types, our model outperforms state-of-the-art systems and other neural network models by a large margin, even when only limited training data are available. Further analysis shows that the large performance gains come from sharing character- and word-level information between different biomedical entities. The approach creates new opportunities for text-mining approaches to help biomedical scientists better exploit knowledge in biomedical literature.


An Imprecise Probabilistic Estimator for the Transition Rate Matrix of a Continuous-Time Markov Chain

arXiv.org Machine Learning

We consider the problem of estimating the transition rate matrix of a continuous-time Markov chain from a finite-duration realisation of this process. We approach this problem in an imprecise probabilistic framework, using a set of prior distributions on the unknown transition rate matrix. The resulting estimator is a set of transition rate matrices that, for reasons of conjugacy, is easy to find. To determine the hyperparameters for our set of priors, we reconsider the problem in discrete time, where we can use the well-known Imprecise Dirichlet Model. In particular, we show how the limit of the resulting discrete-time estimators is a continuous-time estimator. It corresponds to a specific choice of hyperparameters and has an exceptionally simple closed-form expression.