 Undirected Networks


Learning Dependencies of Discrete Speech Representations with Neural Hidden Markov Models

arXiv.org Artificial Intelligence

While discrete latent variable models have had great success in self-supervised learning, most models assume that frames are independent. Due to the segmental nature of phonemes in speech perception, modeling dependencies among latent variables at the frame level can potentially improve the learned representations on phonetic-related tasks. In this work, we assume Markovian dependencies among latent variables, and propose to learn speech representations with neural hidden Markov models. Our general framework allows us to compare to self-supervised models that assume independence, while keeping the number of parameters fixed. The added dependencies improve the accessibility of phonetic information, phonetic segmentation, and the cluster purity of phones, showcasing the benefit of the assumed dependencies.
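To make the Markovian assumption concrete, here is a minimal sketch of the standard forward algorithm for a discrete hidden Markov model, written in plain Python. It is not the authors' neural parameterization: the tables `init`, `trans`, and `emit` and the function name are hypothetical, and in a neural HMM these probabilities would be produced by networks instead of fixed tables.

```python
import math


def forward_log_likelihood(init, trans, emit, observations):
    """Log-likelihood of an observation sequence under a discrete HMM.

    init[i]     : probability of starting in latent state i
    trans[i][j] : probability of moving from latent state i to state j
    emit[i][o]  : probability that latent state i emits symbol o
    (All three tables are illustrative assumptions.)
    """
    n_states = len(init)
    # alpha[i] is proportional to P(state i at frame t, observations up to t);
    # it is rescaled at every frame for numerical stability.
    alpha = [init[i] * emit[i][observations[0]] for i in range(n_states)]
    log_lik = 0.0
    for t, obs in enumerate(observations):
        if t > 0:
            # Propagate through the transition table, then weight by emission.
            alpha = [
                sum(alpha[i] * trans[i][j] for i in range(n_states)) * emit[j][obs]
                for j in range(n_states)
            ]
        scale = sum(alpha)
        log_lik += math.log(scale)
        alpha = [a / scale for a in alpha]
    return log_lik


# Toy usage with two latent states and two observable symbols.
init = [0.5, 0.5]
trans = [[0.9, 0.1], [0.1, 0.9]]  # "sticky" states, loosely like segments
emit = [[0.9, 0.1], [0.2, 0.8]]
log_lik = forward_log_likelihood(init, trans, emit, [0, 0, 1, 1])
```

When every row of `trans` equals `init`, the frames become independent and the result reduces to a sum of per-frame log marginals, which is one way to see the independent-frame model as a special case with the same tables.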


LEADER: Learning Attention over Driving Behaviors for Planning under Uncertainty

arXiv.org Artificial Intelligence

Uncertainty about human behaviors poses a significant challenge to autonomous driving in crowded urban environments. Partially observable Markov decision processes (POMDPs) offer a principled framework for planning under uncertainty, often leveraging Monte Carlo sampling to achieve online performance for complex tasks. However, sampling also raises safety concerns by potentially missing critical events. To address this, we propose a new algorithm, LEarning Attention over Driving bEhavioRs (LEADER), that learns to attend to critical human behaviors during planning. LEADER learns a neural network generator to provide attention over human behaviors in real-time situations. It integrates the attention into a belief-space planner, using importance sampling to bias reasoning towards critical events. To train the algorithm, we let the attention generator and the planner form a min-max game. By solving the min-max game, LEADER learns to perform risk-aware planning without human labeling.
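The importance-sampling idea mentioned above can be sketched in a few lines of Python. This is only an illustration of the general technique, not LEADER itself: the behavior probabilities `p`, the hand-picked proposal `q` (standing in for a learned attention), the `cost` table, and the function name are all hypothetical.

```python
import random


def importance_sampled_cost(p, q, cost, n_samples, rng):
    """Estimate the expected cost under p using samples drawn from q.

    p[b]    : true probability of behavior b (assumed known here)
    q[b]    : proposal that over-samples critical behaviors
    cost[b] : cost incurred if behavior b occurs
    """
    behaviors = list(range(len(p)))
    draws = rng.choices(behaviors, weights=q, k=n_samples)
    # Reweighting each draw by p/q keeps the estimate unbiased.
    return sum(cost[b] * p[b] / q[b] for b in draws) / n_samples


p = [0.99, 0.01]     # behavior 1 is rare ...
cost = [0.0, 100.0]  # ... but costly, so plain sampling may miss it
q = [0.5, 0.5]       # proposal that attends to the rare behavior
estimate = importance_sampled_cost(p, q, cost, 10_000, random.Random(0))
```

The true expected cost in this toy setup is 1.0. Sampling directly from `p` would see the costly behavior in only about 1% of draws, whereas the proposal `q` visits it in about half of them and corrects for that through the weights.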


Pre-Trained Language Models for Interactive Decision-Making

arXiv.org Artificial Intelligence

Language model (LM) pre-training is useful in many language processing tasks. But can pre-trained LMs be further leveraged for more general machine learning problems? We propose an approach for using LMs to scaffold learning and generalization in general sequential decision-making problems. In this approach, goals and observations are represented as a sequence of embeddings, and a policy network initialized with a pre-trained LM predicts the next action. We demonstrate that this framework enables effective combinatorial generalization across different environments and supervisory modalities. We begin by assuming access to a set of expert demonstrations, and show that initializing policies with LMs and fine-tuning them via behavior cloning improves task completion rates by 43.6% in the VirtualHome environment. Next, we integrate an active data gathering procedure in which agents iteratively interact with the environment, relabel past "failed" experiences with new goals, and update their policies in a self-supervised loop. Active data gathering further improves combinatorial generalization, outperforming the best baseline by 25.1%. Finally, we explain these results by investigating three possible factors underlying the effectiveness of the LM-based policy. We find that sequential input representations (vs. fixed-dimensional feature vectors) and LM-based weight initialization are both important for generalization. Surprisingly, however, the format of the policy input encoding (e.g., as a natural language string vs. an arbitrary sequential encoding) has little influence. Together, these results suggest that language modeling induces representations that are useful for modeling not just language, but also goals and plans; these representations can aid learning and generalization even outside of language processing.


Minimax Estimation and Identity Testing of Markov Chains

#artificialintelligence

A fundamental problem in statistics is to estimate a probability distribution from independent samples. In order to make the question precise, we choose a notion of distance \(\rho\) between distributions, pick two small numbers \(\delta, \varepsilon > 0\), and can for instance say that an estimator is good when, with high probability \(1 - \delta\) over the random choice of the sample, the estimator is \(\varepsilon\)-close to the true distribution. The problem of determining the minimum sample size \(n_0\) required for a good estimator is then typically addressed by providing two distinct answers. On one hand, we construct an estimator that would be good for any distribution given that \(n > n_0^{UB}\). Conversely, we set up a hard problem such that no estimator can be good when \(n < n_0^{LB}\).
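As a small illustration of the setup above, the following Python sketch builds the empirical estimator from independent samples and checks whether it is \(\varepsilon\)-close to a reference distribution. Taking total variation as the distance \(\rho\) is an assumption made for this example, and the function names are hypothetical.

```python
from collections import Counter


def empirical_distribution(samples, support_size):
    """Empirical estimator: the observed frequency of each symbol."""
    counts = Counter(samples)
    n = len(samples)
    return [counts[s] / n for s in range(support_size)]


def total_variation(p, q):
    """Total variation distance, used here as the distance rho."""
    return 0.5 * sum(abs(a - b) for a, b in zip(p, q))


def is_epsilon_close(estimate, truth, epsilon):
    """True when the estimate is within epsilon of the reference distribution."""
    return total_variation(estimate, truth) <= epsilon


samples = [0, 0, 1, 2]
estimate = empirical_distribution(samples, support_size=3)
distance = total_variation(estimate, [0.25, 0.25, 0.5])
```

Repeating the experiment over many independent samples of size \(n\) and counting how often `is_epsilon_close` holds gives an empirical view of the \(1 - \delta\) requirement.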


A Multilevel Reinforcement Learning Framework for PDE-based Control

arXiv.org Artificial Intelligence

Reinforcement learning (RL) is a promising method to solve control problems. However, model-free RL algorithms are sample inefficient and require thousands if not millions of samples to learn optimal control policies. A major source of computational cost in RL corresponds to the transition function, which is dictated by the model dynamics. This is especially problematic when the model dynamics are represented with coupled PDEs. In such cases, the transition function often involves solving a large-scale discretization of the said PDEs. We propose a multilevel RL framework in order to ease this cost by exploiting sublevel models that correspond to coarser scale discretization (i.e., multilevel models). This is done by formulating an approximate multilevel Monte Carlo estimate of the objective function of the policy and/or value network instead of Monte Carlo estimates, as done in the classical framework. As a demonstration of this framework, we present a multilevel version of the proximal policy optimization (PPO) algorithm. Here, the level refers to the grid fidelity of the chosen simulation-based environment. We provide two examples of simulation-based environments that employ stochastic PDEs that are solved using finite-volume discretization. For the case studies presented, we observed substantial computational savings using multilevel PPO compared to its classical counterpart.
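The multilevel Monte Carlo idea can be sketched on a toy problem in Python. This is the textbook telescoping estimator, not the authors' approximate estimate of the PPO objective: the `simulate` function is a hypothetical stand-in for a PDE solve, with the level controlling grid fidelity, and the sample counts are arbitrary.

```python
import math
import random


def simulate(level, u):
    """Toy stand-in for a PDE solve: midpoint-rule integral of sin(u * x)
    on [0, 1]. Higher levels use finer grids (2 ** (level + 1) cells)."""
    cells = 2 ** (level + 1)
    h = 1.0 / cells
    return h * sum(math.sin(u * (k + 0.5) * h) for k in range(cells))


def mlmc_estimate(samples_per_level, sample_input):
    """Estimate E[simulate(finest, U)] as the telescoping sum
    E[P_0] + sum over l >= 1 of E[P_l - P_(l-1)]."""
    total = 0.0
    for level, n in enumerate(samples_per_level):
        acc = 0.0
        for _ in range(n):
            u = sample_input()
            fine = simulate(level, u)
            # The same random input drives both grids, so the correction
            # term has small variance and needs few samples.
            coarse = simulate(level - 1, u) if level > 0 else 0.0
            acc += fine - coarse
        total += acc / n
    return total


rng = random.Random(0)
# Many cheap coarse samples, few expensive fine ones.
estimate = mlmc_estimate([4000, 1000, 250], lambda: rng.uniform(0.0, 2.0))
```

Most samples are drawn on the coarsest grid, and the finer grids only estimate small corrections, which is where the computational savings come from.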


An Artificial Intelligence driven Learning Analytics Method to Examine the Collaborative Problem solving Process from a Complex Adaptive Systems Perspective

arXiv.org Artificial Intelligence

Collaborative problem solving (CPS) enables student groups to complete learning tasks, construct knowledge, and solve problems. Previous research has argued for the importance of examining the complexity of CPS, including its multimodality, dynamics, and synergy, from the complex adaptive systems perspective. However, there is limited empirical research examining the adaptive and temporal characteristics of CPS, which might lead to an oversimplified representation of the real complexity of the CPS process. To further understand the nature of CPS in online interaction settings, this research collected multimodal process and performance data (i.e., verbal audio, computer screen recordings, concept map data) and proposed a three-layered analytical framework that integrated AI algorithms with learning analytics to analyze the regularity of groups' collaboration patterns. The results detected three types of collaborative patterns in groups, namely the behaviour-oriented collaborative pattern (Type 1) associated with medium-level performance, the communication-behaviour-synergistic collaborative pattern (Type 2) associated with high-level performance, and the communication-oriented collaborative pattern (Type 3) associated with low-level performance. The research further highlighted the multimodal, dynamic, and synergistic characteristics of groups' collaborative patterns to explain the emergence of an adaptive, self-organizing system during the CPS process.


Structural Estimation of Markov Decision Processes in High-Dimensional State Space with Finite-Time Guarantees

arXiv.org Artificial Intelligence

We consider the task of estimating a structural model of dynamic decisions by a human agent based upon the observable history of implemented actions and visited states. This problem has an inherent nested structure: in the inner problem, an optimal policy for a given reward function is identified while in the outer problem, a measure of fit is maximized. Several approaches have been proposed to alleviate the computational burden of this nested-loop structure, but these methods still suffer from high complexity when the state space is either discrete with large cardinality or continuous in high dimensions. Other approaches in the inverse reinforcement learning (IRL) literature emphasize policy estimation at the expense of reduced reward estimation accuracy. In this paper we propose a single-loop estimation algorithm with finite time guarantees that is equipped to deal with high-dimensional state spaces without compromising reward estimation accuracy. In the proposed algorithm, each policy improvement step is followed by a stochastic gradient step for likelihood maximization. We show that the proposed algorithm converges to a stationary solution with a finite-time guarantee. Further, if the reward is parameterized linearly, we show that the algorithm approximates the maximum likelihood estimator sublinearly. Finally, by using robotics control problems in MuJoCo and their transfer settings, we show that the proposed algorithm achieves superior performance compared with other IRL and imitation learning benchmarks.


Mapping Husserlian phenomenology onto active inference

arXiv.org Artificial Intelligence

Phenomenology is the rigorous descriptive study of conscious experience. Recent attempts to formalize Husserlian phenomenology provide us with a mathematical model of perception as a function of prior knowledge and expectation. In this paper, we re-examine elements of Husserlian phenomenology through the lens of active inference. In doing so, we aim to advance the project of computational phenomenology, as recently outlined by proponents of active inference. We propose that key aspects of Husserl's descriptions of consciousness can be mapped onto aspects of the generative models associated with the active inference approach. We first briefly review active inference. We then discuss Husserl's phenomenology, with a focus on time consciousness. Finally, we present our mapping from Husserlian phenomenology to active inference.


Multi-layered Discriminative Restricted Boltzmann Machine with Untrained Probabilistic Layer

arXiv.org Artificial Intelligence

An extreme learning machine (ELM) is a three-layered feed-forward neural network having untrained parameters, which are randomly determined before training. Inspired by the idea of ELM, a probabilistic untrained layer called a probabilistic-ELM (PELM) layer is proposed, and it is combined with a discriminative restricted Boltzmann machine (DRBM), which is a probabilistic three-layered neural network for solving classification problems. The proposed model is obtained by stacking DRBM on the PELM layer. The resultant model (i.e., multi-layered DRBM (MDRBM)) forms a probabilistic four-layered neural network. In MDRBM, the parameters in the PELM layer can be determined using a Gaussian-Bernoulli restricted Boltzmann machine. Owing to the PELM layer, MDRBM obtains a strong immunity against noise in inputs, which is one of the most important advantages of MDRBM. Numerical experiments using some benchmark datasets (MNIST, Fashion-MNIST, Urban Land Cover, and CIFAR-10) demonstrate that MDRBM is superior to other existing models, particularly in terms of the noise-robustness property (or, in other words, the generalization property).


Self-Supervised Speech Representation Learning: A Review

arXiv.org Artificial Intelligence

Although supervised deep learning has revolutionized speech and audio processing, it has necessitated the building of specialist models for individual tasks and application scenarios. It is likewise difficult to apply this to dialects and languages for which only limited labeled data is available. Self-supervised representation learning methods promise a single universal model that would benefit a wide variety of tasks and domains. Such methods have shown success in natural language processing and computer vision domains, achieving new levels of performance while reducing the number of labels required for many downstream scenarios. Speech representation learning is experiencing similar progress in three main categories: generative, contrastive, and predictive methods. Other approaches rely on multi-modal data for pre-training, mixing text or visual data streams with speech. Although self-supervised speech representation is still a nascent research area, it is closely related to acoustic word embedding and learning with zero lexical resources, both of which have seen active research for many years. This review presents approaches for self-supervised speech representation learning and their connection to other research areas. Since many current methods focus solely on automatic speech recognition as a downstream task, we review recent efforts on benchmarking learned representations to extend the application beyond speech recognition.