 Undirected Networks


Exploring Predictive States via Cantor Embeddings and Wasserstein Distance

arXiv.org Artificial Intelligence

Predictive states for stochastic processes are a nonparametric and interpretable construct with relevance across a multitude of modeling paradigms. Recent progress on the self-supervised reconstruction of predictive states from time-series data has focused on the use of reproducing kernel Hilbert spaces. Here, we examine how Wasserstein distances may be used to detect predictive equivalences in symbolic data. We compute Wasserstein distances between distributions over sequences ("predictions"), using a finite-dimensional embedding of sequences based on the Cantor set for the underlying geometry. We show that exploratory data analysis using the resulting geometry via hierarchical clustering and dimension reduction provides insight into the temporal structure of processes ranging from the relatively simple (e.g., finite-state hidden Markov models) to the very complex (e.g., infinite-state indexed grammars).
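
The idea of comparing "predictions" through a Cantor-style embedding can be illustrated with a minimal sketch. It assumes a binary alphabet and a one-dimensional embedding into the middle-thirds Cantor set, which is a simplification and not necessarily the finite-dimensional embedding used in the paper. The function names `cantor_embed`, `prediction`, and `wasserstein_1d` are hypothetical.

```python
from collections import Counter


def cantor_embed(seq):
    """Map a binary sequence to a point of the middle-thirds Cantor set.

    Assumption: symbols are 0 or 1; symbol i contributes 2 * s_i / 3**(i + 1),
    so sequences that agree on a long prefix land close together.
    """
    return sum(2 * s / 3 ** (i + 1) for i, s in enumerate(seq))


def prediction(futures):
    """Empirical distribution over embedded future sequences.

    Returns a dict mapping embedded point -> probability mass.
    """
    counts = Counter(cantor_embed(f) for f in futures)
    total = sum(counts.values())
    return {x: c / total for x, c in counts.items()}


def wasserstein_1d(p, q):
    """1-Wasserstein distance between two discrete distributions on the line.

    Uses the identity W1 = integral of |CDF_p - CDF_q| over the real line.
    """
    points = sorted(set(p) | set(q))
    cdf_p = cdf_q = 0.0
    distance = 0.0
    for left, right in zip(points, points[1:]):
        cdf_p += p.get(left, 0.0)
        cdf_q += q.get(left, 0.0)
        distance += abs(cdf_p - cdf_q) * (right - left)
    return distance


# Futures observed after two different pasts (made-up illustrative data).
after_past_a = prediction([(0, 1), (0, 1), (1, 0)])
after_past_b = prediction([(0, 1), (1, 0), (1, 0)])
print(wasserstein_1d(after_past_a, after_past_b))
```

A small distance between two such distributions suggests that the corresponding pasts are close to predictively equivalent; a pairwise distance matrix built this way could then be passed to hierarchical clustering.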


Learning in Observable POMDPs, without Computationally Intractable Oracles

arXiv.org Machine Learning

Much of reinforcement learning theory is built on top of oracles that are computationally hard to implement. Specifically for learning near-optimal policies in Partially Observable Markov Decision Processes (POMDPs), existing algorithms either need to make strong assumptions about the model dynamics (e.g. deterministic transitions) or assume access to an oracle for solving a hard optimistic planning or estimation problem as a subroutine. In this work we develop the first oracle-free learning algorithm for POMDPs under reasonable assumptions. Specifically, we give a quasipolynomial-time end-to-end algorithm for learning in "observable" POMDPs, where observability is the assumption that well-separated distributions over states induce well-separated distributions over observations. Our techniques circumvent the more traditional approach of using the principle of optimism under uncertainty to promote exploration, and instead give a novel application of barycentric spanners to constructing policy covers.


Research Papers based on using Machine Learning in DNA Research Domain

#artificialintelligence

Abstract: DNA encoded libraries (DELs) are used for rapid large-scale screening of small molecules against a protein target. These combinatorial libraries are built through several cycles of chemistry and DNA ligation, producing large sets of DNA-tagged molecules. Training machine learning models on DEL data has been shown to be effective at predicting molecules of interest dissimilar from those in the original DEL. Machine learning chemical property prediction approaches rely on the assumption that the property of interest is linked to a single chemical structure. In the context of DNA-encoded libraries, this is equivalent to assuming that every chemical reaction fully yields the desired product.


11 Best Natural Language Processing Online Courses

#artificialintelligence

In this course, you will learn NLP (natural language processing) with deep learning. This course will teach you word2vec and how to implement word2vec. You will also learn how to implement GloVe using gradient descent and alternating least squares. This course uses recurrent neural networks for named entity recognition. Along with that, you will learn how to implement recursive neural tensor networks for sentiment analysis. Let's see the topics covered in this course-


Policy Diagnosis via Measuring Role Diversity in Cooperative Multi-agent RL

arXiv.org Artificial Intelligence

Cooperative multi-agent reinforcement learning (MARL) is making rapid progress in solving tasks in grid worlds and real-world scenarios, in which agents are given different attributes and goals, resulting in different behavior throughout the multi-agent task. In this study, we quantify the agents' behavioral differences and relate them to policy performance via Role Diversity, a metric that measures the characteristics of MARL tasks. We define role diversity from three perspectives (action-based, trajectory-based, and contribution-based) to fully measure a multi-agent task. Through theoretical analysis, we find that the error bound in MARL can be decomposed into three parts that have a strong relation to role diversity. The decomposed factors can significantly impact policy optimization in three popular directions: parameter sharing, communication mechanisms, and credit assignment. The main experimental platforms are based on the Multiagent Particle Environment (MPE) and the StarCraft Multi-Agent Challenge (SMAC). Extensive experiments clearly show that role diversity can serve as a robust measurement of the characteristics of a multi-agent cooperation task and can help diagnose whether the policy fits the current multi-agent system for better policy performance.


Embed to Control Partially Observed Systems: Representation Learning with Provable Sample Efficiency

arXiv.org Machine Learning

Reinforcement learning in partially observed Markov decision processes (POMDPs) faces two challenges. (i) It often takes the full history to predict the future, which induces a sample complexity that scales exponentially with the horizon. (ii) The observation and state spaces are often continuous, which induces a sample complexity that scales exponentially with the extrinsic dimension. Addressing such challenges requires learning a minimal but sufficient representation of the observation and state histories by exploiting the structure of the POMDP. To this end, we propose a reinforcement learning algorithm named Embed to Control (ETC), which learns the representation at two levels while optimizing the policy. (i) For each step, ETC learns to represent the state with a low-dimensional feature, which factorizes the transition kernel. (ii) Across multiple steps, ETC learns to represent the full history with a low-dimensional embedding, which assembles the per-step features. We integrate (i) and (ii) in a unified framework that allows a variety of estimators (including maximum likelihood estimators and generative adversarial networks). For a class of POMDPs with a low-rank structure in the transition kernel, ETC attains an $O(1/\epsilon^2)$ sample complexity that scales polynomially with the horizon and the intrinsic dimension (that is, the rank). Here $\epsilon$ is the optimality gap. To the best of our knowledge, ETC is the first sample-efficient algorithm that bridges representation learning and policy optimization in POMDPs with infinite observation and state spaces.


Topological Hidden Markov Models

arXiv.org Machine Learning

The hidden Markov model (HMM) is a classic modeling tool with a wide swath of applications. Its inception considered observations restricted to a finite alphabet, but it was quickly extended to multivariate continuous distributions. In this article, we further extend the HMM from mixtures of normal distributions in $d$-dimensional Euclidean space to general Gaussian measure mixtures in locally convex topological spaces. The main innovation is the use of the Onsager-Machlup functional as a proxy for the probability density function in infinite dimensional spaces. This allows for choice of a Cameron-Martin space suitable for a given application. We demonstrate the versatility of this methodology by applying it to simulated diffusion processes such as Brownian and fractional Brownian sample paths as well as the Ornstein-Uhlenbeck process. Our methodology is applied to the identification of sleep states from overnight polysomnography time series data with the aim of diagnosing Obstructive Sleep Apnea in pediatric patients. It is also applied to a series of annual cumulative snowfall curves from 1940 to 1990 in the city of Edmonton, Alberta.


When Is Partially Observable Reinforcement Learning Not Scary?

arXiv.org Machine Learning

A wide range of modern artificial intelligence challenges can be cast as Reinforcement Learning (RL) problems under partial observability, in which agents learn to make a sequence of decisions despite lacking complete information about the underlying state of the system. For example, in robotics the agent has to cope with noisy sensors, occlusions, and unknown dynamics (Akkaya et al., 2019), while in imperfect information games the player makes only local observations (Vinyals et al., 2019; Brown and Sandholm, 2019). Further applications of partially observable RL include autonomous driving (Levinson et al., 2011), resource allocation (Bower and Gilbert, 2005), medical diagnostic systems (Hauskrecht and Fraser, 2000), recommendation (Li et al., 2010), business management (De Brito and Van Der Laan, 2009), etc. As such, learning and acting under partial observability has been an important topic in operations research, control, and machine learning. Because of the non-Markovian nature of the observations, learning and planning in partially observable environments requires an agent to maintain memory and possibly reason about beliefs over the states, all while exploring to collect information about the environment. As such, partial observability can significantly complicate learning and planning under uncertainty. While practical RL systems have succeeded in a set of partially observable problems including Poker (Brown and Sandholm, 2019), Starcraft (Vinyals et al., 2019) and certain robotic tasks (Cassandra et al., ...


Stereographic Markov Chain Monte Carlo

arXiv.org Machine Learning

High-dimensional distributions, especially those with heavy tails, are notoriously difficult for off-the-shelf MCMC samplers: the combination of unbounded state spaces, diminishing gradient information, and local moves results in empirically observed "stickiness" and poor theoretical mixing properties, namely a lack of geometric ergodicity. In this paper, we introduce a new class of MCMC samplers that map the original high-dimensional problem in Euclidean space onto a sphere and remedy these notorious mixing problems. In particular, we develop random-walk Metropolis type algorithms as well as versions of the Bouncy Particle Sampler that are uniformly ergodic for a large class of light and heavy-tailed distributions and also empirically exhibit rapid convergence in high dimensions. In the best scenario, the proposed samplers can enjoy the "blessings of dimensionality": the mixing time decreases with dimension.
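
The map between Euclidean space and a sphere mentioned above can be sketched with the textbook stereographic projection. This is an illustration under assumptions: it uses the unit sphere with the projection point at the north pole, whereas the paper's exact parametrization (for example a radius parameter) is not stated in the abstract. The names `to_sphere` and `from_sphere` are hypothetical.

```python
import math


def to_sphere(x):
    """Inverse stereographic projection: point in R^d -> unit sphere in R^(d+1).

    Points far from the origin are sent close to the north pole (0, ..., 0, 1),
    which is how an unbounded space becomes a bounded one.
    """
    s = sum(v * v for v in x)
    return [2 * v / (s + 1) for v in x] + [(s - 1) / (s + 1)]


def from_sphere(z):
    """Stereographic projection: unit sphere minus the north pole -> R^d."""
    last = z[-1]
    if last >= 1.0:
        raise ValueError("the north pole has no image in R^d")
    return [v / (1 - last) for v in z[:-1]]


x = [3.0, 4.0]
z = to_sphere(x)
print(z, math.sqrt(sum(v * v for v in z)))  # z lies on the unit sphere
print(from_sphere(z))                        # recovers x up to rounding
```

A sampler built on this idea would propose moves on the sphere and evaluate the target density after mapping back, with a Jacobian correction for the change of variables; that correction is omitted from this sketch.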


Flow-based Recurrent Belief State Learning for POMDPs

arXiv.org Machine Learning

The Partially Observable Markov Decision Process (POMDP) provides a principled and generic framework for modeling real-world sequential decision-making processes, yet it remains unsolved, especially for high-dimensional continuous spaces and unknown models. The main challenge lies in how to accurately obtain the belief state, which is the probability distribution over the unobservable environment states given historical information. Accurately calculating this belief state is a precondition for obtaining an optimal policy for POMDPs. Recent advances in deep learning techniques show great potential for learning good belief states. However, existing methods can only learn approximate distributions with limited flexibility. In this paper, we introduce the FlOw-based Recurrent BElief State model (FORBES), which incorporates normalizing flows into variational inference to learn general continuous belief states for POMDPs. Furthermore, we show that the learned belief states can be plugged into downstream RL algorithms to improve performance. In experiments, we show that our method successfully captures complex belief states that enable multi-modal predictions as well as high-quality reconstructions, and results on challenging visual-motor control tasks show that our method achieves superior performance and sample efficiency.
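
For intuition about what a belief state is, the following minimal sketch shows the exact Bayes-filter update in a small discrete POMDP with a known model. It is not the FORBES method, which targets continuous states and unknown models via normalizing flows. The function name `belief_update` and the toy matrices `T` and `O` are hypothetical.

```python
def belief_update(belief, T, O, action, obs):
    """One exact belief update for a discrete POMDP.

    belief: list of probabilities over states
    T[a][s][s2]: probability of moving from s to s2 under action a
    O[s2][o]: probability of observing o in state s2
    """
    n = len(belief)
    # Prediction step: push the belief through the transition model.
    predicted = [
        sum(belief[s] * T[action][s][s2] for s in range(n)) for s2 in range(n)
    ]
    # Correction step: reweight by the likelihood of the observation.
    unnormalized = [predicted[s2] * O[s2][obs] for s2 in range(n)]
    z = sum(unnormalized)
    if z == 0:
        raise ValueError("observation has zero probability under the model")
    return [u / z for u in unnormalized]


# Toy two-state model: action 0 leaves the state unchanged.
T = [[[1.0, 0.0], [0.0, 1.0]]]
O = [[0.9, 0.1], [0.2, 0.8]]
print(belief_update([0.5, 0.5], T, O, action=0, obs=0))
```

In this discrete case the update is a finite sum, which is what becomes intractable for high-dimensional continuous states and motivates learned approximations.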