Goto

Collaborating Authors

 Undirected Networks


A Survey on Recent Advances in Sequence Labeling from Deep Learning Models

arXiv.org Artificial Intelligence

Sequence labeling (SL) is a fundamental research problem encompassing a variety of tasks, e.g., part-of-speech (POS) tagging, named entity recognition (NER), text chunking, etc. Though prevalent and effective in many downstream applications (e.g., information retrieval, question answering, and knowledge graph embedding), conventional sequence labeling approaches heavily rely on hand-crafted or language-specific features. Recently, deep learning has been employed for sequence labeling tasks due to its powerful capability in automatically learning complex features of instances and effectively yielding the stat-of-the-art performances. In this paper, we aim to present a comprehensive review of existing deep learning-based sequence labeling models, which consists of three related tasks, e.g., part-of-speech tagging, named entity recognition, and text chunking. Then, we systematically present the existing approaches base on a scientific taxonomy, as well as the widely-used experimental datasets and popularly-adopted evaluation metrics in the SL domain. Furthermore, we also present an in-depth analysis of different SL models on the factors that may affect the performance and future directions in the SL domain.


Active Reinforcement Learning: Observing Rewards at a Cost

arXiv.org Artificial Intelligence

Active reinforcement learning (ARL) is a variant on reinforcement learning where the agent does not observe the reward unless it chooses to pay a query cost c > 0. The central question of ARL is how to quantify the long-term value of reward information. Even in multi-armed bandits, computing the value of this information is intractable and we have to rely on heuristics. We propose and evaluate several heuristic approaches for ARL in multi-armed bandits and (tabular) Markov decision processes, and discuss and illustrate some challenging aspects of the ARL problem.


Deep Learning: Recurrent Neural Networks in Python

#artificialintelligence

Created by Lazy Programmer Inc. English [Auto-generated], Indonesian [Auto-generated], 5 more Created by Lazy Programmer Inc. Like the course I just released on Hidden Markov Models, Recurrent Neural Networks are all about learning sequences - but whereas Markov Models are limited by the Markov assumption, Recurrent Neural Networks are not - and as a result, they are more expressive, and more powerful than anything we've seen on tasks that we haven't made progress on in decades. So what's going to be in this course and how will it build on the previous neural network courses and Hidden Markov Models? In the first section of the course we are going to add the concept of time to our neural networks. I'll introduce you to the Simple Recurrent Unit, also known as the Elman unit.


Algorithms in Reinforcement Learning

#artificialintelligence

To learn the optimal action in unknown environment, Q-learning is the simple algorithm in reinforcement learning. Without having a model of an environment, it can learn the optimal and long-term action. And there have two policies called target policy and behavior policy. Tabular methods give correct policies and functions in tables. In Q-learning, to find optimal action value function, behavior policy can be achieved using policy iteration.


Exploratory Grasping: Asymptotically Optimal Algorithms for Grasping Challenging Polyhedral Objects

arXiv.org Artificial Intelligence

There has been significant recent work on data-driven algorithms for learning general-purpose grasping policies. However, these policies can consistently fail to grasp challenging objects which are significantly out of the distribution of objects in the training data or which have very few high quality grasps. Motivated by such objects, we propose a novel problem setting, Exploratory Grasping, for efficiently discovering reliable grasps on an unknown polyhedral object via sequential grasping, releasing, and toppling. We formalize Exploratory Grasping as a Markov Decision Process, study the theoretical complexity of Exploratory Grasping in the context of reinforcement learning and present an efficient bandit-style algorithm, Bandits for Online Rapid Grasp Exploration Strategy (BORGES), which leverages the structure of the problem to efficiently discover high performing grasps for each object stable pose. BORGES can be used to complement any general-purpose grasping algorithm with any grasp modality (parallel-jaw, suction, multi-fingered, etc) to learn policies for objects in which they exhibit persistent failures. Simulation experiments suggest that BORGES can significantly outperform both general-purpose grasping pipelines and two other online learning algorithms and achieves performance within 5% of the optimal policy within 1000 and 8000 timesteps on average across 46 challenging objects from the Dex-Net adversarial and EGAD! object datasets, respectively. Initial physical experiments suggest that BORGES can improve grasp success rate by 45% over a Dex-Net baseline with just 200 grasp attempts in the real world. See https://tinyurl.com/exp-grasping for supplementary material and videos.


Using Machine Learning for Decreasing State Uncertainty in Planning

Journal of Artificial Intelligence Research

We present a novel approach for decreasing state uncertainty in planning prior to solving the planning problem. This is done by making predictions about the state based on currently known information, using machine learning techniques. For domains where uncertainty is high, we define an active learning process for identifying which information, once sensed, will best improve the accuracy of predictions. We demonstrate that an agent is able to solve problems with uncertainties in the state with less planning effort compared to standard planning techniques. Moreover, agents can solve problems for which they could not find valid plans without using predictions. Experimental results also demonstrate that using our active learning process for identifying information to be sensed leads to gathering information that improves the prediction process.


Generalized Constraints as A New Mathematical Problem in Artificial Intelligence: A Review and Perspective

arXiv.org Artificial Intelligence

In this comprehensive review, we describe a new mathematical problem in artificial intelligence (AI) from a mathematical modeling perspective, following the philosophy stated by Rudolf E. Kalman that "Once you get the physics right, the rest is mathematics". The new problem is called "Generalized Constraints (GCs)", and we adopt GCs as a general term to describe any type of prior information in modelings. To understand better about GCs to be a general problem, we compare them with the conventional constraints (CCs) and list their extra challenges over CCs. In the construction of AI machines, we basically encounter more often GCs for modeling, rather than CCs with well-defined forms. Furthermore, we discuss the ultimate goals of AI and redefine transparent, interpretable, and explainable AI in terms of comprehension levels about machines. We review the studies in relation to the GC problems although most of them do not take the notion of GCs. We demonstrate that if AI machines are simplified by a coupling with both knowledge-driven submodel and data-driven submodel, GCs will play a critical role in a knowledge-driven submodel as well as in the coupling form between the two submodels. Examples are given to show that the studies in view of a generalized constraint problem will help us perceive and explore novel subjects in AI, or even in mathematics, such as generalized constraint learning (GCL).


A Primal Approach to Constrained Policy Optimization: Global Optimality and Finite-Time Analysis

arXiv.org Machine Learning

Safe reinforcement learning (SRL) problems are typically modeled as constrained Markov Decision Process (CMDP), in which an agent explores the environment to maximize the expected total reward and meanwhile avoids violating certain constraints on a number of expected total costs. In general, such SRL problems have nonconvex objective functions subject to multiple nonconvex constraints, and hence are very challenging to solve, particularly to provide a globally optimal policy. Many popular SRL algorithms adopt a primal-dual structure which utilizes the updating of dual variables for satisfying the constraints. In contrast, we propose a primal approach, called constraint-rectified policy optimization (CRPO), which updates the policy alternatingly between objective improvement and constraint satisfaction. CRPO provides a primal-type algorithmic framework to solve SRL problems, where each policy update can take any variant of policy optimization step. To demonstrate the theoretical performance of CRPO, we adopt natural policy gradient (NPG) for each policy update step and show that CRPO achieves an $\mathcal{O}(1/\sqrt{T})$ convergence rate to the global optimal policy in the constrained policy set and an $\mathcal{O}(1/\sqrt{T})$ error bound on constraint satisfaction. This is the first finite-time analysis of SRL algorithms with global optimality guarantee. Our empirical results demonstrate that CRPO can outperform the existing primal-dual baseline algorithms significantly.


A Nonconvex Framework for Structured Dynamic Covariance Recovery

arXiv.org Machine Learning

Dynamic covariance models appear prominently in the analysis of time-series data and can provide fundamental insights into complex systems. They are important for applications ranging from computational finance and economics (Engle et al., 2019) to epidemiology (Fox and Dunson, 2015) and neuroscience (Foti and Fox, 2019). Our work is motivated by the study of dynamic functional brain network connectivity, defined as the time-indexed covariance of brain activity. Improving understanding of brain function is timely, as neurological and neuropsychiatric disorders ranging from Schizophrenia, Autism, and Alzheimer's to depression create burdens on affected individuals. To this end, understanding the variation of brain connectivity between individuals is believed to be a crucial step towards uncovering the mechanisms of neural information processing (Sakoğlu et al., 2010; Chang et al., 2016), with potentially transformative applications to understanding and treating neurological and neuropsychiatric disorders (Fox and Raichle, 2007; Calhoun et al., 2014). There are a variety of approaches for estimating dynamic covariances in the neuroscience literature. Discrete-state hidden Markov models construct simple explanations of brain connectivity in terms of recurring connectivity patterns (Vidaurre et al., 2017), and yet they fail to capture the smooth nature of brain dynamics (Shine et al., 2016a,b).


An Embedded Model Estimator for Non-Stationary Random Functions using Multiple Secondary Variables

arXiv.org Machine Learning

An algorithm for non-stationary spatial modelling using multiple secondary variables is developed. It combines Geostatistics with Quantile Random Forests to give a new interpolation and stochastic simulation algorithm. This paper introduces the method and shows that it has consistency results that are similar in nature to those applying to geostatistical modelling and to Quantile Random Forests. The method allows for embedding of simpler interpolation techniques, such as Kriging, to further condition the model. The algorithm works by estimating a conditional distribution for the target variable at each target location. The family of such distributions is called the envelope of the target variable. From this, it is possible to obtain spatial estimates, quantiles and uncertainty. An algorithm to produce conditional simulations from the envelope is also developed. As they sample from the envelope, realizations are therefore locally influenced by relative changes of importance of secondary variables, trends and variability.