AITopics

1811.12929

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.83)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.49)

arXiv.org Artificial Intelligence, Nov-30-2018

Flexible and Scalable State Tracking Framework for Goal-Oriented Dialogue Systems

Goel, Rahul, Paul, Shachi, Chung, Tagyoung, Lecomte, Jeremie, Mandal, Arindam, Hakkani-Tur, Dilek

Goal-oriented dialogue systems typically rely on components specifically developed for a single task or domain. This limits such systems in two ways: if the task domain is updated, the dialogue system usually needs to be updated or completely re-trained, and it is harder to extend such systems to different or multiple domains. The dialogue state tracker in conventional dialogue systems is one such component: it is usually designed to fit a well-defined application domain. For example, it is common for a state variable to be a categorical distribution over a manually predefined set of entities (Henderson et al., 2013), resulting in an inflexible and hard-to-extend dialogue system. In this paper, we propose a new approach for dialogue state tracking that can generalize well over multiple domains without incorporating any domain-specific knowledge. Under this framework, discrete dialogue state variables are learned independently, and no predefined set of possible values for the dialogue state variables is required. Furthermore, it enables adding arbitrary dialogue context as features and allows for multiple values to be associated with a single state variable. These characteristics make it much easier to expand the dialogue state space. We evaluate our framework using the widely used dialogue state tracking challenge data set (DSTC2) and show that it yields results competitive with the state of the art despite incorporating little domain knowledge. We also show that this framework can benefit from widely available external resources such as pre-trained word embeddings.

artificial intelligence, machine learning, natural language

1811.12891

Country: North America (0.28)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Discourse & Dialogue (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.72)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.46)

Simulated Tempering Langevin Monte Carlo II: An Improved Proof using Soft Markov Chain Decomposition

Ge, Rong, Lee, Holden, Risteski, Andrej

A key task in Bayesian machine learning is sampling from distributions that are only specified up to a partition function (i.e., constant of proportionality). One prevalent example of this is sampling posteriors in parametric distributions, such as latent-variable generative models. However, sampling (even very approximately) can be #P-hard. Classical results going back to Bakry and Émery (1985) on sampling focus on log-concave distributions, and show that a natural Markov chain called Langevin diffusion mixes in polynomial time. However, all log-concave distributions are uni-modal, while in practice it is very common for the distribution of interest to have multiple modes. In this case, Langevin diffusion suffers from torpid mixing. We address this problem by combining Langevin diffusion with simulated tempering. The result is a Markov chain that mixes more rapidly by transitioning between different temperatures of the distribution. We analyze this Markov chain for a mixture of (strongly) log-concave distributions of the same shape. In particular, our technique applies to the canonical multi-modal distribution: a mixture of Gaussians (of equal variance). Our algorithm efficiently samples from these distributions given only access to the gradient of the log-pdf. For the analysis, we introduce novel techniques for proving spectral gaps based on decomposing the action of the generator of the diffusion. Previous approaches rely on decomposing the state space as a partition of sets, while our approach can be thought of as decomposing the stationary measure as a mixture of distributions (a "soft partition"). Additional materials for the paper can be found at http://tiny.cc/glr17. The proof and results have been improved and generalized from the precursor at www.arxiv.org/abs/1710.02736.
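The combination of Langevin steps with temperature moves described in this abstract can be illustrated with a minimal one-dimensional sketch. It is not the authors' algorithm: the mixture in `MODES`, the temperature ladder `betas`, the step size `eta`, and the use of equal level weights (rather than estimated partition functions) are all assumptions made for illustration, and the unadjusted Euler discretization introduces some bias.

```python
import math
import random

# Hypothetical target: equal-weight mixture of two unit-variance Gaussians.
MODES = (-4.0, 4.0)

def log_p_and_grad(x):
    """Unnormalized log-density of the mixture and its gradient at x."""
    logs = [-0.5 * (x - mu) ** 2 for mu in MODES]
    m = max(logs)  # log-sum-exp shift for numerical stability
    weights = [math.exp(l - m) for l in logs]
    total = sum(weights)
    log_p = m + math.log(total)
    grad = sum(w * (mu - x) for w, mu in zip(weights, MODES)) / total
    return log_p, grad

def tempered_langevin(n_steps, betas=(0.05, 0.2, 0.5, 1.0), eta=0.05, seed=0):
    """Alternate Langevin steps with moves along a ladder of inverse temperatures.

    Returns the positions visited while the chain is at the target level
    (inverse temperature 1.0). Level weights are assumed equal.
    """
    rng = random.Random(seed)
    x, level = 0.0, 0
    samples = []
    for _ in range(n_steps):
        # Unadjusted Langevin step targeting p(x) ** beta.
        _, grad = log_p_and_grad(x)
        beta = betas[level]
        x += eta * beta * grad + math.sqrt(2.0 * eta) * rng.gauss(0.0, 1.0)

        # Propose a neighbouring temperature; accept with a Metropolis test.
        proposal = level + rng.choice((-1, 1))
        if 0 <= proposal < len(betas):
            log_p, _ = log_p_and_grad(x)
            log_ratio = (betas[proposal] - beta) * log_p
            if rng.random() < math.exp(min(0.0, log_ratio)):
                level = proposal

        if betas[level] == 1.0:
            samples.append(x)
    return samples

samples = tempered_langevin(2000)
```

At small inverse temperatures the flattened density lets the chain cross between modes, which is the mechanism the abstract credits for faster mixing; how often the chain sits at the target level depends on the assumed level weights.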

artificial intelligence, machine learning, markov chain

1812.00793

Country:

North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
Europe > Netherlands > North Holland > Amsterdam (0.04)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (1.00)

Early Stratification of Patients at Risk for Postoperative Complications after Elective Colectomy

Wang, Wen, Padman, Rema, Shah, Nirav

Stratifying patients at risk for postoperative complications may facilitate timely and accurate workups and reduce the burden of adverse events on patients and the health system. Currently, a widely used surgical risk calculator created by the American College of Surgeons, NSQIP, uses 21 preoperative covariates to assess risk of postoperative complications, but lacks dynamic, real-time capabilities to accommodate postoperative information. We propose a new Hidden Markov Model sequence classifier for analyzing patients' postoperative temperature sequences that incorporates their time-invariant characteristics in both transition probability and initial state probability in order to develop a postoperative "real-time" complication detector. Data from elective colectomy surgery indicate that our method has improved classification performance compared to 8 other machine learning classifiers when using the full temperature sequence associated with the patients' length of stay. Additionally, within 44 hours after surgery, the performance of the model is close to that of the full-length temperature sequence.

artificial intelligence, machine learning, sequence

1811.12227

Country: North America > United States (0.15)

Genre: Research Report (1.00)

Industry:

Health & Medicine > Surgery (1.00)
Health & Medicine > Therapeutic Area > Cardiology/Vascular Diseases (0.47)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.96)

Bhaskara, Aditya, Chen, Aidao, Perreault, Aidan, Vijayaraghavan, Aravindan

Smoothed Analysis in Unsupervised Learning via Decoupling

Smoothed analysis is a powerful paradigm in overcoming worst-case intractability in unsupervised learning and high-dimensional data analysis. While polynomial time smoothed analysis guarantees have been obtained for worst-case intractable problems like tensor decompositions and learning mixtures of Gaussians, such guarantees have been hard to obtain for several other important problems in unsupervised learning. A core technical challenge is obtaining lower bounds on the least singular value for random matrix ensembles with dependent entries that are given by low-degree polynomials of a few base underlying random variables. In this work, we address this challenge by obtaining high-confidence lower bounds on the least singular value of new classes of structured random matrix ensembles of the above kind. We then use these bounds to obtain polynomial time smoothed analysis guarantees for the following three important problems in unsupervised learning:
1. Robust subspace recovery, when the fraction $\alpha$ of inliers in the $d$-dimensional subspace $T \subset \mathbb{R}^n$ is at least $\alpha > (d/n)^\ell$ for any constant integer $\ell>0$. This contrasts with the known worst-case intractability when $\alpha < d/n$, and the previous smoothed analysis result which needed $\alpha > d/n$ (Hardt and Moitra, 2013).
2. Higher order tensor decompositions, where we generalize the so-called FOOBI algorithm of Cardoso to find order-$\ell$ rank-one tensors in a subspace. This allows us to obtain polynomially robust decomposition algorithms for $2\ell$'th order tensors with rank $O(n^{\ell})$.
3. Learning overcomplete hidden Markov models, where the size of the state space is any polynomial in the dimension of the observations. This gives the first polynomial time guarantees for learning overcomplete HMMs in a smoothed analysis model.

artificial intelligence, machine learning, matrix

1811.12361

Country: North America > United States (0.67)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Unsupervised or Indirectly Supervised Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (1.00)

Yokoyama, Yuuki, Katsumata, Tomu, Yasuda, Muneki

Restricted Boltzmann Machine with Multivalued Hidden Variables: a model suppressing over-fitting

Generalization is one of the most important issues in machine learning problems. In this paper, we consider generalization in restricted Boltzmann machines. We propose a restricted Boltzmann machine with multivalued hidden variables, which is a simple extension of conventional restricted Boltzmann machines. We demonstrate via numerical experiments that our model is better than the conventional one: experiments on contrastive divergence learning with artificial data and on a classification problem with MNIST.

artificial intelligence, machine learning, rbm

1811.12587

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (1.00)

arXiv.org Artificial Intelligence, Nov-29-2018

Transition-based versus State-based Reward Functions for MDPs with Value-at-Risk

Ma, Shuai, Yu, Jia Yuan

In reinforcement learning, a reward function defined on the current state and action is widely used. When the objective concerns only the expectation of the (discounted) total reward, it works perfectly. However, if the objective involves the total reward distribution, the result will be wrong. This paper studies Value-at-Risk (VaR) problems in short- and long-horizon Markov decision processes (MDPs) with two reward functions, which share the same expectations. First, we show that with a VaR objective, when the real reward function is transition-based (with respect to action and both current and next states), the simplified (state-based, with respect to action and current state only) reward function will change the VaR. Second, for long-horizon MDPs, we estimate the VaR function with the aid of spectral theory and the central limit theorem. Third, since the estimation method is for a Markov reward process with the reward function on current state only, we present a transformation algorithm for the Markov reward process with the reward function on current and next states, in order to estimate the VaR function with an intact total reward distribution.
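The claim that two reward functions with the same expectation can give different VaR is easy to see on a toy case. The sketch below uses a hypothetical one-step process (the probabilities and rewards in `transitions` are invented for illustration) and takes VaR at level `alpha` to be the lower `alpha`-quantile of the reward, which is one common convention and an assumption here rather than the paper's stated definition.

```python
def value_at_risk(dist, alpha):
    """Lower alpha-quantile inf{x : P(X <= x) >= alpha} of a discrete
    distribution given as a {value: probability} dict."""
    cumulative = 0.0
    for value in sorted(dist):
        cumulative += dist[value]
        if cumulative >= alpha - 1e-12:  # tolerance for float round-off
            return value
    return max(dist)

def mean(dist):
    """Expected value of a {value: probability} distribution."""
    return sum(value * prob for value, prob in dist.items())

# Hypothetical one-step process from a single state-action pair:
# (probability of the transition, reward received on that transition).
transitions = [(0.5, 0.0), (0.5, 2.0)]

# Transition-based reward: the reward depends on the next state.
transition_based = {}
for prob, reward in transitions:
    transition_based[reward] = transition_based.get(reward, 0.0) + prob

# State-based simplification: replace the reward by its expectation.
state_based = {mean(transition_based): 1.0}

var_transition = value_at_risk(transition_based, 0.25)
var_state = value_at_risk(state_based, 0.25)
```

Both distributions have mean 1.0, but at level 0.25 the transition-based reward has quantile 0.0 while the state-based one has quantile 1.0, so averaging out the next state changes the risk measure even though it preserves the expectation.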

artificial intelligence, machine learning, reinforcement learning

1612.02088

Country: North America > Canada (0.28)

Genre: Research Report (0.84)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.50)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.49)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.47)

Hoffman, Matthew D., Johnson, Matthew J., Tran, Dustin

Autoconj: Recognizing and Exploiting Conjugacy Without a Domain-Specific Language

arXiv.org Machine Learning, Nov-28-2018

Deriving conditional and marginal distributions using conjugacy relationships can be time-consuming and error-prone. In this paper, we propose a strategy for automating such derivations. Unlike previous systems, which focus on relationships between pairs of random variables, our system (which we call Autoconj) operates directly on Python functions that compute log-joint distribution functions. Autoconj provides support for conjugacy-exploiting algorithms in any Python-embedded PPL. This paves the way for accelerating development of novel inference algorithms and structure-exploiting modeling strategies.
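For readers unfamiliar with the kind of derivation being automated, here is a hand-written instance for the standard Beta-Bernoulli pair. It does not use Autoconj or reproduce its API; the function names and the example data are hypothetical. The log-joint is written as a plain Python function, and the check confirms that, as a function of `theta`, it matches the log-kernel of the Beta posterior obtained by the usual conjugate update.

```python
import math

def log_joint(theta, observations, a, b):
    """Log-joint of a Beta(a, b) prior on theta and Bernoulli observations,
    dropping additive constants that do not depend on theta."""
    log_prior = (a - 1.0) * math.log(theta) + (b - 1.0) * math.log(1.0 - theta)
    log_lik = sum(
        math.log(theta) if x else math.log(1.0 - theta) for x in observations
    )
    return log_prior + log_lik

def beta_log_kernel(theta, a, b):
    """Log of the unnormalized Beta(a, b) density at theta."""
    return (a - 1.0) * math.log(theta) + (b - 1.0) * math.log(1.0 - theta)

def beta_bernoulli_posterior(a, b, observations):
    """Conjugate update: add success and failure counts to the prior."""
    successes = sum(observations)
    failures = len(observations) - successes
    return a + successes, b + failures

observations = [1, 0, 1, 1]  # hypothetical data
a_post, b_post = beta_bernoulli_posterior(2.0, 3.0, observations)

# The log-joint and the posterior log-kernel should agree for every theta.
gaps = [
    log_joint(t, observations, 2.0, 3.0) - beta_log_kernel(t, a_post, b_post)
    for t in (0.1, 0.5, 0.9)
]
```

Doing this by hand requires spotting that the log-joint is linear in `log(theta)` and `log(1 - theta)`; recognizing that structure directly from a log-joint function is the task the abstract describes automating.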

artificial intelligence, machine learning, programming language

1811.11926

Country: North America > United States (0.46)

Genre: Research Report (0.64)

Industry: Energy (0.35)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.93)
Information Technology > Software > Programming Languages (0.69)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.46)

Roy, Arghyadip, Borkar, Vivek, Karandikar, Abhay, Chaporkar, Prasanna

A Structure-aware Online Learning Algorithm for Markov Decision Processes

arXiv.org Machine Learning, Nov-28-2018

To overcome the curse of dimensionality and curse of modeling in Dynamic Programming (DP) methods for solving classical Markov Decision Process (MDP) problems, Reinforcement Learning (RL) algorithms are popular. In this paper, we consider an infinite-horizon average reward MDP problem and prove the optimality of the threshold policy under certain conditions. Traditional RL techniques do not exploit the threshold nature of the optimal policy while learning. We propose a new RL algorithm which utilizes the known threshold structure of the optimal policy while learning by reducing the feasible policy space. We establish that the proposed algorithm converges to the optimal policy. It provides a significant improvement in convergence speed and computational and storage complexity over traditional RL algorithms. The proposed technique can be applied to a wide variety of optimization problems that include energy-efficient data transmission and management of queues. We exhibit the improvement in convergence speed of the proposed algorithm over other RL algorithms through simulations.

artificial intelligence, machine learning, reinforcement learning

1811.11646

Country: Europe (0.15)

Genre: Research Report (0.64)

Industry:

Energy (0.46)
Education > Educational Setting > Online (0.41)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.71)

Vlasov, Vladimir, Drissner-Schmid, Akela, Nichol, Alan

Few-Shot Generalization Across Dialogue Tasks

arXiv.org Artificial Intelligence, Nov-28-2018

Machine-learning based dialogue managers are able to learn complex behaviors in order to complete a task, but it is not straightforward to extend their capabilities to new domains. We investigate different policies' ability to handle uncooperative user behavior, and how well expertise in completing one task (such as restaurant reservations) can be reapplied when learning a new one (e.g. booking a hotel). We introduce the Recurrent Embedding Dialogue Policy (REDP), which embeds system actions and dialogue states in the same vector space. REDP contains a memory component and attention mechanism based on a modified Neural Turing Machine, and significantly outperforms a baseline LSTM classifier on this task. We also show that both our architecture and baseline solve the bAbI dialogue task, achieving 100% test accuracy.

artificial intelligence, machine learning, natural language

1811.11707

Country: North America (0.46)

Genre: Research Report (0.50)

Industry: Consumer Products & Services > Restaurants (0.47)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Discourse & Dialogue (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.75)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.48)