AITopics

1911.03853

Country: Asia > India > West Bengal > Kharagpur (0.05)

Genre: Research Report (0.50)

Industry: Government > Voting & Elections (0.31)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Machine Translation (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.69)

Fazel-Zarandi, Maryam, Wang, Longshaokan, Tiwari, Aditya, Matsoukas, Spyros

Investigation of Error Simulation Techniques for Learning Dialog Policies for Conversational Error Recovery

arXiv.org Artificial Intelligence, Nov-8-2019

Training dialog policies for speech-based virtual assistants requires a plethora of conversational data. The data collection phase is often expensive and time-consuming due to human involvement. To address this issue, a common solution is to build user simulators for data generation. For the successful deployment of the trained policies into real-world domains, it is vital that the user simulator mimics realistic conditions. In particular, speech-based assistants are heavily affected by automatic speech recognition and language understanding errors, hence the user simulator should be able to simulate similar errors. In this paper, we review the existing error simulation methods that induce errors at the audio, phoneme, text, or semantic level, and conduct detailed comparisons between the audio-level and text-level methods. In the process, we improve the existing text-level method by introducing confidence score prediction and out-of-vocabulary word mapping. We also explore the impact of audio-level and text-level methods on learning a simple clarification dialog policy for recovering from errors, to provide insight into future improvements for both approaches.

asr output, hypothesis, text-level method, (15 more...)

1911.03378

Country:

North America > United States > California > San Francisco County > San Francisco (0.14)
North America > Canada (0.04)
North America > United States > North Carolina (0.04)
(2 more...)

Genre: Research Report > Experimental Study (0.68)

Technology:

Information Technology > Artificial Intelligence > Speech > Speech Recognition (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Discourse & Dialogue (1.00)
(2 more...)

Gerum, Richard C., Erpenbeck, André, Krauss, Patrick, Schilling, Achim

Sparsity through evolutionary pruning prevents neuronal networks from overfitting

arXiv.org Artificial Intelligence, Nov-7-2019

Modern machine learning techniques take advantage of the exponentially rising computational power of new-generation processor units. As a result, the number of parameters trained to solve complex tasks has increased greatly over the last decades. However, in contrast to our brain, the networks still fail to develop general intelligence in the sense of being able to solve several complex tasks with only one network architecture. This could be because the brain is not a randomly initialized neural network that has to be trained simply by investing a lot of computational power, but has some fixed hierarchical structure from birth. To make progress in decoding the structural basis of biological neural networks, we chose a bottom-up approach in which we evolutionarily trained small neural networks to perform a maze task. This simple maze task requires dynamical decision making with delayed rewards. We were able to show that, during the evolutionary optimization, random severance of connections leads to better generalization performance compared to fully connected networks. We conclude that sparsity is a central property of neural networks and should be considered in modern machine learning approaches.

evolutionary pruning prevent neuronal network, fitness, neural network, (10 more...)

1911.10988

Country:

North America > United States (0.28)
Europe > Germany > Bavaria > Middle Franconia > Nuremberg (0.14)
Asia > Middle East > Israel > Tel Aviv District > Tel Aviv (0.04)

Genre: Research Report (1.00)

Industry: Health & Medicine > Therapeutic Area > Neurology (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (1.00)

Ponnusamy, Pragaash, Ghias, Alireza Roshan, Guo, Chenlei, Sarikaya, Ruhi

Feedback-Based Self-Learning in Large-Scale Conversational AI Agents

arXiv.org Artificial Intelligence, Nov-6-2019

Today, most large-scale conversational AI agents (e.g. Alexa, Siri, or Google Assistant) are built using manually annotated data to train the different components of the system. Typically, the accuracy of the ML models in these components is improved by manually transcribing and annotating data. As the scope of these systems increases to cover more scenarios and domains, manual annotation to improve the accuracy of these components becomes prohibitively costly and time-consuming. In this paper, we propose a system that leverages user-system interaction feedback signals to automate learning without any manual annotation. Users tend to modify a previous query in the hope of fixing an error in the previous turn and getting the right results. These reformulations are often preceded by defective experiences caused by errors in ASR, NLU, ER or the application. In some cases, users may not properly formulate their requests (e.g. providing a partial title of a song), but gleaning across a wider pool of users and sessions reveals the underlying recurrent patterns. Our proposed self-learning system automatically detects the errors, generates reformulations and deploys fixes to the runtime system to correct different types of errors occurring in different components of the system. In particular, we propose leveraging an absorbing Markov Chain model as a collaborative filtering mechanism in a novel attempt to mine these patterns. We show that our approach is highly scalable, and able to learn reformulations that reduce Alexa-user errors by pooling anonymized data across millions of customers. The proposed self-learning system achieves a win/loss ratio of 11.8 and effectively reduces the defect rate by more than 30% on utterance-level reformulations in our production A/B tests. To the best of our knowledge, this is the first self-learning large-scale conversational AI system in production.
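To make the absorbing Markov chain idea concrete, here is a minimal toy sketch, not the system described in the abstract. It assumes that utterances are states, that reformulations are transitions, and that a successful utterance is an absorbing state; the state names (`u1`, `u2`, `fix_a`, `fix_b`), the probabilities, and both function names are hypothetical.

```python
def absorption_probabilities(transitions, absorbing, iterations=200):
    """Estimate P(end in each absorbing state | start state).

    transitions maps each transient state to {next_state: probability}.
    Uses the fixed-point iteration B <- R + Q B, which converges for an
    absorbing chain because every transient state eventually leaks out.
    """
    probs = {s: {a: 0.0 for a in absorbing} for s in transitions}
    for _ in range(iterations):
        updated = {}
        for state, row in transitions.items():
            acc = {a: 0.0 for a in absorbing}
            for nxt, p in row.items():
                if nxt in absorbing:
                    acc[nxt] += p                      # absorbed in one step
                else:
                    for a in absorbing:
                        acc[a] += p * probs[nxt][a]    # absorbed later, via nxt
            updated[state] = acc
        probs = updated
    return probs


def best_rewrite(probs, state):
    """Pick the absorbing state that `state` most likely ends in."""
    return max(probs[state], key=probs[state].get)


# Hypothetical toy chain: two defective utterances, two successful ones.
absorbing = ["fix_a", "fix_b"]
transitions = {
    "u1": {"u2": 0.5, "fix_a": 0.5},
    "u2": {"u1": 0.5, "fix_b": 0.5},
}
probs = absorption_probabilities(transitions, absorbing)
print(best_rewrite(probs, "u1"), best_rewrite(probs, "u2"))
```

In this toy chain, `u1` ends in `fix_a` with probability 2/3 and `u2` ends in `fix_b` with probability 2/3, so each defective utterance would be rewritten to the success state its reformulation paths most often reach.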

reformulation, rewrite, utterance, (15 more...)

1911.02557

Country: North America > United States (0.04)

Genre: Research Report (0.64)

Industry: Information Technology (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Personal Assistant Systems (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.91)

Ness, Robert Osazuwa, Paneri, Kaushal, Vitek, Olga

Integrating Markov processes with structural causal modeling enables counterfactual inference in complex systems

arXiv.org Machine Learning, Nov-5-2019

This manuscript contributes a general and practical framework for casting a Markov process model of a system at equilibrium as a structural causal model, and carrying out counterfactual inference. Markov processes mathematically describe the mechanisms in the system, and predict the system's equilibrium behavior upon intervention, but do not support counterfactual inference. In contrast, structural causal models support counterfactual inference, but do not identify the mechanisms. This manuscript leverages the benefits of both approaches. We define the structural causal models in terms of the parameters and the equilibrium dynamics of the Markov process models, and counterfactual inference flows from these settings. The proposed approach alleviates the identifiability drawback of the structural causal models, in that the counterfactual inference is consistent with the counterfactual trajectories simulated from the Markov process model. We showcase the benefits of this framework in case studies of complex biomolecular systems with nonlinear dynamics. We illustrate that, in the presence of Markov process model misspecification, counterfactual inference leverages prior data, and therefore estimates the outcome of an intervention more accurately than a direct simulation.

counterfactual inference, intervention, markov process model, (14 more...)

1911.02175

Country: Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre: Research Report (0.50)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (0.93)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (1.00)

#artificialintelligence, Nov-4-2019, 17:06:34 GMT

How Smart Are AI Chips, Really?

The best part about the term "Artificial Intelligence" is that nobody can really tell you what it exactly means. The main reason for this stems from the term "intelligence", with definitions ranging from the ability to practice logical reasoning to the ability to perform cognitive tasks or dream up symphonies. When it comes to human intelligence, properties such as self-awareness, complex cognitive feats, and the ability to plan and motivate oneself are generally considered to be defining features. But frankly, what is and isn't "intelligence" is open to debate. What isn't open to debate is that AI is a marketing goldmine.

intelligence, neural network, processor, (14 more...)

#artificialintelligence

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.77)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.30)

Chowdhury, Sayak Ray, Gopalan, Aditya

On Online Learning in Kernelized Markov Decision Processes

arXiv.org Machine Learning, Nov-4-2019

Abstract: We develop algorithms with low regret for learning episodic Markov decision processes based on kernel approximation techniques. The algorithms are based on both the Upper Confidence Bound (UCB) as well as Posterior or Thompson Sampling (PSRL) philosophies, and work in the general setting of continuous state and action spaces when the true unknown transition dynamics are assumed to have smoothness induced by an appropriate Reproducing Kernel Hilbert Space (RKHS).

I. INTRODUCTION

The goal of reinforcement learning (RL) is to learn optimal behavior by repeated interaction with an unknown environment, usually modeled as a Markov Decision Process (MDP). Performance is typically measured by the amount of interaction, in terms of episodes or rounds, needed to arrive at an optimal (or near-optimal) policy; this is also known as the sample complexity of RL [1]. The sample complexity objective encourages efficient exploration across states and actions, but, at the same time, is indifferent to the reward earned during the learning phase.

kernel, neural information processing system, probability, (13 more...)

1911.01871

Country:

Asia > India > Karnataka > Bengaluru (0.04)
Europe > France > Île-de-France > Paris > Paris (0.04)

Genre: Research Report (0.50)

Industry: Education > Educational Setting > Online (0.50)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (1.00)

Titsias, Michalis K., Dellaportas, Petros

Gradient-based Adaptive Markov Chain Monte Carlo

arXiv.org Machine Learning, Nov-4-2019

We introduce a gradient-based learning method to automatically adapt Markov chain Monte Carlo (MCMC) proposal distributions to intractable targets. We define a maximum entropy regularised objective function, referred to as generalised speed measure, which can be robustly optimised over the parameters of the proposal distribution by applying stochastic gradient optimisation. An advantage of our method compared to traditional adaptive MCMC methods is that the adaptation occurs even when candidate state values are rejected. This is a highly desirable property of any adaptation strategy because the adaptation starts in early iterations even if the initial proposal distribution is far from optimum. We apply the framework for learning multivariate random walk Metropolis and Metropolis-adjusted Langevin proposals with full covariance matrices, and provide empirical evidence that our method can outperform other MCMC algorithms, including Hamiltonian Monte Carlo schemes.
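For orientation, the following is a minimal sketch of a plain random walk Metropolis sampler with a fixed Gaussian proposal, which is the kind of proposal the abstract says the method learns to adapt. It is not the adaptive scheme of the paper: the step size here is a hand-picked constant, and the function name, the standard normal target, and all parameter values are assumptions made for illustration.

```python
import math
import random


def random_walk_metropolis(log_target, x0, step, n_samples, seed=0):
    """Sample a 1-D target with a fixed-width Gaussian random-walk proposal.

    Returns (samples, acceptance_rate). `step` is the proposal standard
    deviation; an adaptive method would tune it, here it stays constant.
    """
    rng = random.Random(seed)
    x, logp = x0, log_target(x0)
    samples, accepted = [], 0
    for _ in range(n_samples):
        proposal = x + rng.gauss(0.0, step)
        logp_new = log_target(proposal)
        # Metropolis rule: accept with probability min(1, p_new / p_old).
        if rng.random() < math.exp(min(0.0, logp_new - logp)):
            x, logp = proposal, logp_new
            accepted += 1
        samples.append(x)  # a rejected proposal repeats the current state
    return samples, accepted / n_samples


def log_standard_normal(x):
    """Unnormalised log-density of N(0, 1); the constant cancels in the ratio."""
    return -0.5 * x * x


samples, acceptance_rate = random_walk_metropolis(
    log_standard_normal, x0=0.0, step=1.0, n_samples=20000
)
sample_mean = sum(samples) / len(samples)
sample_var = sum((s - sample_mean) ** 2 for s in samples) / len(samples)
print(round(acceptance_rate, 2), round(sample_mean, 2), round(sample_var, 2))
```

With a fixed proposal, a rejected candidate contributes nothing to tuning. The abstract's point is that its gradient-based objective still yields an adaptation signal in that case.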

algorithm, iteration, proposal distribution, (14 more...)

1911.01373

Country:

Europe > United Kingdom > England > Greater London > London (0.04)
Asia > Middle East > Jordan (0.04)
North America > United States > New Jersey > Hudson County > Secaucus (0.04)
(4 more...)

Genre:

Research Report (0.50)
Instructional Material (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.62)

Keramati, Ramtin, Dann, Christoph, Tamkin, Alex, Brunskill, Emma

Being Optimistic to Be Conservative: Quickly Learning a CVaR Policy

arXiv.org Artificial Intelligence, Nov-4-2019

Ramtin Keramati (1), Christoph Dann (2), Alex Tamkin (3), Emma Brunskill (3). Affiliations: (1) Institute of Computational and Mathematical Engineering (ICME), Stanford University, California, USA; (2) Machine Learning Department, Carnegie Mellon University, Pittsburgh, Pennsylvania, USA; (3) Department of Computer Science, Stanford University, California, USA. Contact: {keramati,atamkin,ebrun}@cs.stanford.edu

Abstract: While maximizing expected return is the goal in most reinforcement learning approaches, risk-sensitive objectives such as conditional value at risk (CVaR) are more suitable for many high-stakes applications. However, relatively little is known about how to explore to quickly learn policies with good CVaR. In this paper, we present the first algorithm for sample-efficient learning of CVaR-optimal policies in Markov decision processes based on the optimism in the face of uncertainty principle. This method relies on a novel optimistic version of the distributional Bellman operator that moves probability mass from the lower to the upper tail of the return distribution. We prove asymptotic convergence and optimism of this operator for the tabular policy evaluation case. We further demonstrate that our algorithm finds CVaR-optimal policies substantially faster than existing baselines in several simulated environments with discrete and continuous state spaces.

Introduction: A key goal in reinforcement learning (RL) is to quickly learn to make good decisions by interacting with an environment. In most cases the quality of the decision policy is evaluated with respect to its expected (discounted) sum of rewards. However, in many interesting cases, it is important to consider the full distribution over the potential sum of rewards, and the desired objective may be a risk-sensitive measure of this distribution. For example, a patient undergoing knee replacement surgery will (hopefully) only experience that procedure once or twice, and may well be interested in the distribution of potential results for a single procedure, rather than what may happen on average if he or she were to undertake that procedure hundreds of times. Finance and (machine) control are other cases where interest in risk-sensitive outcomes is common. A popular risk-sensitive measure of a distribution of outcomes is the Conditional Value at Risk (CVaR) (Artzner et al. 1999). Intuitively, CVaR is the expected reward in the worst α-fraction of outcomes, and has seen extensive use in financial portfolio optimization (Zhu and Fukushima 2009), often under the name "expected shortfall".
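The intuitive definition above (the expected reward in the worst α-fraction of outcomes) can be sketched as an empirical estimate over sampled returns. This is only an illustration of the risk measure, not the paper's learning algorithm; the function name, the rounding of the tail size with `ceil`, and the sample returns are assumptions.

```python
import math


def empirical_cvar(returns, alpha):
    """Mean of the worst ceil(alpha * n) returns (the lower tail)."""
    if not returns:
        raise ValueError("returns must be non-empty")
    if not 0.0 < alpha <= 1.0:
        raise ValueError("alpha must be in (0, 1]")
    ordered = sorted(returns)                    # ascending: worst outcomes first
    k = max(1, math.ceil(alpha * len(ordered)))  # size of the worst alpha-fraction
    tail = ordered[:k]
    return sum(tail) / len(tail)


# Hypothetical returns from ten episodes of some policy.
returns = [10.0, 12.0, -5.0, 8.0, 11.0, -20.0, 9.0, 10.0, 7.0, 13.0]
mean_return = sum(returns) / len(returns)
cvar_20 = empirical_cvar(returns, 0.2)           # average of the two worst returns
print(mean_return, cvar_20)                      # 5.5 -12.5
```

The gap between the mean (5.5) and the 20% CVaR (-12.5) is what a risk-sensitive objective reacts to: two policies with the same average return can differ sharply in their worst-case tail.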

algorithm, exploration, operator, (16 more...)

1911.01546

Country:

Asia > Japan > Honshū > Tōhoku > Fukushima Prefecture > Fukushima (0.24)
North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.24)
North America > United States > California > Santa Clara County > Palo Alto (0.24)

Genre: Research Report > New Finding (0.46)

Industry:

Health & Medicine > Therapeutic Area > Immunology (1.00)
Health & Medicine > Therapeutic Area > Endocrinology > Diabetes (1.00)
Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (0.97)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.48)