Goto

Collaborating Authors

 spoken dialogue system


EmoNews: A Spoken Dialogue System for Expressive News Conversations

arXiv.org Artificial Intelligence

We develop a task-oriented spoken dialogue system (SDS) that regulates emotional speech based on contextual cues to enable more empathetic news conversations. Despite advancements in emotional text-to-speech (TTS) techniques, task-oriented emotional SDSs remain underexplored due to the compartmentalized nature of SDS and emotional TTS research, as well as the lack of standardized evaluation metrics for social goals. We address these challenges by developing an emotional SDS for news conversations that utilizes a large language model (LLM)-based sentiment analyzer to identify appropriate emotions and PromptTTS to synthesize context-appropriate emotional speech. We also propose subjective evaluation scale for emotional SDSs and judge the emotion regulation performance of the proposed and baseline systems. Experiments showed that our emotional SDS outperformed a baseline system in terms of the emotion regulation and engagement. These results suggest the critical role of speech emotion for more engaging conversations. All our source code is open-sourced at https://github.com/dhatchi711/espnet-emotional-news/tree/emo-sds/egs2/emo_news_sds/sds1


Reinforcement Learning for Spoken Dialogue Systems

Neural Information Processing Systems

Recently, a number of authors have proposed treating dialogue systems as Markov decision processes (MDPs). However, the practical application ofMDP algorithms to dialogue systems faces a number of severe technical challenges. We have built a general software tool (RLDS, for Reinforcement Learning for Dialogue Systems) based on the MDP framework, and have applied it to dialogue corpora gathered from two dialogue systems built at AT&T Labs. Our experiments demonstrate that RLDS holds promise as a tool for "browsing" and understanding correlations in complex, temporally dependent dialogue corpora.


Interactivism in Spoken Dialogue Systems

arXiv.org Artificial Intelligence

The interactivism model introduces a dynamic approach to language, communication and cognition. In this work, we explore this fundamental theory in the context of dialogue modelling for spoken dialogue systems (SDS). To extend such a theoretical framework, we present a set of design principles which adhere to central psycholinguistic and communication theories to achieve interactivism in SDS. From these, key ideas are linked to constitute the basis of our proposed design principles.



Neural User Simulation for Corpus-based Policy Optimisation for Spoken Dialogue Systems

arXiv.org Artificial Intelligence

User Simulators are one of the major tools that enable offline training of task-oriented dialogue systems. For this task the Agenda-Based User Simulator (ABUS) is often used. The ABUS is based on hand-crafted rules and its output is in semantic form. Issues arise from both properties such as limited diversity and the inability to interface a text-level belief tracker. This paper introduces the Neural User Simulator (NUS) whose behaviour is learned from a corpus and which generates natural language, hence needing a less labelled dataset than simulators generating a semantic output. In comparison to much of the past work on this topic, which evaluates user simulators on corpus-based metrics, we use the NUS to train the policy of a reinforcement learning based Spoken Dialogue System. The NUS is compared to the ABUS by evaluating the policies that were trained using the simulators. Cross-model evaluation is performed i.e. training on one simulator and testing on the other. Furthermore, the trained policies are tested on real users. In both evaluation tasks the NUS outperformed the ABUS.


Reward-Balancing for Statistical Spoken Dialogue Systems using Multi-objective Reinforcement Learning

arXiv.org Machine Learning

Reinforcement learning is widely used for dialogue policy optimization where the reward function often consists of more than one component, e.g., the dialogue success and the dialogue length. In this work, we propose a structured method for finding a good balance between these components by searching for the optimal reward component weighting. To render this search feasible, we use multi-objective reinforcement learning to significantly reduce the number of training dialogues required. We apply our proposed method to find optimized component weights for six domains and compare them to a default baseline.


Toward Habitable Assistance from Spoken Dialogue Systems

AAAI Conferences

Spoken dialogue is increasingly central to systems that assist people. As the tasks that people and machines speak about together become more complex, however, usersโ€™ dissatisfaction with those systems is an important concern. This paper presents a novel approach to learning for spoken dialogue systems. It describes embedded wizardry, a methodology for learning from skilled people, and applies it to a library whose patrons order books by telephone. To address the challenges inherent in this application, we introduce RFW+, a domain-independent, feature-selection method that considers feature categories. Models learned with RFW+ on embedded-wizard data improve the performance of a traditional spoken dialogue system.


Automatically Training a Problematic Dialogue Predictor for a Spoken Dialogue System

Journal of Artificial Intelligence Research

Spoken dialogue systems promise efficient and natural access to a large variety of information sources and services from any phone. However, current spoken dialogue systems are deficient in their strategies for preventing, identifying and repairing problems that arise in the conversation. This paper reports results on automatically training a Problematic Dialogue Predictor to predict problematic human-computer dialogues using a corpus of 4692 dialogues collected with the 'How May I Help You' (SM) spoken dialogue system. The Problematic Dialogue Predictor can be immediately applied to the system's decision of whether to transfer the call to a human customer care agent, or be used as a cue to the system's dialogue manager to modify its behavior to repair problems, and even perhaps, to prevent them. We show that a Problematic Dialogue Predictor using automatically-obtainable features from the first two exchanges in the dialogue can predict problematic dialogues 13.2% more accurately than the baseline.


Reinforcement Learning for Spoken Dialogue Systems

Neural Information Processing Systems

Recently, a number of authors have proposed treating dialogue systems as Markov decision processes (MDPs). However, the practical application ofMDP algorithms to dialogue systems faces a number of severe technical challenges. We have built a general software tool (RLDS, for Reinforcement Learning for Dialogue Systems) based on the MDP framework, and have applied it to dialogue corpora gathered from two dialogue systems built at AT&T Labs. Our experiments demonstrate that RLDS holds promise as a tool for "browsing" and understanding correlations in complex, temporally dependent dialogue corpora.


Reinforcement Learning for Spoken Dialogue Systems

Neural Information Processing Systems

Recently, a number of authors have proposed treating dialogue systems as Markov decision processes (MDPs). However, the practical application ofMDP algorithms to dialogue systems faces a number of severe technical challenges. We have built a general software tool (RLDS, for Reinforcement Learning for Dialogue Systems) based on the MDP framework, and have applied it to dialogue corpora gathered from two dialogue systems built at AT&T Labs. Our experiments demonstrate that RLDS holds promise as a tool for "browsing" and understanding correlations in complex, temporally dependent dialogue corpora.