How interactive characters should select appropriate dialogue remains an open issue in related research areas. In this paper we propose a reinforcement learning approach for learning the interrogation dialogue strategy that one virtual agent uses toward another. The emotion variation of the suspect agent is modeled with a hazard function, and the detective agent must learn its interrogation strategies based on the emotional state of the suspect agent. Reinforcement learning reward schemes are evaluated to choose a suitable reward for the dialogue.
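To make the notion of a hazard function concrete, the following is a minimal sketch that assumes a Weibull form. The abstract does not state which hazard function is used, so the Weibull choice, the function names, and the parameters `shape` and `scale` are illustrative assumptions and not the authors' model. The sketch shows how a hazard rate converts into the probability that the suspect's emotional state changes during one dialogue turn.

```python
import math


def weibull_hazard(t, shape, scale):
    """Instantaneous rate of change h(t) = (k/lam) * (t/lam)**(k-1), for t > 0.

    Assumption: a Weibull hazard; the source does not name a specific form.
    """
    return (shape / scale) * (t / scale) ** (shape - 1)


def weibull_survival(t, shape, scale):
    """S(t) = exp(-(t/lam)**k): probability that no change has occurred by time t."""
    return math.exp(-((t / scale) ** shape))


def change_probability(t, dt, shape, scale):
    """Probability of a state change in (t, t + dt], given no change up to t."""
    return 1.0 - weibull_survival(t + dt, shape, scale) / weibull_survival(t, shape, scale)


# Hypothetical usage: with shape > 1 the hazard grows over time, so a change of
# emotional state becomes more likely the longer the current state has lasted.
early = change_probability(1.0, 1.0, shape=2.0, scale=4.0)
late = change_probability(3.0, 1.0, shape=2.0, scale=4.0)
```

With `shape = 1` the hazard is constant and the model reduces to a memoryless (exponential) process, which is a convenient baseline to compare against.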
This article provides the first survey of computational models of emotion in reinforcement learning (RL) agents. The survey focuses on agent/robot emotions, and mostly ignores human user emotions. Emotions are recognized as functional in decision-making by influencing motivation and action selection. Therefore, computational emotion models are usually grounded in the agent's decision-making architecture, of which RL is an important subclass. Studying emotions in RL-based agents is useful for three research fields. For machine learning (ML) researchers, emotion models may improve learning efficiency. For the interactive ML and human-robot interaction (HRI) community, emotions can communicate state and enhance user investment. Lastly, it allows affective modelling (AM) researchers to investigate their emotion theories in a successful AI agent class. This survey provides background on emotion theory and RL. It systematically addresses 1) from what underlying dimensions (e.g., homeostasis, appraisal) emotions can be derived and how these can be modelled in RL agents, 2) what types of emotions have been derived from these dimensions, and 3) how these emotions may either influence the learning efficiency of the agent or be useful as social signals. We also systematically compare evaluation criteria, and draw connections to important RL sub-domains like (intrinsic) motivation and model-based RL. In short, this survey provides a practical overview for engineers wanting to implement emotions in their RL agents, and identifies challenges and directions for future emotion-RL research.
Designing dialog policies for voice-enabled interfaces is a tailoring job that is most often left to natural language processing experts. This job is generally redone for every new dialog task because cross-domain transfer is not possible. For this reason, machine learning methods for dialog policy optimization have been investigated during the last 15 years. In particular, reinforcement learning (RL) is now part of the state of the art in this domain. Standard RL methods require testing more or less random changes to the policy on users in order to assess them as improvements or degradations. This is called on-policy learning. However, it can result in system behaviors that are not acceptable to users. Learning algorithms should ideally infer an optimal strategy by observing interactions generated by a non-optimal but acceptable strategy, that is, by learning off-policy. In this contribution, a sample-efficient, online and off-policy reinforcement learning algorithm is proposed to learn an optimal policy from a few hundred dialogues generated with a very simple handcrafted policy.
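The off-policy idea described above can be illustrated with plain tabular Q-learning run over a fixed log of dialogue turns. This is a minimal sketch and not the algorithm proposed in the abstract, which is not specified there; the state names, action names, and reward values are hypothetical. The point it shows is that the greedy policy is learned purely from transitions produced by a different, handcrafted behavior policy, without trying exploratory actions on users.

```python
from collections import defaultdict


def q_learning_from_logs(transitions, actions, alpha=0.5, gamma=0.9, sweeps=50):
    """Tabular Q-learning over a fixed set of logged transitions.

    Each transition is (state, action, reward, next_state, done). The update
    bootstraps from the best next action, not from the action the logging
    policy actually took, which is what makes the method off-policy.
    """
    q = defaultdict(float)  # unseen (state, action) pairs default to 0.0
    for _ in range(sweeps):
        for state, action, reward, next_state, done in transitions:
            if done:
                target = reward
            else:
                target = reward + gamma * max(q[(next_state, a)] for a in actions)
            q[(state, action)] += alpha * (target - q[(state, action)])
    return q


def greedy_policy(q, states, actions):
    """Pick the highest-valued action in each state."""
    return {s: max(actions, key=lambda a: q[(s, a)]) for s in states}


# Hypothetical log from a simple handcrafted dialogue policy.
ACTIONS = ["ask", "confirm", "close"]
logged = [
    ("start", "ask", -1.0, "slot_filled", False),
    ("slot_filled", "confirm", -1.0, "slot_filled", False),
    ("slot_filled", "close", 10.0, "end", True),
    ("start", "close", -10.0, "end", True),
]

q_values = q_learning_from_logs(logged, ACTIONS)
policy = greedy_policy(q_values, ["start", "slot_filled"], ACTIONS)
```

One caveat of this naive version is that actions never seen in the log keep their default value of zero, so with mostly negative rewards they could look better than they are; sample-efficient batch methods handle this more carefully.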
Lubis, Nurul (Nara Institute of Science and Technology) | Sakti, Sakriani (Nara Institute of Science and Technology) | Yoshino, Koichiro (Nara Institute of Science and Technology) | Nakamura, Satoshi (Nara Institute of Science and Technology)
An emotionally competent computer agent could be a valuable assistive technology for various affective tasks, such as caring for the elderly, low-cost ubiquitous chat therapy, and providing emotional support in general, by promoting a more positive emotional state through dialogue system interaction. However, despite increasing interest in this task, existing works face a number of shortcomings: system scalability, restrictive modeling, and weak emphasis on maximizing user emotional experience. In this paper, we build a fully data-driven chat-oriented dialogue system that can dynamically mimic affective human interactions by utilizing a neural network architecture. In particular, we propose a sequence-to-sequence response generator that considers the emotional context of the dialogue. An emotion encoder is trained jointly with the entire network to encode and maintain the emotional context throughout the dialogue. The encoded emotion information is then incorporated in the response generation process. We train the network with a dialogue corpus, collected through crowd-sourcing, that contains responses eliciting positive emotion. Objective evaluation shows that incorporating emotion into the training process helps reduce the perplexity of the generated responses, even when a small dataset is used. Subsequent subjective evaluation shows that the proposed method produces responses that are more natural and more likely to elicit a positive emotion.
Developing intelligent persuasive conversational agents to change people's opinions and actions for social good is the frontier in advancing the ethical development of automated dialogue systems. To do so, the first step is to understand the intricate organization of strategic disclosures and appeals employed in human persuasion conversations. We designed an online persuasion task where one participant was asked to persuade the other to donate to a specific charity. We collected a large dataset with 1,017 dialogues and annotated emerging persuasion strategies from a subset. Based on the annotation, we built a baseline classifier with context information and sentence-level features to predict the 10 persuasion strategies used in the corpus. Furthermore, to develop an understanding of personalized persuasion processes, we analyzed the relationships between individuals' demographic and psychological backgrounds, including personality, morality, and value systems, and their willingness to donate. Then, we analyzed which types of persuasion strategies led to larger donations depending on the individuals' personal backgrounds. This work lays the groundwork for developing a personalized persuasive dialogue system.