Collaborating Authors

 Sekulić, Ivan


"Stupid robot, I want to speak to a human!" User Frustration Detection in Task-Oriented Dialog Systems

arXiv.org Artificial Intelligence

Detecting user frustration in modern-day task-oriented dialog (TOD) systems is imperative for maintaining overall user satisfaction, engagement, and retention. However, most recent research is focused on sentiment and emotion detection in academic settings, thus failing to fully capture the implications of real-world user data. To mitigate this gap, in this work, we focus on user frustration in a deployed TOD system, assessing the feasibility of out-of-the-box solutions for user frustration detection. Specifically, we compare the performance of our deployed keyword-based approach, open-source approaches to sentiment analysis, dialog breakdown detection methods, and emerging in-context learning LLM-based detection. Our analysis highlights the limitations of open-source methods for real-world frustration detection, while demonstrating the superior performance of the LLM-based approach, achieving a 16% relative improvement in F1 score on an internal benchmark. Finally, we analyze the advantages and limitations of our methods and provide insight into the user frustration detection task for industry practitioners.
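The two approaches the abstract contrasts can be illustrated with a minimal sketch: a lexicon-matching baseline in the spirit of the deployed keyword detector, and a prompt builder for in-context-learning classification with an LLM. The keyword list, threshold, and prompt format below are illustrative assumptions, not the authors' actual lexicon or prompts.

```python
# Illustrative keyword-based frustration detector (the keywords and
# threshold are assumptions for the sketch, not the deployed lexicon).
FRUSTRATION_KEYWORDS = [
    "stupid", "useless", "speak to a human", "real person",
    "not working", "this is ridiculous",
]

def keyword_frustration_score(utterance: str) -> int:
    """Count frustration keywords that appear in the user utterance."""
    text = utterance.lower()
    return sum(1 for kw in FRUSTRATION_KEYWORDS if kw in text)

def is_frustrated(utterance: str, threshold: int = 1) -> bool:
    """Flag the turn as frustrated if enough keywords match."""
    return keyword_frustration_score(utterance) >= threshold

def build_icl_prompt(few_shot: list, utterance: str) -> str:
    """Assemble an in-context-learning prompt for an LLM classifier
    from labeled example turns (the label names are illustrative)."""
    lines = ["Label each user turn as FRUSTRATED or NEUTRAL."]
    for text, label in few_shot:
        lines.append(f"Turn: {text}\nLabel: {label}")
    lines.append(f"Turn: {utterance}\nLabel:")
    return "\n".join(lines)
```

The keyword baseline is cheap and transparent but brittle to paraphrase ("get me an actual agent" matches nothing), which is one reason an LLM classifier conditioned on a few labeled turns can outperform it.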


Towards Self-Contained Answers: Entity-Based Answer Rewriting in Conversational Search

arXiv.org Artificial Intelligence

Conversational information-seeking (CIS) is an emerging paradigm for knowledge acquisition and exploratory search. Traditional web search interfaces enable easy exploration of entities, but this is limited in conversational settings due to the limited-bandwidth interface. This paper explores ways to rewrite answers in CIS, so that users can understand them without having to resort to external services or sources. Specifically, we focus on salient entities -- entities that are central to understanding the answer. As our first contribution, we create a dataset of conversations annotated with entities for saliency. Our analysis of the collected data reveals that the majority of answers contain salient entities. As our second contribution, we propose two answer rewriting strategies aimed at improving the overall user experience in CIS. One approach expands answers with inline definitions of salient entities, making the answer self-contained. The other approach complements answers with follow-up questions, offering users the possibility to learn more about specific entities. Results of a crowdsourcing-based study indicate that rewritten answers are clearly preferred over the original ones. We also find that inline definitions tend to be favored over follow-up questions, but this choice is highly subjective, thereby providing a promising future direction for personalization.
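The two rewriting strategies can be sketched in a few lines: inline expansion inserts a parenthetical definition after the first mention of each salient entity, while the follow-up strategy appends a question about the entity. The example entity, definition, and phrasing are illustrative assumptions, not the paper's templates.

```python
# Sketch of the two rewriting strategies; definitions and phrasing
# are illustrative, not the paper's actual templates.

def expand_with_definitions(answer: str, definitions: dict) -> str:
    """Insert a parenthetical definition after the first mention of
    each salient entity, making the answer self-contained."""
    for entity, definition in definitions.items():
        idx = answer.find(entity)
        if idx == -1:
            continue  # entity not mentioned in this answer
        end = idx + len(entity)
        answer = f"{answer[:end]} ({definition}){answer[end:]}"
    return answer

def append_followup(answer: str, entity: str) -> str:
    """Complement the answer with a follow-up question about a
    salient entity instead of defining it inline."""
    return f"{answer} Would you like to know more about {entity}?"
```

Inline definitions lengthen the answer but keep the user in the current turn; follow-up questions keep the answer short at the cost of an extra turn, which matches the subjective preference split the study reports.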


Reliable LLM-based User Simulator for Task-Oriented Dialogue Systems

arXiv.org Artificial Intelligence

The field of dialogue systems has seen a notable surge in the utilization of user simulation approaches, primarily for the evaluation and enhancement of conversational search systems (Owoicho et al., 2023) and task-oriented dialogue (TOD) systems (Terragni et al., 2023). User simulation plays a pivotal role in replicating the nuanced interactions of real users with these systems, enabling a wide range of applications such as synthetic data augmentation, error detection, and evaluation (Wan et al., 2022; Sekulić et al., 2022; Li et al., 2022; Balog and Zhai, 2023; Ji et al., 2022). In this paper, we introduce DAUS, a generative user simulator for TOD systems. As depicted in Figure 1, once initialized with the user goal description, DAUS engages with the system across multiple turns, providing information to fulfill the user's objectives. Our aim is to minimize the commonly observed user simulator hallucinations and incorrect responses (right-hand side of Figure 1), with an ultimate objective of enabling detection of common errors in TOD systems (left-hand side of Figure 1). Our approach is straightforward yet effective: we build upon the foundation of LLM-based user simulators.
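The simulation loop the excerpt describes — initialize with a goal description, then alternate user and system turns until the task concludes — can be sketched as below. `llm_generate` is a stub standing in for an actual LLM call, and the prompt format and termination check are illustrative assumptions, not DAUS's actual implementation.

```python
# Minimal sketch of a goal-conditioned user-simulator loop in the
# spirit of DAUS. All prompt wording and the stop condition are
# illustrative assumptions.

def llm_generate(prompt: str) -> str:
    # Stub: a real simulator would send this prompt to an LLM.
    return "I need a table for two at 7pm."

def simulate_dialogue(goal_description, system_turn_fn, max_turns=5):
    """Alternate simulated-user and system turns, conditioning the
    user turn on the goal description and dialogue history."""
    history = []
    for _ in range(max_turns):
        prompt = (
            f"Goal: {goal_description}\n"
            f"Dialogue so far: {history}\n"
            "User:"
        )
        user_utterance = llm_generate(prompt)
        history.append(("user", user_utterance))
        system_utterance = system_turn_fn(history)
        history.append(("system", system_utterance))
        if "goodbye" in system_utterance.lower():
            break  # illustrative termination heuristic
    return history
```

Conditioning every user turn on the full goal description is one simple way to reduce the hallucinated or contradictory slot values the excerpt mentions, since the simulator never has to recall the goal from earlier turns alone.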


Estimating the Usefulness of Clarifying Questions and Answers for Conversational Search

arXiv.org Artificial Intelligence

While the body of research directed towards constructing and generating clarifying questions in mixed-initiative conversational search systems is vast, research aimed at processing and comprehending users' answers to such questions is scarce. To this end, we present a simple yet effective method for processing answers to clarifying questions, moving away from previous work that simply appends answers to the original query and thus potentially degrades retrieval performance. Specifically, we propose a classifier for assessing the usefulness of the prompted clarifying question and the answer given by the user. Useful questions and answers are further appended to the conversation history and passed to a transformer-based query rewriting module. Results demonstrate significant improvements over strong non-mixed-initiative baselines. Furthermore, the proposed approach mitigates the performance drops incurred when non-useful questions and answers are utilized.
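The gating idea in the abstract — append the clarification exchange to the conversation history only when a classifier judges it useful — can be sketched as follows. The classifier here is a trivial heuristic stand-in (the paper uses a trained model), and the non-committal answer list is an illustrative assumption.

```python
# Sketch of usefulness-gated history construction. The heuristic
# classifier is a stand-in for the paper's trained model.

def usefulness_classifier(question: str, answer: str) -> bool:
    """Stand-in heuristic: treat empty or bare non-committal answers
    as not useful for query rewriting."""
    return answer.strip().lower() not in {"", "i don't know", "no", "yes"}

def update_history(history, query, question, answer):
    """Append the clarifying question and answer to the conversation
    history only when judged useful; the resulting history would then
    feed a query-rewriting module."""
    history = history + [query]
    if usefulness_classifier(question, answer):
        history += [question, answer]
    return history
```

Discarding non-useful exchanges is what protects the downstream query rewriter: a bare "no" appended to the query adds no retrievable terms and can shift the rewritten query away from the user's intent.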