Goto

Collaborating Authors

 Personal Assistant Systems


An Ontological Model of User Preferences

arXiv.org Artificial Intelligence

The notion of preferences plays an important role in many disciplines including service robotics which is concerned with scenarios in which robots interact with humans. These interactions can be favored by robots taking human preferences into account. This raises the issue of how preferences should be represented to support such preference-aware decision making. Several formal accounts for a notion of preferences exist. However, these approaches fall short on defining the nature and structure of the options that a robot has in a given situation. In this work, we thus investigate a formal model of preferences where options are non-atomic entities that are defined by the complex situations they bring about.


A Spectral Approach to Item Response Theory

arXiv.org Machine Learning

The Rasch model is one of the most fundamental models in \emph{item response theory} and has wide-ranging applications from education testing to recommendation systems. In a universe with $n$ users and $m$ items, the Rasch model assumes that the binary response $X_{li} \in \{0,1\}$ of a user $l$ with parameter $\theta^*_l$ to an item $i$ with parameter $\beta^*_i$ (e.g., a user likes a movie, a student correctly solves a problem) is distributed as $\Pr(X_{li}=1) = 1/(1 + \exp{-(\theta^*_l - \beta^*_i)})$. In this paper, we propose a \emph{new item estimation} algorithm for this celebrated model (i.e., to estimate $\beta^*$). The core of our algorithm is the computation of the stationary distribution of a Markov chain defined on an item-item graph. We complement our algorithmic contributions with finite-sample error guarantees, the first of their kind in the literature, showing that our algorithm is consistent and enjoys favorable optimality properties. We discuss practical modifications to accelerate and robustify the algorithm that practitioners can adopt. Experiments on synthetic and real-life datasets, ranging from small education testing datasets to large recommendation systems datasets show that our algorithm is scalable, accurate, and competitive with the most commonly used methods in the literature.


Expanding the Set of Pragmatic Considerations in Conversational AI

arXiv.org Artificial Intelligence

Despite considerable performance improvements, current conversational AI systems often fail to meet user expectations. We discuss several pragmatic limitations of current conversational AI systems. We illustrate pragmatic limitations with examples that are syntactically appropriate, but have clear pragmatic deficiencies. We label our complaints as "Turing Test Triggers" (TTTs) as they indicate where current conversational AI systems fall short compared to human behavior. We develop a taxonomy of pragmatic considerations intended to identify what pragmatic competencies a conversational AI system requires and discuss implications for the design and evaluation of conversational AI systems.


Towards a Unified Conversational Recommendation System: Multi-task Learning via Contextualized Knowledge Distillation

arXiv.org Artificial Intelligence

In Conversational Recommendation System (CRS), an agent is asked to recommend a set of items to users within natural language conversations. To address the need for both conversational capability and personalized recommendations, prior works have utilized separate recommendation and dialogue modules. However, such approach inevitably results in a discrepancy between recommendation results and generated responses. To bridge the gap, we propose a multi-task learning for a unified CRS, where a single model jointly learns both tasks via Contextualized Knowledge Distillation (ConKD). We introduce two versions of ConKD: hard gate and soft gate. The former selectively gates between two task-specific teachers, while the latter integrates knowledge from both teachers. Our gates are computed on-the-fly in a context-specific manner, facilitating flexible integration of relevant knowledge. Extensive experiments demonstrate that our single model significantly improves recommendation performance while enhancing fluency, and achieves comparable results in terms of diversity.


ConvXAI: Delivering Heterogeneous AI Explanations via Conversations to Support Human-AI Scientific Writing

arXiv.org Artificial Intelligence

Despite a surge collection of XAI methods, users still struggle to obtain required AI explanations. Previous research suggests chatbots as dynamic solutions, but the effective design of conversational XAI agents for practical human needs remains under-explored. This paper focuses on Conversational XAI for AI-assisted scientific writing tasks. Drawing from human linguistic theories and formative studies, we identify four design rationales: "multifaceted", "controllability", "mix-initiative", "context-aware drill-down". We incorporate them into an interactive prototype, ConvXAI, which facilitates heterogeneous AI explanations for scientific writing through dialogue. In two studies with 21 users, ConvXAI outperforms a GUI-based baseline on improving human-perceived understanding and writing improvement. The paper further discusses the practical human usage patterns in interacting with ConvXAI for scientific co-writing.


Towards Understanding Sycophancy in Language Models

arXiv.org Machine Learning

Human feedback is commonly utilized to finetune AI assistants. But human feedback may also encourage model responses that match user beliefs over truthful ones, a behaviour known as sycophancy. We investigate the prevalence of sycophancy in models whose finetuning procedure made use of human feedback, and the potential role of human preference judgments in such behavior. We first demonstrate that five state-of-the-art AI assistants consistently exhibit sycophancy across four varied free-form text-generation tasks. To understand if human preferences drive this broadly observed behavior, we analyze existing human preference data. We find that when a response matches a user's views, it is more likely to be preferred. Moreover, both humans and preference models (PMs) prefer convincingly-written sycophantic responses over correct ones a non-negligible fraction of the time. Optimizing model outputs against PMs also sometimes sacrifices truthfulness in favor of sycophancy. Overall, our results indicate that sycophancy is a general behavior of state-of-the-art AI assistants, likely driven in part by human preference judgments favoring sycophantic responses.


Amazon's Echo Studio and Echo Sub bundle is 24 percent off right now

Engadget

If you've been looking for a high-end smart speaker with a subwoofer, Amazon has an interesting deal in its Echo lineup right now. It's got a bundle deal with the high-end Echo Studio and the Echo sub, with both on sale for $248 instead of $330, saving you $82 (24 percent). Those are a killer combo for home theater and more, particularly if you get a second Echo Studio to make a stereo pair. Amazon's Echo Studio and Echo Sub are on sale in a bundle if you want the ultimate smart speaker. The Echo Studio appears in our guide to the best smart speakers as an alternative to the Sonos Era 100 for those who already rely on Alexa.


Salespeople vs SalesBot: Exploring the Role of Educational Value in Conversational Recommender Systems

arXiv.org Artificial Intelligence

Making big purchases requires consumers to research or consult a salesperson to gain domain expertise. However, existing conversational recommender systems (CRS) often overlook users' lack of background knowledge, focusing solely on gathering preferences. In this work, we define a new problem space for conversational agents that aim to provide both product recommendations and educational value through mixed-type mixed-initiative dialog. We introduce SalesOps, a framework that facilitates the simulation and evaluation of such systems by leveraging recent advancements in large language models (LLMs). We build SalesBot and ShopperBot, a pair of LLM-powered agents that can simulate either side of the framework. A comprehensive human study compares SalesBot against professional salespeople, revealing that although SalesBot approaches professional performance in terms of fluency and informativeness, it lags behind in recommendation quality. We emphasize the distinct limitations both face in providing truthful information, highlighting the challenges of ensuring faithfulness in the CRS context. We release our code and make all data available.


Comparing Photorealistic and Animated Embodied Conversational Agents in Serious Games: An Empirical Study on User Experience

arXiv.org Artificial Intelligence

Embodied conversational agents (ECAs) are paradigms of conversational user interfaces in the form of embodied characters. While ECAs offer various manipulable features, this paper focuses on a study conducted to explore two distinct levels of presentation realism. The two agent versions are photorealistic and animated. The study aims to provide insights and design suggestions for speech-enabled ECAs within serious game environments. A within-subjects, two-by-two factorial design was employed for this research with a cohort of 36 participants balanced for gender. The results showed that both the photorealistic and the animated versions were perceived as highly usable, with overall mean scores of 5.76 and 5.71, respectively. However, 69.4 per cent of the participants stated they preferred the photorealistic version, 25 per cent stated they preferred the animated version and 5.6 per cent had no stated preference. The photorealistic agents were perceived as more realistic and human-like, while the animated characters made the task feel more like a game. Even though the agents' realism had no significant effect on usability, it positively influenced participants' perceptions of the agent. This research aims to lay the groundwork for future studies on ECA realism's impact in serious games across diverse contexts.


Multi-grained Hypergraph Interest Modeling for Conversational Recommendation

arXiv.org Artificial Intelligence

Conversational recommender system (CRS) interacts with users through multi-turn dialogues in natural language, which aims to provide high-quality recommendations for user's instant information need. Although great efforts have been made to develop effective CRS, most of them still focus on the contextual information from the current dialogue, usually suffering from the data scarcity issue. Therefore, we consider leveraging historical dialogue data to enrich the limited contexts of the current dialogue session. In this paper, we propose a novel multi-grained hypergraph interest modeling approach to capture user interest beneath intricate historical data from different perspectives. As the core idea, we employ hypergraph to represent complicated semantic relations underlying historical dialogues. In our approach, we first employ the hypergraph structure to model users' historical dialogue sessions and form a session-based hypergraph, which captures coarse-grained, session-level relations. Second, to alleviate the issue of data scarcity, we use an external knowledge graph and construct a knowledge-based hypergraph considering fine-grained, entity-level semantics. We further conduct multi-grained hypergraph convolution on the two kinds of hypergraphs, and utilize the enhanced representations to develop interest-aware CRS. Extensive experiments on two benchmarks ReDial and TG-ReDial validate the effectiveness of our approach on both recommendation and conversation tasks. Code is available at: https://github.com/RUCAIBox/MHIM.