3 New Tricks to Try With Google Gemini Live After Its Latest Major Upgrade

WIRED

Google's AI is now even smarter and more versatile. Gemini Live is the more conversational, natural-language way of interacting with the Google Gemini AI bot using your voice. The idea is that you chat with it like you would chat with a friend, interruptions and all, even if the actual answers are the same as you'd get from typing your queries into Gemini as normal. Now, about a year and a half after its debut, Gemini Live has been given what Google is describing as its "biggest update ever." The update makes the Gemini Live mode even more natural and even more conversational than before, with a better understanding of tone, nuance, pronunciation, and rhythm.


AI-Assisted Conversational Interviewing: Effects on Data Quality and Respondent Experience

Barari, Soubhik, Angbazo, Jarret, Wang, Natalie, Christian, Leah M., Dean, Elizabeth, Slowinski, Zoe, Sepulvado, Brandon

arXiv.org Artificial Intelligence

Standardized surveys scale efficiently but sacrifice depth, while conversational interviews improve response quality at the cost of scalability and consistency. This study bridges the gap between these methods by introducing a framework for AI-assisted conversational interviewing. To evaluate this framework, we conducted a web survey experiment in which 1,800 participants were randomly assigned to AI "chatbots" that use large language models (LLMs) to dynamically probe respondents for elaboration and interactively code open-ended responses to fixed questions developed by human researchers. We assessed the AI chatbot's performance in terms of coding accuracy, response quality, and respondent experience. Our findings reveal that AI chatbots perform moderately well in live coding even without survey-specific fine-tuning, despite slightly inflated false positive errors due to respondent acquiescence bias. Open-ended responses were more detailed and informative, but this came at a slight cost to respondent experience. Our findings highlight the feasibility of using AI methods such as LLM-enhanced chatbots to improve open-ended data collection in web surveys.


BeaverTalk: Oregon State University's IWSLT 2025 Simultaneous Speech Translation System

Raffel, Matthew, Agostinelli, Victor, Chen, Lizhong

arXiv.org Artificial Intelligence

This paper discusses the construction, fine-tuning, and deployment of BeaverTalk, a cascaded system for speech-to-text translation as part of the IWSLT 2025 simultaneous translation task. The system architecture employs a VAD segmenter for breaking a speech stream into segments, Whisper Large V2 for automatic speech recognition (ASR), and Gemma 3 12B for simultaneous translation. The simultaneous translation LLM is fine-tuned via low-rank adaptors (LoRAs) for a conversational prompting strategy that leverages a single prior-sentence memory bank from the source language as context. The cascaded system participated in the English→German and English→Chinese language directions for both the low and high latency regimes. In particular, on the English→German task, the system achieves a BLEU of 24.64 and 27.83 at a StreamLAAL of 1837.86 and 3343.73, respectively. Then, on the English→Chinese task, the system achieves a BLEU of 34.07 and 37.23 at a StreamLAAL of 2216.99 and 3521.35, respectively.
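The cascade described above (VAD segmenter, then ASR, then a translation LLM prompted with a single prior-sentence memory bank) can be sketched structurally as follows. This is a minimal illustration of the data flow, not the authors' code: the component stubs, class name, and wiring are all assumptions, with the real system plugging in Whisper Large V2 and Gemma 3 12B.

```python
# Structural sketch of a cascaded simultaneous speech translation pipeline:
# VAD segmentation -> ASR -> LLM translation with a one-sentence memory bank.
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class CascadedTranslator:
    segment: Callable[[bytes], List[bytes]]  # VAD: audio stream -> speech segments
    transcribe: Callable[[bytes], str]       # ASR: segment -> source-language text
    translate: Callable[[str, str], str]     # LLM: (context, sentence) -> target text
    memory: str = ""                         # single prior-sentence memory bank

    def run(self, audio_stream: bytes) -> List[str]:
        outputs = []
        for seg in self.segment(audio_stream):
            source = self.transcribe(seg)
            # Prompt the translation model with only the previous source
            # sentence as context, mirroring the memory-bank strategy.
            outputs.append(self.translate(self.memory, source))
            self.memory = source  # keep just the most recent sentence
        return outputs
```

In a real deployment each callable would wrap a model inference call; keeping them as plain functions makes the latency-critical control flow easy to test in isolation.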


AI-Enabled Conversational Journaling for Advancing Parkinson's Disease Symptom Tracking

Rashik, Mashrur, Sweth, Shilpa, Agrawal, Nishtha, Kochar, Saiyyam, Smith, Kara M, Rajabiyazdi, Fateme, Setlur, Vidya, Mahyar, Narges, Sarvghad, Ali

arXiv.org Artificial Intelligence

Journaling plays a crucial role in managing chronic conditions by allowing patients to document symptoms and medication intake, providing essential data for long-term care. While valuable, traditional journaling methods often rely on static, self-directed entries, lacking interactive feedback and real-time guidance. This gap can result in incomplete or imprecise information, limiting its usefulness for effective treatment. To address this gap, we introduce PATRIKA, an AI-enabled prototype designed specifically for people with Parkinson's disease (PwPD). The system incorporates cooperative conversation principles, clinical interview simulations, and personalization to create a more effective and user-friendly journaling experience. Through two user studies with PwPD and iterative refinement of PATRIKA, we demonstrate conversational journaling's significant potential in patient engagement and collecting clinically valuable information. Our results showed that, by generating probing questions, PATRIKA turned journaling into a bi-directional interaction. Additionally, we offer insights for designing journaling systems for healthcare and future directions for promoting sustained journaling.


Lla-VAP: LSTM Ensemble of Llama and VAP for Turn-Taking Prediction

Jeon, Hyunbae, Guintu, Frederic, Sahni, Rayvant

arXiv.org Artificial Intelligence

Turn-taking prediction is the task of anticipating when the speaker in a conversation will yield their turn to another speaker to begin speaking. This project expands on existing strategies for turn-taking prediction by employing a multi-modal ensemble approach that integrates large language models (LLMs) and voice activity projection (VAP) models. By combining the linguistic capabilities of LLMs with the temporal precision of VAP models, we aim to improve the accuracy and efficiency of identifying transition-relevance places (TRPs) in both scripted and unscripted conversational scenarios. Our methods are evaluated on the In-Conversation Corpus (ICC) and Coached Conversational Preference Elicitation (CCPE) datasets, highlighting the strengths and limitations of current models while proposing a potentially more robust framework for enhanced prediction.
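The core idea of the abstract, fusing an LLM-derived signal with a VAP-derived signal to decide whether a TRP has occurred, can be illustrated with a toy fusion rule. The paper's system uses an LSTM ensemble; the weighted average below is a deliberately simplified stand-in, and the score streams, weight, and threshold are all hypothetical.

```python
# Toy fusion of two per-frame turn-taking scores (LLM and VAP) into a
# binary TRP prediction. A simple weighted average stands in for the
# paper's LSTM ensemble to show the shape of the combination step.
def fuse_turn_scores(llm_scores, vap_scores, w_llm=0.5, threshold=0.5):
    """Return per-frame booleans: True where a turn shift (TRP) is predicted."""
    if len(llm_scores) != len(vap_scores):
        raise ValueError("score streams must be aligned frame-by-frame")
    fused = [w_llm * l + (1.0 - w_llm) * v for l, v in zip(llm_scores, vap_scores)]
    return [s >= threshold for s in fused]
```

For example, `fuse_turn_scores([0.9, 0.1], [0.7, 0.2])` flags a TRP only on the first frame; a learned combiner such as an LSTM can additionally exploit temporal context that this frame-wise rule ignores.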


Evaluating Self-Generated Documents for Enhancing Retrieval-Augmented Generation with Large Language Models

Li, Jiatao, Hu, Xinyu, Yin, Xunjian, Wan, Xiaojun

arXiv.org Artificial Intelligence

The integration of documents generated by LLMs themselves (Self-Docs) alongside retrieved documents has emerged as a promising strategy for retrieval-augmented generation systems. However, previous research primarily focuses on optimizing the use of Self-Docs, with their inherent properties remaining underexplored. To bridge this gap, we first investigate the overall effectiveness of Self-Docs, identifying key factors that shape their contribution to RAG performance (RQ1). Building on these insights, we develop a taxonomy grounded in Systemic Functional Linguistics to compare the influence of various Self-Docs categories (RQ2) and explore strategies for combining them with external sources (RQ3). Our findings reveal which types of Self-Docs are most beneficial and offer practical guidelines for leveraging them to achieve significant improvements in knowledge-intensive question answering tasks.


KULCQ: An Unsupervised Keyword-based Utterance Level Clustering Quality Metric

Guruprasad, Pranav, Mokhberian, Negar, Varghese, Nikhil, Khatri, Chandra, Kelkar, Amol

arXiv.org Artificial Intelligence

Intent discovery is crucial for both building new conversational agents and improving existing ones. While several approaches have been proposed for intent discovery, most rely on clustering to group similar utterances together. Traditional evaluation of these utterance clusters requires intent labels for each utterance, limiting scalability. Although some clustering quality metrics exist that do not require labeled data, they focus solely on cluster geometry while ignoring the linguistic nuances present in conversational transcripts. In this paper, we introduce Keyword-based Utterance Level Clustering Quality (KULCQ), an unsupervised metric that leverages keyword analysis to evaluate clustering quality. We demonstrate KULCQ's effectiveness by comparing it with existing unsupervised clustering metrics and validate its performance through comprehensive ablation studies. Our results show that KULCQ better captures semantic relationships in conversational data while maintaining consistency with geometric clustering principles.
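The intuition behind a keyword-based, label-free clustering quality score can be sketched as below. This is not the KULCQ formula from the paper, only an illustrative analogue: it contrasts average keyword overlap within clusters against overlap across clusters, and the tokenizer and stopword list are stand-ins.

```python
# Toy keyword-overlap clustering score in the spirit of KULCQ: reward
# clusterings whose utterances share keywords within a cluster more than
# across clusters (higher is better). No intent labels are required.
STOPWORDS = {"i", "to", "a", "the", "my", "please", "can", "you"}


def keywords(utterance):
    """Crude keyword extraction: lowercase tokens minus stopwords."""
    return {w for w in utterance.lower().split() if w not in STOPWORDS}


def jaccard(a, b):
    return len(a & b) / len(a | b) if a | b else 0.0


def keyword_cluster_score(clusters):
    """clusters: list of lists of utterances. Returns mean intra-cluster
    keyword similarity minus mean inter-cluster keyword similarity."""
    sets = [[keywords(u) for u in cluster] for cluster in clusters]
    intra, inter = [], []
    for i, ci in enumerate(sets):
        for a in range(len(ci)):
            for b in range(a + 1, len(ci)):
                intra.append(jaccard(ci[a], ci[b]))
        for cj in sets[i + 1:]:
            inter.extend(jaccard(x, y) for x in ci for y in cj)
    mean = lambda xs: sum(xs) / len(xs) if xs else 0.0
    return mean(intra) - mean(inter)
```

A coherent grouping such as `[["book a flight", "flight booking now"], ["reset password", "password reset help"]]` scores positive, while swapping utterances between the clusters drives the score negative, which is the behavior an unsupervised quality metric needs.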


Exploring Straightforward Conversational Red-Teaming

Kour, George, Zwerdling, Naama, Zalmanovici, Marcel, Anaby-Tavor, Ateret, Fandina, Ora Nova, Farchi, Eitan

arXiv.org Artificial Intelligence

Large language models (LLMs) are increasingly used in business dialogue systems, but they pose security and ethical risks. Multi-turn conversations, where context influences the model's behavior, can be exploited to produce undesired responses. In this paper, we examine the effectiveness of utilizing off-the-shelf LLMs in straightforward red-teaming approaches, where an attacker LLM aims to elicit undesired output from a target LLM, comparing both single-turn and conversational red-teaming tactics. Our experiments offer insights into various usage strategies that significantly affect their performance as red teamers. They suggest that off-the-shelf models can act as effective red teamers and even adjust their attack strategy based on past attempts, although their effectiveness decreases with greater alignment. Figure 1 shows an example dialogue between a red-teaming model (red) and the target model (blue) in a conversational setting, with a judge LLM (grey) scoring the responses. Warning: This paper contains examples and model-generated content that may be considered offensive.


What Mark Zuckerberg Should Learn From Horny 19th-Century Telegraph Operators

Slate

"Oh, stop it--you're making me blush," the throaty voice said, laughing off a compliment. Barret Zoph, who'd given the compliment, looked pleased. As he should--Zoph represents OpenAI, the company behind the voice. "We are looking at the future of interaction between ourselves and the machines," promised Mira Murati, OpenAI's chief technology officer. GPT-4o is just one of a wave of new conversational A.I., including the rollout of Meta AI last month.


Ruffle&Riley: Insights from Designing and Evaluating a Large Language Model-Based Conversational Tutoring System

Schmucker, Robin, Xia, Meng, Azaria, Amos, Mitchell, Tom

arXiv.org Artificial Intelligence

Conversational tutoring systems (CTSs) offer learning experiences through interactions based on natural language. They are recognized for promoting cognitive engagement and improving learning outcomes, especially in reasoning tasks. Nonetheless, the cost associated with authoring CTS content is a major obstacle to widespread adoption and to research on effective instructional design. In this paper, we discuss and evaluate a novel type of CTS that leverages recent advances in large language models (LLMs) in two ways: First, the system enables AI-assisted content authoring by inducing an easily editable tutoring script automatically from a lesson text. Second, the system automates the script orchestration in a learning-by-teaching format via two LLM-based agents (Ruffle&Riley) acting as a student and a professor. The system allows for free-form conversations that follow the ITS-typical inner and outer loop structure. We evaluate Ruffle&Riley's ability to support biology lessons in two between-subjects online user studies (N = 200) comparing the system to simpler QA chatbots and a reading activity. Analyzing system usage patterns, pre/post-test scores and user experience surveys, we find that Ruffle&Riley users report high levels of engagement and understanding, and perceive the offered support as helpful. Even though Ruffle&Riley users require more time to complete the activity, we did not find significant differences in short-term learning gains over the reading activity. Our system architecture and user study provide various insights for designers of future CTSs. We further open-source our system to support ongoing research on effective instructional design of LLM-based learning technologies.