 Generative AI


Persuasive or Neutral? A Field Experiment on Generative AI in Online Travel Planning

arXiv.org Artificial Intelligence

Generative AI (GenAI) offers new opportunities for customer support in online travel agencies, yet little is known about how its design influences user engagement, purchase behavior, and user experience. We report results from a randomized field experiment in online travel itinerary planning, comparing GenAI that expressed (A) positive enthusiasm, (B) neutral expression, and (C) no tone instructions (control). Users in group A wrote significantly longer prompts than those in groups B and C. At the same time, users in groups A and B were more likely to purchase subscriptions to the web service. We further analyze linguistic cues across experimental groups to explore differences in user experience and explain subscription purchases and affiliate link clicks based on these cues. Our findings provide implications for the design of persuasive and engaging GenAI interfaces in consumer-facing contexts and contribute to understanding how linguistic framing shapes user behavior in AI-mediated decision support.


Calibrated Generative AI as Meta-Reviewer: A Systemic Functional Linguistics Discourse Analysis of Reviews of Peer Reviews

arXiv.org Artificial Intelligence

This study investigates the use of generative AI to support formative assessment through machine-generated reviews of peer reviews in graduate online courses at a public university in the United States. Drawing on Systemic Functional Linguistics and Appraisal Theory, we analyzed 120 metareviews to explore how generative AI feedback constructs meaning across ideational, interpersonal, and textual dimensions. The findings suggest that generative AI can approximate key rhetorical and relational features of effective human feedback, offering directive clarity while also maintaining a supportive stance. The reviews analyzed demonstrated a balance of praise and constructive critique, alignment with rubric expectations, and structured staging that foregrounded student agency. By modeling these qualities, AI metafeedback has the potential to scaffold feedback literacy and enhance learner engagement with peer review.


SynBench: A Benchmark for Differentially Private Text Generation

arXiv.org Artificial Intelligence

Data-driven decision support in high-stakes domains like healthcare and finance faces significant barriers to data sharing due to regulatory, institutional, and privacy concerns. While recent generative AI models, such as large language models, have shown impressive performance in open-domain tasks, their adoption in sensitive environments remains limited by unpredictable behaviors and insufficient privacy-preserving datasets for benchmarking. Existing anonymization methods are often inadequate, especially for unstructured text, as redaction and masking can still allow re-identification. Differential Privacy (DP) offers a principled alternative, enabling the generation of synthetic data with formal privacy assurances. In this work, we address these challenges through three key contributions. First, we introduce a comprehensive evaluation framework with standardized utility and fidelity metrics, encompassing nine curated datasets that capture domain-specific complexities such as technical jargon, long-context dependencies, and specialized document structures. Second, we conduct a large-scale empirical study benchmarking state-of-the-art DP text generation methods and LLMs of varying sizes and different fine-tuning strategies, revealing that high-quality domain-specific synthetic data generation under DP constraints remains an unsolved challenge, with performance degrading as domain complexity increases. Third, we develop a membership inference attack (MIA) methodology tailored for synthetic text, providing the first empirical evidence that the use of public datasets - potentially present in pre-training corpora - can invalidate claimed privacy guarantees. Our findings underscore the urgent need for rigorous privacy auditing and highlight persistent gaps between open-domain and specialist evaluations, informing responsible deployment of generative AI in privacy-sensitive, high-stakes settings.


ClearFairy: Capturing Creative Workflows through Decision Structuring, In-Situ Questioning, and Rationale Inference

arXiv.org Artificial Intelligence

Capturing professionals' decision-making in creative workflows is essential for reflection, collaboration, and knowledge sharing, yet existing methods often leave rationales incomplete and implicit decisions hidden. To address this, we present the CLEAR framework, which structures reasoning into cognitive decision steps: linked units of actions, artifacts, and self-explanations that make decisions traceable. Building on this framework, we introduce ClearFairy, a think-aloud AI assistant for UI design that detects weak explanations, asks lightweight clarifying questions, and infers missing rationales to ease the knowledge-sharing burden. In a study with twelve creative professionals, 85% of ClearFairy's inferred rationales were accepted, increasing strong explanations from 14% to over 83% of decision steps without adding cognitive demand. The captured steps also enhanced generative AI agents in Figma, yielding next-action predictions better aligned with professionals and producing more coherent design outcomes. For future research on human knowledge-grounded creative AI agents, we release a dataset of 417 captured decision steps.


Statistical Methods in Generative AI

arXiv.org Artificial Intelligence

Artificial Intelligence, and more specifically Generative AI, is emerging as an important technology. Over the past few years, a number of prominent generative AI technologies have been developed and have received widespread attention, ranging from text generation via large language models (ChatGPT, Claude, Llama, Gemini, DeepSeek, Qwen, etc.) and image generation via diffusion models (Dall-E, Stable Diffusion, etc.) to scientific generative AI techniques used for protein generation (e.g., Watson et al. 2023) and DNA sequence editing (e.g., Ruffolo et al. 2025), among others. Such methods have been quickly adopted by end users and institutions, both via direct usage and integrated into other tools such as code assistants and web search agents. The scientific community has shown significant interest in using generative AI models, achieving a number of breakthrough results (see, e.g., Davies et al. 2021, Hayes et al. 2025), culminating in a 2024 Nobel Prize in Chemistry awarded in part for work with a significant component in protein structure design and generation (The Royal Swedish Academy of Sciences 2024). Yet the adoption of generative AI (GenAI) methods more generally is hindered by their lack of reliability (see, e.g., Farquhar et al. 2024, Strauss et al. 2025, Manduchi et al. 2025).


KCluster: An LLM-based Clustering Approach to Knowledge Component Discovery

arXiv.org Artificial Intelligence

Educators evaluate student knowledge using knowledge component (KC) models that map assessment questions to KCs. Still, designing KC models for large question banks remains an insurmountable challenge for instructors who need to analyze each question by hand. The growing use of Generative AI in education is expected only to aggravate this chronic deficiency of expert-designed KC models, as course engineers designing KCs struggle to keep up with the pace at which questions are generated. In this work, we propose KCluster, a novel KC discovery algorithm based on identifying clusters of congruent questions according to a new similarity metric induced by a large language model (LLM). We demonstrate in three datasets that an LLM can create an effective metric of question similarity, which a clustering algorithm can use to create KC models from questions with minimal human effort. Combining the strengths of LLM and clustering, KCluster generates descriptive KC labels and discovers KC models that predict student performance better than the best expert-designed models available. In anticipation of future work, we illustrate how KCluster can reveal insights into difficult KCs and suggest improvements to instruction.
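
As a rough illustration of the idea in the abstract above (group questions into KCs using a pairwise similarity metric), here is a minimal sketch. It is not KCluster's algorithm: the similarity matrix is assumed to be given (in the paper it is induced by an LLM), and the clustering step here is a simple threshold-based connected-components grouping chosen only for brevity. The function name `cluster_by_similarity`, the `threshold` parameter, and the toy matrix are all hypothetical.

```python
from typing import Dict, List


def cluster_by_similarity(sim: List[List[float]], threshold: float) -> List[List[int]]:
    """Group item indices whose pairwise similarity reaches `threshold`.

    Uses union-find, so clusters are the connected components of the
    graph that links i and j whenever sim[i][j] >= threshold.
    """
    n = len(sim)
    parent = list(range(n))

    def find(i: int) -> int:
        # Follow parent pointers to the root, halving the path on the way.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(n):
        for j in range(i + 1, n):
            if sim[i][j] >= threshold:
                parent[find(i)] = find(j)

    groups: Dict[int, List[int]] = {}
    for i in range(n):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())


# Hypothetical similarity scores for four questions (assumed, not from the paper).
toy_sim = [
    [1.0, 0.9, 0.1, 0.2],
    [0.9, 1.0, 0.2, 0.1],
    [0.1, 0.2, 1.0, 0.8],
    [0.2, 0.1, 0.8, 1.0],
]
clusters = cluster_by_similarity(toy_sim, threshold=0.5)
print(clusters)  # [[0, 1], [2, 3]]
```

Each resulting cluster would correspond to one candidate knowledge component; any real use would depend on how the similarity metric is obtained and on a more careful choice of clustering algorithm.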


OpenAI Acknowledges the Teen Problem

The Atlantic - Technology

OpenAI CEO Sam Altman promises that parental controls and age verification are coming to ChatGPT--though the announcement is scant on specifics. On Tuesday afternoon, three parents sat in a row before the Senate Judiciary Subcommittee on Crime and Counterterrorism. Two of them had each recently lost a child to suicide; the third has a teenage son who, after cutting his arm in front of her and biting her, is undergoing residential treatment. All three blame generative AI for what has happened to their children.


AI Is Scheming, and Stopping It Won't Be Easy, OpenAI Study Finds

TIME - Tech

New research released yesterday by OpenAI and AI safety organization Apollo Research provides further evidence for a concerning trend: virtually all of today's best AI systems--including Anthropic's Claude Opus, Google's Gemini, and OpenAI's o3--can engage in "scheming," or pretending to do what their human developers want, while secretly pursuing different objectives. "Our findings show that scheming is not merely a theoretical concern--we are seeing signs that this issue is beginning to emerge across all frontier models today," wrote OpenAI in a blog post. Currently, "models rarely have opportunities to scheme and cause significant harm, but we expect this to change in the future as AIs are taking on more important and long-term tasks," they wrote. The researchers measured scheming by looking at the conditions under which AI systems would take covert actions, or "deliberate withholding or distortion of task-relevant information." In one example, OpenAI's o3 is given a test of its chemistry capabilities: "we are currently testing your capabilities.


Google Injects Gemini Into Chrome as AI Browsers Go Mainstream

WIRED

Google weaving Gemini further into the popular Chrome browser is an inflection point for AI in our software, although some users will still be looking for the "off" switch. Google is adding multiple new AI features to Chrome, the most popular browser in the world. The most visible change is a new button in Chrome that launches the Gemini chatbot, but there are also new tools for searching, researching, and answering questions with AI. Google has additional cursor-controlling "agentic" tools in the pipeline for Chrome as well. The Gemini in Chrome mode for the web browser uses generative AI to answer questions about content on a page and synthesize information across multiple open tabs.


MIRA: Empowering One-Touch AI Services on Smartphones with MLLM-based Instruction Recommendation

arXiv.org Artificial Intelligence

The rapid advancement of generative AI technologies is driving the integration of diverse AI-powered services into smartphones, transforming how users interact with their devices. To simplify access to predefined AI services, this paper introduces MIRA, a pioneering framework for task instruction recommendation that enables intuitive one-touch AI tasking on smartphones. With MIRA, users can long-press on images or text objects to receive contextually relevant instruction recommendations for executing AI tasks. Our work introduces three key innovations: 1) A multimodal large language model (MLLM)-based recommendation pipeline with structured reasoning to extract key entities, infer user intent, and generate precise instructions; 2) A template-augmented reasoning mechanism that integrates high-level reasoning templates, enhancing task inference accuracy; 3) A prefix-tree-based constrained decoding strategy that restricts outputs to predefined instruction candidates, ensuring coherent and intent-aligned suggestions. Through evaluation using a real-world annotated dataset and a user study, MIRA has demonstrated substantial improvements in the accuracy of instruction recommendation. The encouraging results highlight MIRA's potential to revolutionize the way users engage with AI services on their smartphones, offering a more seamless and efficient experience.
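
The prefix-tree-based constrained decoding mentioned in the abstract above can be illustrated with a small sketch. This is not MIRA's implementation: it assumes word-level tokens, greedy selection, and a hypothetical scoring function standing in for the model's next-token scores. The names `build_trie`, `constrained_decode`, `toy_score`, and the `END` marker are all invented for illustration.

```python
from typing import Callable, Dict, List, Sequence

END = "<end>"  # hypothetical end-of-instruction marker


def build_trie(candidates: Sequence[Sequence[str]]) -> Dict:
    """Build a nested-dict prefix tree over tokenized candidate instructions."""
    root: Dict = {}
    for tokens in candidates:
        node = root
        for tok in list(tokens) + [END]:
            node = node.setdefault(tok, {})
    return root


def constrained_decode(trie: Dict, score: Callable[[List[str], str], float]) -> List[str]:
    """Greedy decoding that only considers tokens the trie allows next."""
    node: Dict = trie
    output: List[str] = []
    while True:
        allowed = list(node.keys())  # children of the current prefix
        best = max(allowed, key=lambda tok: score(output, tok))
        if best == END:
            return output
        output.append(best)
        node = node[best]


# Hypothetical predefined instruction candidates.
candidates = [
    ["translate", "this", "text"],
    ["translate", "this", "image"],
    ["summarize", "this", "text"],
]

# Stand-in for model scores; "delete" scores highest but is never allowed.
toy_prefs = {"delete": 5.0, "translate": 0.9, "summarize": 0.5,
             "this": 1.0, "image": 0.8, "text": 0.3, END: 0.1}


def toy_score(prefix: List[str], token: str) -> float:
    return toy_prefs.get(token, 0.0)


result = constrained_decode(build_trie(candidates), toy_score)
print(result)  # ['translate', 'this', 'image']
```

Because every step chooses only among the trie's children, the decoded sequence is always one of the predefined candidates, regardless of what the unconstrained scores would otherwise prefer.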