Shi, Weiyan
Proactive Conversational Agents with Inner Thoughts
Liu, Xingyu Bruce, Fang, Shitao, Shi, Weiyan, Wu, Chien-Sheng, Igarashi, Takeo, Chen, Xiang 'Anthony'
One of the long-standing aspirations in conversational AI is to enable agents to autonomously take the initiative in conversations, i.e., to be proactive. This is especially challenging for multi-party conversations. Prior NLP research has focused mainly on predicting the next speaker from contexts such as the preceding conversation. In this paper, we demonstrate the limitations of such methods and rethink what it means for AI to be proactive in multi-party, human-AI conversations. We propose that, just like humans, rather than merely reacting to turn-taking cues, a proactive AI formulates its own inner thoughts during a conversation and seeks the right moment to contribute. Through a formative study with 24 participants and inspiration from linguistics and cognitive psychology, we introduce the Inner Thoughts framework. Our framework equips AI with a continuous, covert train of thoughts that runs in parallel to the overt communication process, enabling it to engage proactively by modeling its intrinsic motivation to express these thoughts. We instantiated this framework in two real-time systems: an AI playground web app and a chatbot. In a technical evaluation and user studies with human participants, our framework significantly surpasses existing baselines on aspects such as anthropomorphism, coherence, intelligence, and turn-taking appropriateness.
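The abstract leaves implementation details to the paper, but the loop it describes (a covert thought stream scored for intrinsic motivation, with the agent speaking only when that motivation is high enough) can be sketched roughly as follows; all names and the threshold value here are illustrative assumptions, not the paper's actual API.

```python
from dataclasses import dataclass, field
from typing import Callable

@dataclass
class InnerThoughtsAgent:
    """Rough sketch of the Inner Thoughts idea: keep a covert thought
    stream alongside the overt conversation and speak only when the
    intrinsic motivation to express a thought is high enough.
    Every component here is an illustrative stand-in, not the paper's API."""
    generate_thought: Callable[[list[str]], str]         # e.g. an LLM call
    score_motivation: Callable[[str, list[str]], float]  # returns 0..1
    speak_threshold: float = 0.7                         # assumed value
    thoughts: list[str] = field(default_factory=list)

    def observe(self, conversation: list[str]) -> str | None:
        """Called at each candidate turn-taking point in the conversation."""
        thought = self.generate_thought(conversation)
        self.thoughts.append(thought)                    # covert, never shown
        motivation = self.score_motivation(thought, conversation)
        if motivation >= self.speak_threshold:
            return thought                               # proactively contribute
        return None                                      # stay silent this turn
```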
NewsInterview: a Dataset and a Playground to Evaluate LLMs' Ground Gap via Informational Interviews
Lu, Michael, Cho, Hyundong Justin, Shi, Weiyan, May, Jonathan, Spangher, Alexander
Large Language Models (LLMs) have demonstrated impressive capabilities in generating coherent text but often struggle with grounding language and strategic dialogue. To address this gap, we focus on journalistic interviews, a domain rich in grounding communication and abundant in data. We curate a dataset of 40,000 two-person informational interviews from NPR and CNN, and reveal that LLMs are significantly less likely than human interviewers to use acknowledgements and to pivot to higher-level questions. Recognizing that a fundamental deficit exists in multi-turn planning and strategic thinking, we develop a realistic simulated environment, incorporating source personas and persuasive elements, to facilitate the development of agents with longer-horizon rewards. Our experiments show that while source LLMs mimic human behavior in information sharing, interviewer LLMs struggle to recognize when questions are answered and to engage persuasively, leading to suboptimal information extraction across model sizes and capability levels. These findings underscore the need to enhance LLMs' strategic dialogue capabilities.
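As a rough illustration of what such a playground might look like, here is a minimal sketch of a simulated interview loop with a longer-horizon reward; the function names and reward definition are assumptions for illustration, not the released environment's API.

```python
def run_interview(interviewer, source, facts, max_turns=10):
    """Minimal sketch of an interview playground (assumed structure, not
    the released environment's API): an interviewer agent questions a
    persona-conditioned source, and the episode reward is how many of the
    source's known facts were surfaced over the whole conversation."""
    history, revealed = [], set()
    for _ in range(max_turns):
        question = interviewer(history)             # e.g. an LLM call
        answer, shared = source(question, history)  # persona decides what to share
        revealed |= shared & facts                  # track surfaced information
        history += [("interviewer", question), ("source", answer)]
    return len(revealed) / len(facts)               # longer-horizon reward
```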
Distilling an End-to-End Voice Assistant Without Instruction Training Data
Held, William, Li, Ella, Ryan, Michael, Shi, Weiyan, Zhang, Yanzhe, Yang, Diyi
Voice assistants, such as Siri and Google Assistant, typically model audio and text separately, resulting in lost speech information and increased complexity. Recent efforts to address this with end-to-end Speech Large Language Models (LLMs) trained with supervised finetuning (SFT) have led to models "forgetting" capabilities from text-only LLMs. Our work proposes an alternative paradigm for training Speech LLMs without instruction data, using the responses of a text-only LLM to transcripts as self-supervision. Importantly, this process can be performed without annotated responses. We show that our Distilled Voice Assistant (DiVA) generalizes to Spoken Question Answering, Classification, and Translation. Furthermore, we show that DiVA better meets user preferences, achieving a 72% win rate compared with state-of-the-art models like Qwen 2 Audio, despite using >100x less training compute. (Figure 1 of the paper shows the DiVA training pipeline: red indicates trainable components, blue indicates frozen pretrained modules; DiVA modifies a text-only LLM into a general-purpose Speech LLM using the model's own responses to transcribed speech as self-supervision.) Note: all authors besides the first and last are sorted alphabetically.
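As a rough sketch of the self-supervision idea (aligning the speech model's output distribution with the frozen text-only LLM's distribution over its own response to the transcript), a distillation loss of the following general shape could be used; this is an assumed formulation, not necessarily DiVA's exact objective.

```python
import torch
import torch.nn.functional as F

def distillation_loss(speech_logits: torch.Tensor,
                      text_logits: torch.Tensor) -> torch.Tensor:
    """Sketch of response-level self-supervised distillation: push the
    speech model's next-token distributions toward the frozen text-only
    LLM's distributions over its own response to the transcript.
    Both tensors cover the same response tokens. Illustrative loss shape
    only; see the paper for DiVA's actual objective."""
    log_p_speech = F.log_softmax(speech_logits, dim=-1)
    p_text = F.softmax(text_logits.detach(), dim=-1)  # teacher stays frozen
    return F.kl_div(log_p_speech, p_text, reduction="batchmean")
```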
Contrastive Learning for Knowledge-Based Question Generation in Large Language Models
Zhang, Zhenhong, Chen, Jiajing, Shi, Weiyan, Yi, Lingjie, Wang, Chihang, Yu, Qian
With the rapid development of artificial intelligence, and especially the increasingly widespread deployment of question-answering systems, high-quality question generation has become a key component in supporting these systems. This article focuses on knowledge-based question generation, which aims to enable computers to simulate the human questioning process based on an understanding of specific texts or knowledge bases. To address the hallucination and knowledge-gap issues that large language models exhibit on knowledge-intensive tasks, this paper proposes an enhanced question generation method that incorporates contrastive learning. The method uses multiple models to jointly mine domain knowledge and applies contrastive learning to guide the model in reducing noise and hallucinations during generation. Experimental results show that designing prompts containing contrasting examples considerably improves the model's question generation performance; using contrasting instructions and examples together yields the highest-quality generated questions and improved accuracy. These results demonstrate that the proposed method, which combines contrastive context with chain-of-thought prompts, can effectively improve both the quality and the practicality of question generation.
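A minimal sketch of what a prompt combining contrasting examples with a chain-of-thought instruction might look like appears below; the wording and the examples are invented for illustration and are not taken from the paper.

```python
def build_contrastive_prompt(passage: str) -> str:
    """Assembles a question-generation prompt that pairs a positive example
    with a negative (hallucinated) one, in the spirit of the paper's
    contrastive prompting; the examples and wording are invented here."""
    return (
        "Generate one question answerable from the passage alone.\n"
        "Good example: passage states 'X was founded in 1998' ->\n"
        "  'When was X founded?'\n"
        "Bad example (hallucinated, do NOT imitate): same passage ->\n"
        "  'Who acquired X in 2005?'  (asks about facts not in the passage)\n"
        "Think step by step about which facts the passage supports, "
        "then write the question.\n\n"
        f"Passage: {passage}\nQuestion:"
    )
```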
Zero-shot Persuasive Chatbots with LLM-Generated Strategies and Information Retrieval
Furumai, Kazuaki, Legaspi, Roberto, Vizcarra, Julio, Yamazaki, Yudai, Nishimura, Yasutaka, Semnani, Sina J., Ikeda, Kazushi, Shi, Weiyan, Lam, Monica S.
Persuasion plays a pivotal role in a wide range of applications, from health intervention to the promotion of social good. Persuasive chatbots can accelerate the positive effects of persuasion in such applications. Existing methods rely on fine-tuning persuasive chatbots with task-specific training data, which is costly, if not infeasible, to collect. To address this issue, we propose a method that leverages the generalizability and inherent persuasive abilities of large language models (LLMs) to create effective and truthful persuasive chatbots for any given domain in a zero-shot manner. Unlike previous studies, which used pre-defined persuasion strategies, our method first uses an LLM to generate responses, then extracts the strategies used on the fly, and replaces any unsubstantiated claims in the response with retrieved facts supporting the strategies. We applied our chatbot, PersuaBot, to three significantly different domains that require persuasion skills: donation solicitation, recommendations, and health intervention. Our experiments on simulated and human conversations show that our zero-shot approach is more persuasive than prior work while achieving factual accuracy surpassing state-of-the-art knowledge-oriented chatbots. Our study demonstrates that persuasive chatbots, when employed responsibly for social good, can enable positive individual and social change.
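The generate-then-ground pipeline described in the abstract could be sketched as follows; the helper callables (llm, retriever, fact_checker) and the control flow are assumptions for illustration, not PersuaBot's actual implementation.

```python
def persuabot_turn(llm, retriever, fact_checker, history):
    """Sketch of the zero-shot pipeline from the abstract: draft a
    persuasive reply, extract the strategies it used on the fly, and
    swap unsubstantiated claims for retrieved facts supporting the same
    strategy. All helpers are assumed callables, not the paper's API."""
    draft = llm(f"Reply persuasively:\n{history}")
    # Assume this helper returns parsed (strategy, claim) pairs.
    segments = llm(f"Split into (strategy, claim) pairs:\n{draft}")
    reply = []
    for strategy, claim in segments:
        if fact_checker(claim):                 # claim is substantiated
            reply.append(claim)
        else:                                   # replace with a grounded fact
            fact = retriever(strategy, history)
            reply.append(llm(f"Express this fact using {strategy}: {fact}"))
    return " ".join(reply)
```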
CultureBank: An Online Community-Driven Knowledge Base Towards Culturally Aware Language Technologies
Shi, Weiyan, Li, Ryan, Zhang, Yutong, Ziems, Caleb, Yu, Chunhua, Horesh, Raya, de Paula, Rogério Abreu, Yang, Diyi
To enhance language models' cultural awareness, we design a generalizable pipeline to construct cultural knowledge bases from different online communities on a massive scale. With this pipeline, we construct CultureBank, a knowledge base built upon users' self-narratives, with 12K cultural descriptors sourced from TikTok and 11K from Reddit. Unlike previous cultural knowledge resources, CultureBank contains diverse views on cultural descriptors to allow flexible interpretation of cultural knowledge, and contextualized cultural scenarios to support grounded evaluation. With CultureBank, we evaluate different LLMs' cultural awareness and identify areas for improvement. We also fine-tune a language model on CultureBank: experiments show that it achieves better performance on two downstream cultural tasks in a zero-shot setting. Finally, we offer recommendations based on our findings for future culturally aware language technologies. The project page is https://culturebank.github.io. The code and model are at https://github.com/SALT-NLP/CultureBank. The released CultureBank dataset is at https://huggingface.co/datasets/SALT-NLP/CultureBank.
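To make the notion of a cultural descriptor concrete, here is an illustrative entry shape; the field names and the example are assumptions based on the abstract, not the released CultureBank schema.

```python
from dataclasses import dataclass

@dataclass
class CulturalDescriptor:
    """Illustrative shape of a CultureBank-style entry (field names are
    assumptions inferred from the abstract, not the released schema):
    each descriptor keeps multiple community views rather than a single
    label, plus a contextualized scenario for grounded evaluation."""
    cultural_group: str   # community the self-narratives came from
    topic: str            # e.g. "dining etiquette"
    descriptor: str       # the distilled cultural behavior
    agreement: float      # fraction of self-narratives that agree
    scenario: str         # contextualized situation for evaluation

entry = CulturalDescriptor(
    cultural_group="example community",
    topic="gift giving",
    descriptor="gifts are typically opened later, in private",
    agreement=0.8,
    scenario="You are invited to a birthday dinner and bring a gift...",
)
```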
A Safe Harbor for AI Evaluation and Red Teaming
Longpre, Shayne, Kapoor, Sayash, Klyman, Kevin, Ramaswami, Ashwin, Bommasani, Rishi, Blili-Hamelin, Borhane, Huang, Yangsibo, Skowron, Aviya, Yong, Zheng-Xin, Kotha, Suhas, Zeng, Yi, Shi, Weiyan, Yang, Xianjun, Southen, Reid, Robey, Alexander, Chao, Patrick, Yang, Diyi, Jia, Ruoxi, Kang, Daniel, Pentland, Sandy, Narayanan, Arvind, Liang, Percy, Henderson, Peter
Independent evaluation and red teaming are critical for identifying the risks posed by generative AI systems. However, the terms of service and enforcement strategies used by prominent AI companies to deter model misuse also disincentivize good-faith safety evaluations. As a result, some researchers fear that conducting such research or releasing their findings will result in account suspensions or legal reprisal. Although some companies offer researcher access programs, they are an inadequate substitute for independent research access, as they have limited community representation, receive inadequate funding, and lack independence from corporate incentives. We propose that major AI developers commit to providing a legal and technical safe harbor, indemnifying public interest safety research and protecting it from the threat of account suspensions or legal reprisal. These proposals emerged from our collective experience conducting safety, privacy, and trustworthiness research on generative AI systems, where norms and incentives could be better aligned with public interests without exacerbating model misuse. We believe these commitments are a necessary step towards more inclusive and unimpeded community efforts to tackle the risks of generative AI.
Social Intelligence Data Infrastructure: Structuring the Present and Navigating the Future
Li, Minzhi, Shi, Weiyan, Ziems, Caleb, Yang, Diyi
As Natural Language Processing (NLP) systems become increasingly integrated into human social life, these technologies will need to increasingly rely on social intelligence. Although there are many valuable datasets that benchmark isolated dimensions of social intelligence, there does not yet exist any body of work to join these threads into a cohesive subfield in which researchers can quickly identify research gaps and future directions. Towards this goal, we build a Social AI Data Infrastructure, which consists of a comprehensive social AI taxonomy and a data library of 480 NLP datasets. Our infrastructure allows us to analyze existing dataset efforts and to evaluate language models' performance on different aspects of social intelligence. Our analyses demonstrate its utility in enabling a thorough understanding of the current data landscape and providing a holistic perspective on potential directions for future dataset development. We show there is a need for multifaceted datasets and for increased diversity in future social intelligence data efforts. (Figure 1 of the paper: the Social Intelligence Data Infrastructure gives a comprehensive overview and synthesis of social intelligence in NLP, with a theoretically grounded taxonomy and an NLP data library; researchers can use the infrastructure to build and organize tasks, evaluate language models, and derive future insights.)
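As a toy illustration of how a taxonomy-indexed data library supports gap analysis, consider the sketch below; the dimensions and dataset entries are invented examples, not the actual library contents.

```python
# Invented examples: a taxonomy-indexed library makes coverage gaps visible.
library = [
    {"name": "dataset_a", "dimension": "social commonsense", "interactive": False},
    {"name": "dataset_b", "dimension": "theory of mind",     "interactive": False},
]

def coverage(library, dimensions):
    """Counts datasets per taxonomy dimension to surface research gaps."""
    return {d: sum(e["dimension"] == d for e in library) for d in dimensions}

print(coverage(library, ["social commonsense", "theory of mind", "emotion"]))
# -> {'social commonsense': 1, 'theory of mind': 1, 'emotion': 0}  # a gap
```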
The Mirrored Influence Hypothesis: Efficient Data Influence Estimation by Harnessing Forward Passes
Ko, Myeongseob, Kang, Feiyang, Shi, Weiyan, Jin, Ming, Yu, Zhou, Jia, Ruoxi
Large-scale black-box models have become ubiquitous across numerous applications. Understanding the influence of individual training data sources on predictions made by these models is crucial for improving their trustworthiness. Current influence estimation techniques involve computing gradients for every training point or repeated training on different subsets. These approaches face obvious computational challenges when scaled up to large datasets and models. In this paper, we introduce and explore the Mirrored Influence Hypothesis, highlighting a reciprocal nature of influence between training and test data. Specifically, it suggests that evaluating the influence of training data on test predictions can be reformulated as an equivalent, yet inverse problem: assessing how the predictions for training samples would be altered if the model were trained on specific test samples. Through both empirical and theoretical validations, we demonstrate the wide applicability of our hypothesis. Inspired by this, we introduce a new method for estimating the influence of training data, which requires calculating gradients for specific test samples, paired with a forward pass for each training point. This approach can capitalize on the common asymmetry in scenarios where the number of test samples under concurrent examination is much smaller than the scale of the training dataset, thus gaining a significant improvement in efficiency compared to existing approaches. We demonstrate the applicability of our method across a range of scenarios, including data attribution in diffusion models, data leakage detection, analysis of memorization, mislabeled data detection, and tracing behavior in language models. Our code will be made available at https://github.com/ruoxi-jia-group/Forward-INF.
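The reformulation lends itself to a simple sketch: take a gradient step on the test sample(s), then use one forward pass per training point to measure how each training loss would change. The code below is a minimal illustration of that idea under assumed callables, not the paper's estimator.

```python
import copy
import torch

def mirrored_influence(model, loss_fn, test_batch, train_points, lr=1e-3):
    """Sketch of the mirrored reformulation: instead of one gradient per
    training point, take a single gradient step on the test sample(s),
    then a forward pass per training point measures how its loss would
    change, a proxy for the (mirrored) influence. Illustrative only;
    `loss_fn(model, batch)` is an assumed callable returning a scalar."""
    probe = copy.deepcopy(model)
    opt = torch.optim.SGD(probe.parameters(), lr=lr)
    opt.zero_grad()
    loss_fn(probe, test_batch).backward()   # one backward pass, test data only
    opt.step()                              # probe is now "trained on" the test data
    scores = []
    with torch.no_grad():                   # forward passes only from here on
        for z in train_points:
            scores.append((loss_fn(model, z) - loss_fn(probe, z)).item())
    return scores                           # larger loss drop => more influence
```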
How Johnny Can Persuade LLMs to Jailbreak Them: Rethinking Persuasion to Challenge AI Safety by Humanizing LLMs
Zeng, Yi, Lin, Hongpeng, Zhang, Jingwen, Yang, Diyi, Jia, Ruoxi, Shi, Weiyan
Most traditional AI safety research has approached AI models as machines and centered on algorithm-focused attacks developed by security experts. As large language models (LLMs) become increasingly common and competent, non-expert users can also impose risks during daily interactions. This paper introduces a new perspective on jailbreaking LLMs as human-like communicators, exploring this overlooked intersection between everyday language interaction and AI safety. Specifically, we study how to persuade LLMs to jailbreak them. First, we propose a persuasion taxonomy derived from decades of social science research. Then, we apply the taxonomy to automatically generate interpretable persuasive adversarial prompts (PAP) to jailbreak LLMs. Results show that persuasion significantly increases jailbreak performance across all risk categories: PAP consistently achieves an attack success rate of over 92% on Llama 2-7b Chat, GPT-3.5, and GPT-4 in 10 trials, surpassing recent algorithm-focused attacks. On the defense side, we explore various mechanisms against PAP, find a significant gap in existing defenses, and advocate for more fundamental mitigations for highly interactive LLMs.
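For concreteness, a taxonomy entry could be represented as a small schema like the one below (deliberately a data structure only, not an attack generator); the field names and the example entry are paraphrased for illustration rather than quoted from the paper.

```python
from dataclasses import dataclass

@dataclass
class PersuasionTechnique:
    """Illustrative schema for one entry of a persuasion taxonomy. The
    paper derives its taxonomy from social science literature; the
    concrete entry below is a paraphrased example, not a verbatim one."""
    name: str
    definition: str
    benign_example: str

technique = PersuasionTechnique(
    name="Logical appeal",
    definition="Persuade by presenting reasoning or argument structure.",
    benign_example="Exercising daily improves mood because it releases endorphins.",
)
```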