
Collaborating Authors

 Yang, Diyi


Identifying and Mitigating the Security Risks of Generative AI

arXiv.org Artificial Intelligence

Every major technical invention resurfaces the dual-use dilemma -- the new technology has the potential to be used for good as well as for harm. Generative AI (GenAI) techniques, such as large language models (LLMs) and diffusion models, have shown remarkable capabilities (e.g., in-context learning, code-completion, and text-to-image generation and editing). However, GenAI can be used just as well by attackers to generate new attacks and increase the velocity and efficacy of existing attacks. This paper reports the findings of a workshop held at Google (co-organized by Stanford University and the University of Wisconsin-Madison) on the dual-use dilemma posed by GenAI. This paper is not meant to be comprehensive, but is rather an attempt to synthesize some of the interesting findings from the workshop. We discuss short-term and long-term goals for the community on this topic. We hope this paper provides both a launching point for a discussion on this important topic as well as interesting problems that the research community can work to address.


Can Large Language Models Transform Computational Social Science?

arXiv.org Artificial Intelligence

Large Language Models (LLMs) are capable of successfully performing many language processing tasks zero-shot (without training data). If zero-shot LLMs can also reliably classify and explain social phenomena like persuasiveness and political ideology, then LLMs could augment the Computational Social Science (CSS) pipeline in important ways. This work provides a road map for using LLMs as CSS tools. Towards this end, we contribute a set of prompting best practices and an extensive evaluation pipeline to measure the zero-shot performance of 13 language models on 25 representative English CSS benchmarks. On taxonomic labeling tasks (classification), LLMs fail to outperform the best fine-tuned models but still achieve fair levels of agreement with humans. On free-form coding tasks (generation), LLMs produce explanations that often exceed the quality of crowdworkers' gold references. We conclude that the performance of today's LLMs can augment the CSS research pipeline in two ways: (1) serving as zero-shot data annotators on human annotation teams, and (2) bootstrapping challenging creative generation tasks (e.g., explaining the underlying attributes of a text). In summary, LLMs are poised to meaningfully participate in social science analysis in partnership with humans.
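To make the annotator-style use concrete, here is a minimal sketch of zero-shot labeling with an LLM, assuming a hypothetical `query_llm` stand-in for any chat/completion API and an illustrative persuasiveness label set; the paper's actual prompting best practices and benchmarks are more extensive.

```python
# Minimal sketch of using an LLM as a zero-shot annotator for a CSS labeling task.
# `query_llm` is a hypothetical placeholder, not a real API; the prompt template
# and label set are illustrative, not the paper's exact protocol.

LABELS = ["persuasive", "not persuasive"]

def build_prompt(text: str) -> str:
    return (
        "Label the following argument as one of: " + ", ".join(LABELS) + ".\n"
        f"Argument: {text}\n"
        "Answer with the label only."
    )

def query_llm(prompt: str) -> str:
    # Placeholder: replace with a real call to the model under evaluation.
    return "persuasive"

def annotate(texts: list[str]) -> list[str]:
    labels = []
    for text in texts:
        raw = query_llm(build_prompt(text)).strip().lower()
        # Fall back to the first label if the model answers out of vocabulary.
        labels.append(raw if raw in LABELS else LABELS[0])
    return labels

if __name__ == "__main__":
    print(annotate(["Voting matters because every ballot shifts the margin."]))
```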


DADA: Dialect Adaptation via Dynamic Aggregation of Linguistic Rules

arXiv.org Artificial Intelligence

Existing large language models (LLMs), which mainly focus on Standard American English (SAE), often perform significantly worse when applied to other English dialects. While existing mitigations tackle discrepancies for individual target dialects, they assume access to high-accuracy dialect identification systems. The boundaries between dialects are inherently flexible, making it difficult to categorize language into discrete predefined categories. In this paper, we propose DADA (Dialect Adaptation via Dynamic Aggregation), a modular approach to imbue SAE-trained models with multi-dialectal robustness by composing adapters that handle specific linguistic features. The compositional architecture of DADA allows for both targeted adaptation to specific dialect variants and simultaneous adaptation to various dialects. We show that DADA is effective for both single-task and instruction-finetuned language models, offering an extensible and interpretable framework for adapting existing LLMs to different English dialects.
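A toy sketch of the dynamic-aggregation idea: bottleneck adapters for individual linguistic features are combined through a per-token softmax gate. The shapes, the gating mechanism, and the number of features are illustrative assumptions, not DADA's exact architecture.

```python
# Toy sketch of composing linguistic-feature adapters, loosely in the spirit of DADA.
import torch
import torch.nn as nn

class BottleneckAdapter(nn.Module):
    def __init__(self, d_model: int, d_bottleneck: int = 16):
        super().__init__()
        self.down = nn.Linear(d_model, d_bottleneck)
        self.up = nn.Linear(d_bottleneck, d_model)

    def forward(self, h):
        return self.up(torch.relu(self.down(h)))  # residual added by the caller

class DynamicAggregation(nn.Module):
    """Weights each feature adapter's output per token and adds it to the hidden state."""
    def __init__(self, d_model: int, n_features: int):
        super().__init__()
        self.adapters = nn.ModuleList([BottleneckAdapter(d_model) for _ in range(n_features)])
        self.gate = nn.Linear(d_model, n_features)

    def forward(self, h):  # h: (batch, seq, d_model)
        weights = torch.softmax(self.gate(h), dim=-1)               # (batch, seq, n_features)
        outputs = torch.stack([a(h) for a in self.adapters], dim=-1)  # (batch, seq, d_model, n_features)
        return h + (outputs * weights.unsqueeze(-2)).sum(-1)

h = torch.randn(2, 5, 64)
print(DynamicAggregation(64, n_features=4)(h).shape)  # torch.Size([2, 5, 64])
```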


Is ChatGPT a General-Purpose Natural Language Processing Task Solver?

arXiv.org Artificial Intelligence

Spurred by advancements in scale, large language models (LLMs) have demonstrated the ability to perform a variety of natural language processing (NLP) tasks zero-shot -- i.e., without adaptation on downstream data. Recently, the debut of ChatGPT has drawn a great deal of attention from the NLP community because it can generate high-quality responses to human input and self-correct previous mistakes based on subsequent conversations. However, it is not yet known whether ChatGPT can serve as a generalist model that performs many NLP tasks zero-shot. In this work, we empirically analyze the zero-shot learning ability of ChatGPT by evaluating it on 20 popular NLP datasets covering 7 representative task categories. With extensive empirical studies, we demonstrate both the effectiveness and limitations of the current version of ChatGPT. We find that ChatGPT performs well on many tasks favoring reasoning capabilities (e.g., arithmetic reasoning), while it still faces challenges on specific tasks such as sequence tagging. We additionally provide in-depth analysis through qualitative case studies.
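A minimal sketch of the kind of zero-shot evaluation loop such a study requires, with a hypothetical `chat` function standing in for the ChatGPT API and a tiny in-memory dataset in place of the 20 benchmarks.

```python
# Minimal sketch of a zero-shot evaluation loop over labeled datasets.
# `chat` is a hypothetical placeholder for a ChatGPT-style API call.

def chat(prompt: str) -> str:
    return "positive"  # placeholder response

def zero_shot_accuracy(examples, labels, template):
    correct = 0
    for text, gold in examples:
        pred = chat(template.format(text=text, labels=", ".join(labels))).strip().lower()
        correct += int(pred == gold)
    return correct / len(examples)

datasets = {
    "sentiment": ([("A delightful film.", "positive")], ["positive", "negative"]),
}
template = "Classify the text into one of: {labels}.\nText: {text}\nLabel:"
for name, (examples, labels) in datasets.items():
    print(name, zero_shot_accuracy(examples, labels, template))
```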


From Scroll to Misbelief: Modeling the Unobservable Susceptibility to Misinformation on Social Media

arXiv.org Artificial Intelligence

Susceptibility to misinformation describes the extent to which people believe unverifiable claims, a disposition hidden in people's mental processes and infeasible to observe directly. Existing susceptibility studies rely heavily on self-reported beliefs, making downstream applications of susceptibility hard to scale. To address these limitations, we propose a computational model that infers users' susceptibility levels from their activities. Since a user's susceptibility is a key indicator of their reposting behavior, we use supervision from observable sharing behavior to infer the underlying susceptibility tendency. Our evaluation shows that the model yields estimations that are highly aligned with human judgments of users' relative susceptibility levels. Building on this large-scale susceptibility labeling, we further conduct a comprehensive analysis of how different social factors relate to susceptibility. We find that political leanings and psychological factors are associated with susceptibility to varying degrees.
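One way to picture the weak-supervision setup: a latent susceptibility score is learned only through its effect on repost prediction. The sketch below is an illustrative assumption about the architecture, not the paper's exact model.

```python
# Toy sketch of inferring a latent susceptibility score from observable reposting behavior.
import torch
import torch.nn as nn

class SusceptibilityModel(nn.Module):
    def __init__(self, d_user: int, d_content: int):
        super().__init__()
        # Scalar latent susceptibility score per user.
        self.user_encoder = nn.Sequential(nn.Linear(d_user, 32), nn.ReLU(), nn.Linear(32, 1))
        # Repost prediction conditioned on content features and the latent score.
        self.repost_head = nn.Linear(d_content + 1, 1)

    def forward(self, user_feats, content_feats):
        susceptibility = self.user_encoder(user_feats)  # (batch, 1)
        logit = self.repost_head(torch.cat([content_feats, susceptibility], dim=-1))
        return susceptibility, logit

model = SusceptibilityModel(d_user=16, d_content=8)
users, posts = torch.randn(4, 16), torch.randn(4, 8)
reposted = torch.randint(0, 2, (4, 1)).float()  # observed sharing behavior
_, logit = model(users, posts)
loss = nn.functional.binary_cross_entropy_with_logits(logit, reposted)
loss.backward()  # supervision flows from reposts into the latent susceptibility score
```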


Grounding or Guesswork? Large Language Models are Presumptive Grounders

arXiv.org Artificial Intelligence

Effective conversation requires common ground: a shared understanding between the participants. Common ground, however, does not emerge spontaneously in conversation. Speakers and listeners work together to both identify and construct a shared basis while avoiding misunderstanding. To accomplish grounding, humans rely on a range of dialogue acts, like clarification (What do you mean?) and acknowledgment (I understand.). In domains like teaching and emotional support, carefully constructed grounding prevents misunderstanding. However, it is unclear whether large language models (LLMs) leverage these dialogue acts in constructing common ground. To this end, we curate a set of grounding acts and propose corresponding metrics that quantify attempted grounding. We study whether LLMs use these grounding acts by simulating them taking turns on several dialogue datasets and comparing the results to humans. We find that current LLMs are presumptive grounders, biased towards assuming common ground without using grounding acts. To understand the roots of this behavior, we examine the role of instruction tuning and reinforcement learning from human feedback (RLHF), finding that RLHF leads to less grounding. Altogether, our work highlights the need for more research investigating grounding in human-AI interaction.
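A rough sketch of how attempted grounding could be quantified: the fraction of generated turns containing a clarification question or an acknowledgment. The cue phrases are illustrative; the paper's grounding acts and metrics are richer than this pattern-matching stand-in.

```python
# Rough sketch of an "attempted grounding" rate over generated dialogue turns.
import re

CLARIFICATION = re.compile(r"\b(what do you mean|could you clarify|do you mean)\b", re.I)
ACKNOWLEDGMENT = re.compile(r"\b(i see|i understand|got it|that makes sense)\b", re.I)

def grounding_rate(turns: list[str]) -> float:
    """Fraction of turns that contain at least one grounding act."""
    if not turns:
        return 0.0
    grounded = sum(1 for t in turns if CLARIFICATION.search(t) or ACKNOWLEDGMENT.search(t))
    return grounded / len(turns)

turns = [
    "Here is the answer to your question.",
    "Got it, so you want emotional support rather than advice?",
]
print(grounding_rate(turns))  # 0.5
```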


Task-Agnostic Low-Rank Adapters for Unseen English Dialects

arXiv.org Artificial Intelligence

Large Language Models (LLMs) are trained on corpora disproportionately weighted in favor of Standard American English. As a result, speakers of other dialects experience significantly more failures when interacting with these technologies. In practice, these speakers often accommodate their speech to be better understood. Our work shares the belief that language technologies should be designed to accommodate the diversity in English dialects, and not the other way around. However, prior work on dialects struggles to generalize to evolving and emerging dialects in a scalable manner. To fill this gap, our method, HyperLoRA, leverages expert linguistic knowledge to enable resource-efficient adaptation via hypernetworks. By disentangling dialect-specific and cross-dialectal information, HyperLoRA improves generalization to unseen dialects in a task-agnostic fashion. Not only is HyperLoRA more scalable in the number of parameters, but it also achieves the best or most competitive performance across 5 dialects in a zero-shot setting. In this way, our approach facilitates access to language technology for the billions of English dialect speakers who are traditionally underrepresented.
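A toy sketch of the hypernetwork idea: a vector of expert linguistic features describing a dialect is mapped to low-rank adapter weights in a single forward pass, so an unseen dialect gets an adapter without task-specific training. Shapes, the feature encoding, and the `LoRAHypernetwork` name are illustrative assumptions.

```python
# Toy sketch of a hypernetwork emitting low-rank adapter weights from dialect features.
import torch
import torch.nn as nn

class LoRAHypernetwork(nn.Module):
    def __init__(self, n_features: int, d_model: int, rank: int = 8):
        super().__init__()
        self.d_model, self.rank = d_model, rank
        self.net = nn.Sequential(
            nn.Linear(n_features, 128), nn.ReLU(),
            nn.Linear(128, 2 * d_model * rank),  # parameters for both A and B
        )

    def forward(self, dialect_features):  # (n_features,)
        flat = self.net(dialect_features)
        A = flat[: self.d_model * self.rank].view(self.rank, self.d_model)
        B = flat[self.d_model * self.rank:].view(self.d_model, self.rank)
        return A, B  # delta_W = B @ A, applied to a frozen base layer

hyper = LoRAHypernetwork(n_features=10, d_model=64)
A, B = hyper(torch.randn(10))        # unseen dialect -> adapter in one forward pass
print((B @ A).shape)                 # torch.Size([64, 64])
```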


Unlearn What You Want to Forget: Efficient Unlearning for LLMs

arXiv.org Artificial Intelligence

Large language models (LLMs) have achieved significant progress from pre-training on and memorizing a wide range of textual data; however, this process may raise privacy issues and violate data protection regulations. As a result, the ability to easily remove data related to individual users from such models, while not deteriorating their predictive quality after the removal, becomes increasingly important. To address these issues, we propose an unlearning framework that can efficiently update LLMs without retraining the whole model after data removal, by introducing lightweight unlearning layers, learned with a selective teacher-student objective, into the transformers. In addition, we introduce a fusion mechanism to effectively combine different unlearning layers that learn to forget different sets of data, in order to handle a sequence of forgetting operations. Experiments on classification and generation tasks demonstrate the effectiveness of our proposed methods compared to state-of-the-art baselines.
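A toy sketch of the two ingredients described above, assuming a small residual MLP as the unlearning layer and a signed KL term as the selective teacher-student objective (stay close to the frozen teacher on retained data, move away from it on the forget set); the paper's actual formulation may differ.

```python
# Toy sketch of a lightweight unlearning layer and a selective teacher-student loss.
import torch
import torch.nn as nn
import torch.nn.functional as F

class UnlearningLayer(nn.Module):
    """Small residual MLP inserted after a frozen transformer block."""
    def __init__(self, d_model: int, d_hidden: int = 32):
        super().__init__()
        self.mlp = nn.Sequential(nn.Linear(d_model, d_hidden), nn.ReLU(), nn.Linear(d_hidden, d_model))

    def forward(self, h):
        return h + self.mlp(h)

def selective_loss(student_logits, teacher_logits, forget_mask):
    """KL to the teacher on retained examples, negated KL on examples to be forgotten."""
    kl = F.kl_div(
        F.log_softmax(student_logits, -1), F.softmax(teacher_logits, -1), reduction="none"
    ).sum(-1)
    sign = 1.0 - 2.0 * forget_mask.float()  # +1 for retain, -1 for forget
    return (sign * kl).mean()

layer = UnlearningLayer(d_model=64)
print(layer(torch.randn(4, 64)).shape)  # torch.Size([4, 64])

teacher_logits = torch.randn(4, 100)
student_logits = teacher_logits + torch.randn(4, 100) * 0.1
print(selective_loss(student_logits, teacher_logits, torch.tensor([False, False, True, True])))
```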


Training Socially Aligned Language Models on Simulated Social Interactions

arXiv.org Artificial Intelligence

Social alignment in AI systems aims to ensure that these models behave according to established societal values. However, unlike humans, who derive consensus on value judgments through social interaction, current language models (LMs) are trained to rigidly replicate their training corpus in isolation, leading to subpar generalization in unfamiliar scenarios and vulnerability to adversarial attacks. This work presents a novel training paradigm that permits LMs to learn from simulated social interactions. In comparison to existing methodologies, our approach is considerably more scalable and efficient, demonstrating superior performance in alignment benchmarks and human evaluations. This paradigm shift in the training of LMs brings us a step closer to developing AI systems that can robustly and accurately reflect societal norms and values.

"We want AI agents that can discover like we can, not which contain what we have discovered." -- Richard Sutton, The Bitter Lesson, 2019

By virtue of their ability to "predict the next token(s)", contemporary pre-trained language models (LMs) have shown remarkable proficiency in memorizing extensive corpora, thereby enabling the generation of text indistinguishable from human-produced content (Brown et al., 2020). However, successful memorization of human knowledge does not assure a model's propensity to behave in line with societal expectations. Recent research has exposed behavioral anomalies in these LMs (Weidinger et al., 2022), including the generation of harmful content (Gehman et al., 2020; Bommasani et al., 2021), the reinforcement of bias (Venkit et al., 2022; Liu et al., 2022), and the dissemination of disinformation (Tamkin et al., 2021; Lin et al., 2022). The process of enhancing desirable societal behaviors and inhibiting undesirable ones is commonly referred to as "social alignment" (Gabriel, 2020; Taylor et al., 2016). Supervised Fine-Tuning (SFT) presents a straightforward method for achieving alignment by training LMs on socially aligned data (Figure 1 [a]). However, this method often yields models susceptible to adversarial attacks, like "jailbreak prompting" (Subhash, 2023; Xu et al., 2021), due to limited exposure to misaligned data during training (Amodei et al., 2016). To address this, a more advanced technique, "reward modeling", has been proposed (Leike et al., 2018; Christiano et al., 2017). This involves training a reward model as a surrogate for human judgment to guide the optimization of the LM (e.g., OpenAI's RLHF, Figure 1 [b]).
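For the reward-modeling step mentioned above, here is a minimal sketch of the standard pairwise-preference objective, with random embeddings standing in for encoded responses; the encoder and shapes are illustrative placeholders, not the paper's setup.

```python
# Minimal sketch of reward modeling: train a scalar scorer on pairwise preferences.
import torch
import torch.nn as nn
import torch.nn.functional as F

class RewardModel(nn.Module):
    def __init__(self, d_embed: int):
        super().__init__()
        self.score = nn.Sequential(nn.Linear(d_embed, 64), nn.ReLU(), nn.Linear(64, 1))

    def forward(self, response_embedding):
        return self.score(response_embedding).squeeze(-1)  # scalar reward per response

rm = RewardModel(d_embed=32)
chosen, rejected = torch.randn(8, 32), torch.randn(8, 32)  # aligned vs. misaligned responses
# The aligned ("chosen") response should receive a higher reward than the rejected one.
loss = -F.logsigmoid(rm(chosen) - rm(rejected)).mean()
loss.backward()
```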


Impressions: Understanding Visual Semiotics and Aesthetic Impact

arXiv.org Artificial Intelligence

Is aesthetic impact different from beauty? Is an image's visual salience a reflection of its capacity for effective communication? We present Impressions, a novel dataset through which to investigate the semiotics of images, and how specific visual features and design choices can elicit specific emotions, thoughts, and beliefs. We posit that the impactfulness of an image extends beyond formal definitions of aesthetics to its success as a communicative act, where style contributes as much to meaning formation as the subject matter. However, prior image captioning datasets are not designed to empower state-of-the-art architectures to model potential human impressions or interpretations of images. To fill this gap, we design an annotation task, heavily inspired by image analysis techniques in the Visual Arts, to collect 1,440 image-caption pairs and 4,320 unique annotations exploring impact, pragmatic image description, impressions, and aesthetic design choices. We show that existing multimodal image captioning and conditional generation models struggle to simulate plausible human responses to images. However, this dataset significantly improves their ability to model impressions and aesthetic evaluations of images through fine-tuning and few-shot adaptation.
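As a purely hypothetical illustration of the annotation dimensions described above, one record might be structured as follows; the field names and example values are invented for clarity and are not the dataset's actual schema.

```python
# Hypothetical sketch of a single Impressions-style annotation record.
from dataclasses import dataclass

@dataclass
class ImpressionsAnnotation:
    image_id: str
    description: str            # pragmatic description of what is depicted
    impression: str             # thoughts, emotions, or beliefs the image elicits
    impact: str                 # perceived impactfulness of the image
    aesthetic_evaluation: str   # commentary on style and design choices

example = ImpressionsAnnotation(
    image_id="img_0001",
    description="A lone figure stands on a foggy pier at dawn.",
    impression="Evokes isolation and quiet anticipation.",
    impact="Holds attention through stark emptiness.",
    aesthetic_evaluation="Muted palette and negative space draw the eye to the figure.",
)
print(example.impression)
```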