AITopics | Chen, Yida

Collaborating Authors

Chen, Yida

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

ChatGPT Doesn't Trust Chargers Fans: Guardrail Sensitivity in Context

Li, Victoria R., Chen, Yida, Saphra, Naomi

arXiv.org Artificial IntelligenceJul-10-2024

While the biases of language models in production are extensively documented, the biases of their guardrails have been neglected. This paper studies how contextual information about the user influences the likelihood of an LLM to refuse to execute a request. By generating user biographies that offer ideological and demographic information, we find a number of biases in guardrail sensitivity on GPT-3.5. Younger, female, and Asian-American personas are more likely to trigger a refusal guardrail when requesting censored or illegal information. Guardrails are also sycophantic, refusing to comply with requests for a political position the user is likely to disagree with. We find that certain identity groups and seemingly innocuous information, e.g., sports fandom, can elicit changes in guardrail sensitivity similar to direct statements of political ideology. For each demographic category and even for American football team fandom, we find that ChatGPT appears to infer a likely political ideology and modify guardrail behavior accordingly.

large language model, machine learning, persona, (19 more...)

arXiv.org Artificial Intelligence

2407.06866

Country:

North America > United States > New York (0.28)
North America > United States > Florida > Hillsborough County > Tampa (0.14)

Genre: Research Report (1.00)

Industry:

Leisure & Entertainment > Sports > Football (1.00)
Law > Civil Rights & Constitutional Law (1.00)
Law Enforcement & Public Safety > Crime Prevention & Enforcement (1.00)
(2 more...)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Designing a Dashboard for Transparency and Control of Conversational AI

Chen, Yida, Wu, Aoyu, DePodesta, Trevor, Yeh, Catherine, Li, Kenneth, Marin, Nicholas Castillo, Patel, Oam, Riecke, Jan, Raval, Shivam, Seow, Olivia, Wattenberg, Martin, Viégas, Fernanda

arXiv.org Artificial IntelligenceJun-15-2024

Conversational LLMs function as black box systems, leaving users guessing about why they see the output they do. This lack of transparency is potentially problematic, especially given concerns around bias and truthfulness. To address this issue, we present an end-to-end prototype-connecting interpretability techniques with user experience design-that seeks to make chatbots more transparent. We begin by showing evidence that a prominent open-source LLM has a "user model": examining the internal state of the system, we can extract data related to a user's age, gender, educational level, and socioeconomic status. Next, we describe the design of a dashboard that accompanies the chatbot interface, displaying this user model in real time. The dashboard can also be used to control the user model and the system's behavior. Finally, we discuss a study in which users conversed with the instrumented system. Our results suggest that users appreciate seeing internal states, which helped them expose biased behavior and increased their sense of control. Participants also made valuable suggestions that point to future directions for both design and machine learning research. The project page and video demo of our TalkTuner system are available at https://bit.ly/talktuner-project-page

large language model, machine learning, natural language, (21 more...)

arXiv.org Artificial Intelligence

2406.07882

Country:

North America > United States (0.46)
Asia (0.46)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Industry:

Leisure & Entertainment (1.00)
Consumer Products & Services (1.00)
Health & Medicine > Consumer Health (0.93)
(4 more...)

Technology:

Information Technology > Human Computer Interaction (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.69)

Add feedback

More than Correlation: Do Large Language Models Learn Causal Representations of Space?

Chen, Yida, Gan, Yixian, Li, Sijia, Yao, Li, Zhao, Xiaohan

arXiv.org Artificial IntelligenceDec-25-2023

Recent work found high mutual information between the learned representations of large language models (LLMs) and the geospatial property of its input, hinting an emergent internal model of space. However, whether this internal space model has any causal effects on the LLMs' behaviors was not answered by that work, led to criticism of these findings as mere statistical correlation. Our study focused on uncovering the causality of the spatial representations in LLMs. In particular, we discovered the potential spatial representations in DeBERTa, GPT-Neo using representational similarity analysis and linear and non-linear probing. Our casual intervention experiments showed that the spatial representations influenced the model's performance on next word prediction and a downstream task that relies on geospatial information. Our experiments suggested that the LLMs learn and use an internal model of space in solving geospatial related tasks.

large language model, machine learning, natural language, (17 more...)

arXiv.org Artificial Intelligence

2312.16257

Country:

North America > United States (1.00)
Asia (0.96)
North America > Canada > Ontario > Middlesex County > London (0.14)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (0.89)

Industry: Health & Medicine (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Beyond Surface Statistics: Scene Representations in a Latent Diffusion Model

Chen, Yida, Viégas, Fernanda, Wattenberg, Martin

arXiv.org Artificial IntelligenceNov-4-2023

Latent diffusion models (LDMs) exhibit an impressive ability to produce realistic images, yet the inner workings of these models remain mysterious. Even when trained purely on images without explicit depth information, they typically output coherent pictures of 3D scenes. In this work, we investigate a basic interpretability question: does an LDM create and use an internal representation of simple scene geometry? Using linear probes, we find evidence that the internal activations of the LDM encode linear representations of both 3D depth data and a salient-object / background distinction. These representations appear surprisingly early in the denoising process$-$well before a human can easily make sense of the noisy images. Intervention experiments further indicate these representations play a causal role in image synthesis, and may be used for simple high-level editing of an LDM's output. Project page: https://yc015.github.io/scene-representation-diffusion-model/

artificial intelligence, machine learning, representation, (16 more...)

arXiv.org Artificial Intelligence

2306.0572

Genre:

Research Report > Experimental Study (1.00)
Research Report > New Finding (0.93)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.94)

Add feedback

AttentionViz: A Global View of Transformer Attention

Yeh, Catherine, Chen, Yida, Wu, Aoyu, Chen, Cynthia, Viégas, Fernanda, Wattenberg, Martin

arXiv.org Artificial IntelligenceAug-9-2023

Figure 1: AttentionViz, our interactive visualization tool, allows users to explore transformer self-attention at scale by creating a joint embedding space for queries and keys. Each point in the scatterplot represents the query or key version of a word, as denoted by point color. Users can explore individual attention heads (left) or zoom out for a "global" view of attention (right). Abstract--Transformer models are revolutionizing machine learning, but their inner workings remain mysterious. In this work, we present a new visualization technique designed to help researchers understand the self-attention mechanism in transformers that allows these models to learn rich, contextual relationships between elements of a sequence. The main idea behind our method is to visualize a joint embedding of the query and key vectors used by transformer models to compute attention. Unlike previous attention visualization techniques, our approach enables the analysis of global patterns across multiple input sequences. We create an interactive visualization tool, AttentionViz (demo: http://attentionviz.com), based on these joint query-key embeddings, and use it to study attention mechanisms in both language and vision transformers. We demonstrate the utility of our approach in improving model understanding and offering new insights about query-key interactions through several application scenarios and expert feedback. The transformer neural network architecture [52] is having a major impact In this work, we describe a new visualization technique aimed at on fields ranging from natural language processing (NLP) [13, 42] better comprehending how transformers operate. Indeed, transformers are now deployed in introduction to transformers in Sec. However, the mechanisms these models to learn and use a rich set of relationships between input behind this success remain somewhat mysterious, especially as elements.

artificial intelligence, machine learning, natural language, (18 more...)

arXiv.org Artificial Intelligence

2305.0321

Country:

Europe (0.28)
North America > United States (0.14)

Genre:

Research Report (0.82)
Personal > Interview (0.46)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback