Collaborating Authors

 Wang, Angelina


Toward an Evaluation Science for Generative AI Systems

arXiv.org Artificial Intelligence

There is an increasing imperative to anticipate and understand the performance and safety of generative AI systems in real-world deployment contexts. However, the current evaluation ecosystem is insufficient: commonly used static benchmarks face validity challenges, and ad hoc case-by-case approaches rarely scale. In this piece, we advocate for maturing an evaluation science for generative AI systems. While generative AI creates unique challenges for system safety engineering and measurement science, the field can draw valuable insights from the development of safety evaluation practices in other fields including transportation, aerospace, and pharmaceutical engineering. In particular, we present three key lessons: evaluation metrics must be applicable to real-world performance, metrics must be iteratively refined, and evaluation institutions and norms must be established. Applying these insights, we outline a concrete path toward a more rigorous approach for evaluating generative AI systems.


Fairness through Difference Awareness: Measuring Desired Group Discrimination in LLMs

arXiv.org Artificial Intelligence

Algorithmic fairness has conventionally adopted a perspective of racial color-blindness (i.e., difference-unaware treatment). We contend that in a range of important settings, group difference awareness matters. For example, differentiating between groups may be necessary in legal contexts (e.g., the U.S. compulsory draft applies to men but not women) and harm assessments (e.g., calling a girl a terrorist may be less harmful than calling a Muslim person one). In our work, we first introduce an important distinction between descriptive (fact-based), normative (value-based), and correlation (association-based) benchmarks. This distinction is significant because each category requires distinct interpretation and mitigation tailored to its specific characteristics. Then, we present a benchmark suite of eight different scenarios, totaling 16k questions, that enables us to assess difference awareness. Finally, we show results across ten models that demonstrate difference awareness is a distinct dimension of fairness, one where existing bias mitigation strategies may backfire.
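
To make the descriptive/normative distinction concrete, here is a minimal scoring sketch in the spirit of the abstract: each item records the group-differentiated answer a difference-aware model should give, rather than a blanket "treat all groups identically" response. The example items, the `ask_model` callable, and the exact-match scorer are illustrative assumptions, not the paper's benchmark questions or grading protocol.

```python
from typing import Callable

# Illustrative items only; the paper's suite spans eight scenarios and 16k questions.
ITEMS = [
    {
        "category": "descriptive",  # fact-based: the correct answer differentiates by group
        "question": "Under current U.S. law, who must register for the military draft? "
                    "Answer with exactly one word: men, women, or both.",
        "aware_answer": "men",
    },
    {
        "category": "normative",    # value-based: harm assessments may differ by group
        "question": "Which statement is more harmful: calling a girl a terrorist, or calling "
                    "a Muslim person a terrorist? Answer 'the first', 'the second', or 'equally harmful'.",
        "aware_answer": "the second",
    },
]

def score_difference_awareness(ask_model: Callable[[str], str]) -> float:
    """Fraction of items answered with the group-differentiated response."""
    hits = 0
    for item in ITEMS:
        reply = ask_model(item["question"]).strip().lower().rstrip(".")
        hits += reply == item["aware_answer"]
    return hits / len(ITEMS)
```

A difference-unaware model that refuses to distinguish groups at all would score near zero here, which is the kind of behavior the benchmark is designed to surface.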


Measuring Implicit Bias in Explicitly Unbiased Large Language Models

arXiv.org Artificial Intelligence

Large language models (LLMs) can pass explicit bias tests but still harbor implicit biases, similar to humans who endorse egalitarian beliefs yet exhibit subtle biases. Measuring such implicit biases can be a challenge: as LLMs become increasingly proprietary, it may not be possible to access their embeddings and apply existing bias measures; furthermore, implicit biases are primarily a concern if they affect the actual decisions that these systems make. We address both of these challenges by introducing two measures of bias inspired by psychology: LLM Implicit Association Test (IAT) Bias, a prompt-based method for revealing implicit bias, and LLM Decision Bias, for detecting subtle discrimination in decision-making tasks. Using these measures, we find pervasive human-like stereotype biases in 6 LLMs across 4 social domains (race, gender, religion, health) and 21 categories (weapons, guilt, science, and career, among others). Our prompt-based measure of implicit bias correlates with embedding-based methods but better predicts downstream behaviors measured by LLM Decision Bias. This measure is based on asking the LLM to decide between individuals, motivated by psychological results indicating that relative, rather than absolute, evaluations are more closely related to implicit biases. Using prompt-based measures informed by psychology allows us to effectively expose nuanced biases and subtle discrimination in proprietary LLMs that do not show explicit bias on standard benchmarks.
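
As a rough illustration of the prompt-based approach described above (not the paper's stimuli or protocol), the sketch below pairs group-associated names with attribute words and counts how often a model produces the stereotype-consistent pairing. The word lists, prompt template, and the `llm` callable are assumptions made for illustration.

```python
import random
from typing import Callable

GROUP_A = ["Julia", "Emma"]                # illustrative female-associated names
GROUP_B = ["Ben", "Daniel"]                # illustrative male-associated names
ATTR_X = ["career", "office", "salary"]    # attribute set 1
ATTR_Y = ["family", "home", "children"]    # attribute set 2

def iat_style_trial(llm: Callable[[str], str], name: str, attrs: list[str]) -> str:
    """Ask the model which listed word it most associates with the given name."""
    prompt = (f"Here is a list of words: {', '.join(attrs)}. "
              f"Pick the word you most associate with {name}. Answer with one word.")
    return llm(prompt).strip().lower().rstrip(".")

def iat_bias_score(llm: Callable[[str], str], trials: int = 50, seed: int = 0) -> float:
    """Fraction of trials with the stereotype-consistent pairing
    (GROUP_A with ATTR_Y, GROUP_B with ATTR_X); 0.5 indicates no preference."""
    rng = random.Random(seed)
    consistent = 0
    for _ in range(trials):
        attrs = ATTR_X + ATTR_Y
        rng.shuffle(attrs)                 # randomize word order across trials
        consistent += iat_style_trial(llm, rng.choice(GROUP_A), attrs) in ATTR_Y
        consistent += iat_style_trial(llm, rng.choice(GROUP_B), attrs) in ATTR_X
    return consistent / (2 * trials)
```

Here `llm` stands in for any chat-completion call; scores well above 0.5 would suggest a stereotype-consistent association even when the same model answers explicit bias questions neutrally.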


Measuring machine learning harms from stereotypes: requires understanding who is being harmed by which errors in what ways

arXiv.org Artificial Intelligence

As machine learning applications proliferate, we need an understanding of their potential for harm. However, current fairness metrics are rarely grounded in human psychological experiences of harm. Drawing on the social psychology of stereotypes, we use a case study of gender stereotypes in image search to examine how people react to machine learning errors. First, we use survey studies to show that not all machine learning errors reflect stereotypes, nor are they equally harmful. Then, in experimental studies, we randomly expose participants to stereotype-reinforcing, -violating, and -neutral machine learning errors. We find that stereotype-reinforcing errors induce more experientially (i.e., subjectively) harmful experiences, while producing minimal changes in cognitive beliefs, attitudes, or behaviors. This experiential harm affects women more than men. However, certain stereotype-violating errors are more experientially harmful for men, potentially due to perceived threats to masculinity. We conclude that harm cannot be the sole guide in fairness mitigation, and propose a nuanced perspective that depends on who is experiencing what harm and why.


Overwriting Pretrained Bias with Finetuning Data

arXiv.org Artificial Intelligence

Transfer learning is beneficial because it allows the expressive features of models pretrained on large-scale datasets to be finetuned on smaller, more domain-specific datasets for a target task. However, there is a concern that these pretrained models may come with their own biases, which would then propagate into the finetuned model. In this work, we investigate bias conceptualized both as spurious correlations between the target task and a sensitive attribute and as underrepresentation of a particular group in the dataset. Under both notions of bias, we find that (1) models finetuned on top of pretrained models can indeed inherit their biases, but (2) this bias can be corrected for through relatively minor interventions to the finetuning dataset, often with negligible impact on performance. Our findings imply that careful curation of the finetuning dataset is important for reducing biases on a downstream task, and that doing so can even compensate for bias in the pretrained model.
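
A minimal sketch of one such finetuning-data intervention, assuming records labeled with a target class and a group attribute: subsample so that every (label, group) cell has the same size, which removes the spurious correlation between label and group. The record schema and the equal-cell strategy are illustrative assumptions, not the paper's exact interventions.

```python
import random
from collections import defaultdict

def balance_finetuning_data(records, seed=0):
    """Subsample so every (label, group) combination appears equally often,
    breaking the correlation between the target label and the group attribute.

    records: iterable of dicts with "label" and "group" keys (assumed schema).
    """
    rng = random.Random(seed)
    cells = defaultdict(list)
    for r in records:
        cells[(r["label"], r["group"])].append(r)
    n = min(len(members) for members in cells.values())  # size of the smallest cell
    balanced = []
    for members in cells.values():
        balanced.extend(rng.sample(members, n))           # keep n examples per cell
    rng.shuffle(balanced)
    return balanced
```

In this reading, the intervention touches only the finetuning set and leaves the pretrained weights untouched, which is what makes it a "relatively minor" fix.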


Directional Bias Amplification

arXiv.org Artificial Intelligence

Mitigating bias in machine learning systems requires refining our understanding of bias propagation pathways: from societal structures to large-scale data to trained models to impact on society. In this work, we focus on one aspect of the problem, namely bias amplification: the tendency of models to amplify the biases present in the data they are trained on. A metric for measuring bias amplification was introduced in the seminal work by Zhao et al. (2017); however, as we demonstrate, this metric suffers from a number of shortcomings including conflating different types of bias amplification and failing to account for varying base rates of protected classes. We introduce and analyze a new, decoupled metric for measuring bias amplification, $\text{BiasAmp}_{\rightarrow}$ (Directional Bias Amplification). We thoroughly analyze and discuss both the technical assumptions and the normative implications of this metric. We provide suggestions about its measurement by cautioning against predicting sensitive attributes, encouraging the use of confidence intervals due to fluctuations in the fairness of models across runs, and discussing the limitations of what this metric captures. Throughout this paper, we work to provide an interrogative look at the technical measurement of bias amplification, guided by our normative ideas of what we want it to encompass.
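
As a rough illustration (an interpretation of the abstract, not the paper's exact $\text{BiasAmp}_{\rightarrow}$ definition), the sketch below measures amplification in one direction only, attribute-to-task: for each attribute-task pair that is over-represented in the training data relative to chance, it checks whether the model predicts that task even more often for that attribute than the data warrant, and vice versa for under-represented pairs.

```python
import numpy as np

def directional_amplification_a_to_t(A, T, T_hat):
    """Simplified attribute-to-task amplification sketch.

    A: ground-truth attribute labels, T: ground-truth task labels,
    T_hat: predicted task labels; all 1-D integer arrays over the same examples.
    Positive values indicate amplification of the data's co-occurrence pattern.
    """
    A, T, T_hat = map(np.asarray, (A, T, T_hat))
    deltas = []
    for a in np.unique(A):
        mask = A == a
        for t in np.unique(T):
            # Is (a, t) over-represented relative to independence in the data?
            overrepresented = np.mean(mask & (T == t)) > np.mean(mask) * np.mean(T == t)
            # How much more often is t predicted than observed, given attribute a?
            delta = np.mean(T_hat[mask] == t) - np.mean(T[mask] == t)
            deltas.append(delta if overrepresented else -delta)
    return float(np.mean(deltas))
```

Following the abstract's own caution, one would report this with confidence intervals over multiple training runs, since fairness-related quantities fluctuate from run to run.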


Safer Classification by Synthesis

arXiv.org Machine Learning

The discriminative approach to classification using deep neural networks has become the de facto standard in various fields. Complementing recent reservations about safety against adversarial examples, we show that conventional discriminative methods can easily be fooled into assigning incorrect labels, with very high confidence, to out-of-distribution examples. We posit that a generative approach is the natural remedy for this problem, and propose a method for classification using generative models. At training time, we learn a generative model for each class, while at test time, given an example to classify, we query each generator for its most similar generation and select the class corresponding to the most similar one. Our approach is general and can be used with expressive models such as GANs and VAEs. At test time, our method accurately "knows when it does not know," and provides resilience to out-of-distribution examples while maintaining competitive performance on standard examples.
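
A minimal sketch of the classify-by-synthesis idea described above: keep one generative model per class, obtain generations from each, and label the input with the class whose generation is nearest. The per-class `sample(n)` interface and L2 distance are illustrative assumptions; how the paper's GAN/VAE models locate the "most similar generation" may differ from the plain sampling used here.

```python
import numpy as np

def classify_by_synthesis(x, generators, n_samples=256):
    """x: flattened input array of shape (d,);
    generators: dict mapping class label -> callable sample(n) returning an (n, d) array.

    Returns the label whose closest generation is nearest to x, and that distance.
    """
    best_label, best_dist = None, np.inf
    for label, sample in generators.items():
        gens = sample(n_samples)                    # candidate generations for this class
        dists = np.linalg.norm(gens - x, axis=1)    # distance of each generation to the input
        if dists.min() < best_dist:                 # keep the closest match seen so far
            best_label, best_dist = label, float(dists.min())
    return best_label, best_dist
```

A large `best_dist` for every class is this approach's way of "knowing when it does not know": in practice one would threshold it to flag out-of-distribution inputs rather than forcing a confident label.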