Fleisig, Eve
GRACE: A Granular Benchmark for Evaluating Model Calibration against Human Calibration
Sung, Yoo Yeon, Fleisig, Eve, Hou, Yu, Upadhyay, Ishan, Boyd-Graber, Jordan Lee
Language models are often miscalibrated, leading to confidently incorrect answers. We introduce GRACE, a benchmark for language model calibration that incorporates comparison with human calibration. GRACE consists of question-answer pairs, in which each question contains a series of clues that gradually become easier, all leading to the same answer; models must answer correctly as early as possible as the clues are revealed. This setting permits granular measurement of model calibration based on how early, accurately, and confidently a model answers. After collecting these questions, we host live human vs. model competitions to gather 1,749 data points on human and model teams' timing, accuracy, and confidence. We propose a metric, CalScore, that uses GRACE to analyze model calibration errors and identify types of model miscalibration that differ from human behavior. We find that although humans are less accurate than models, humans are generally better calibrated. Since state-of-the-art models struggle on GRACE, it effectively evaluates progress on improving model calibration.
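A minimal sketch of how scoring on GRACE-style incremental clues might work: a model answers after each clue with a stated confidence, and we track both how early it first answers correctly and how well its confidence matches correctness. This is illustrative only and does not reproduce the paper's CalScore formula; the class and function names are hypothetical.

```python
# Hypothetical sketch (not the paper's CalScore): score a model on one
# GRACE-style question whose clues are revealed one at a time.
from dataclasses import dataclass

@dataclass
class ClueResponse:
    answer: str        # model's guess after seeing this clue
    confidence: float  # model's stated confidence in [0, 1]

def earliness_and_calibration(responses, gold_answer):
    """Return (earliness, mean Brier score) for one question.

    earliness: fraction of clues remaining when the model first answers
    correctly (1.0 = correct on the first clue, 0.0 = never correct).
    Brier score: squared gap between confidence and correctness per clue.
    """
    n = len(responses)
    first_correct = next(
        (i for i, r in enumerate(responses) if r.answer == gold_answer), None
    )
    earliness = 0.0 if first_correct is None else (n - first_correct) / n
    brier = sum(
        (r.confidence - float(r.answer == gold_answer)) ** 2 for r in responses
    ) / n
    return earliness, brier

# Example: correct from the second of four clues, overconfident on the first.
resps = [ClueResponse("Napoleon", 0.9), ClueResponse("Wellington", 0.6),
         ClueResponse("Wellington", 0.8), ClueResponse("Wellington", 0.95)]
print(earliness_and_calibration(resps, "Wellington"))  # (0.75, ~0.25)
```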
Accurate and Data-Efficient Toxicity Prediction when Annotators Disagree
Jaggi, Harbani, Murali, Kashyap, Fleisig, Eve, Bıyık, Erdem
When annotators disagree, predicting the labels given by individual annotators can capture nuances overlooked by traditional label aggregation. We introduce three approaches to predicting individual annotator ratings on the toxicity of text by incorporating individual annotator-specific information: a neural collaborative filtering (NCF) approach, an in-context learning (ICL) approach, and an intermediate embedding-based architecture. We also study the utility of demographic information for rating prediction. NCF showed limited utility; however, integrating annotator history, demographics, and survey information permits both the embedding-based architecture and ICL to substantially improve prediction accuracy, with the embedding-based architecture outperforming the other methods. We also find that, if demographics are predicted from survey information, using these imputed demographics as features performs comparably to using true demographic data. This suggests that demographics may not provide substantial information for modeling ratings beyond what is captured in survey responses. Our findings raise considerations about the relative utility of different types of annotator information and provide new approaches for modeling annotators in subjective NLP tasks.
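A hypothetical sketch of an embedding-based annotator-rating model in the spirit described above: a text embedding is concatenated with a learned annotator embedding and annotator features (history, demographics, or survey responses), and an MLP regresses the individual toxicity rating. Dimensions, layer sizes, and names are illustrative assumptions, not the paper's exact architecture.

```python
import torch
import torch.nn as nn

class AnnotatorRatingModel(nn.Module):
    def __init__(self, text_dim=768, n_annotators=1000, ann_dim=32, feat_dim=16):
        super().__init__()
        self.annotator_emb = nn.Embedding(n_annotators, ann_dim)
        self.mlp = nn.Sequential(
            nn.Linear(text_dim + ann_dim + feat_dim, 256),
            nn.ReLU(),
            nn.Linear(256, 1),  # predicted individual toxicity rating
        )

    def forward(self, text_emb, annotator_id, annotator_feats):
        a = self.annotator_emb(annotator_id)
        x = torch.cat([text_emb, a, annotator_feats], dim=-1)
        return self.mlp(x).squeeze(-1)

# Example forward pass with random placeholder inputs.
model = AnnotatorRatingModel()
pred = model(torch.randn(4, 768), torch.randint(0, 1000, (4,)), torch.randn(4, 16))
```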
ADVSCORE: A Metric for the Evaluation and Creation of Adversarial Benchmarks
Sung, Yoo Yeon, Fleisig, Eve, Mondal, Ishani, Boyd-Graber, Jordan Lee
Adversarial benchmarks validate model abilities by providing samples that fool models but not humans. However, despite the proliferation of datasets that claim to be adversarial, there does not exist an established metric to evaluate how adversarial these datasets are. To address this lacuna, we introduce ADVSCORE, a metric which quantifies how adversarial and discriminative an adversarial dataset is and exposes the features that make data adversarial. We then use ADVSCORE to underpin a dataset creation pipeline that incentivizes writing a high-quality adversarial dataset. As a proof of concept, we use ADVSCORE to collect an adversarial question answering (QA) dataset, ADVQA, from our pipeline. The high-quality questions in ADVQA surpass three adversarial benchmarks across domains at fooling several models but not humans. We validate our result based on difficulty estimates from 9,347 human responses on four datasets and predictions from three models. Moreover, ADVSCORE uncovers which adversarial tactics used by human writers fool models (e.g., GPT-4) but not humans. Through ADVSCORE and its analyses, we offer guidance on revealing language model vulnerabilities and producing reliable adversarial examples.
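A toy sketch of the underlying intuition (not the paper's ADVSCORE formula): one crude proxy for how adversarial a dataset is would be the average gap between human and model accuracy per item, so that items which humans answer correctly but models miss push the score up. Everything below is an illustrative assumption.

```python
def adversarial_gap(human_correct, model_correct):
    """human_correct, model_correct: per-item accuracies in [0, 1]."""
    assert len(human_correct) == len(model_correct)
    gaps = [h - m for h, m in zip(human_correct, model_correct)]
    return sum(gaps) / len(gaps)

# Items where humans do well and models fail increase the gap.
print(adversarial_gap([0.9, 0.8, 0.95], [0.2, 0.7, 0.1]))  # ~0.55
```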
Linguistic Bias in ChatGPT: Language Models Reinforce Dialect Discrimination
Fleisig, Eve, Smith, Genevieve, Bossi, Madeline, Rustagi, Ishita, Yin, Xavier, Klein, Dan
We present a large-scale study of linguistic bias exhibited by ChatGPT covering ten dialects of English (Standard American English, Standard British English, and eight widely spoken non-"standard" varieties from around the world). We prompted GPT-3.5 Turbo and GPT-4 with text by native speakers of each variety and analyzed the responses via detailed linguistic feature annotation and native speaker evaluation. We find that the models default to "standard" varieties of English; based on evaluation by native speakers, we also find that model responses to non-"standard" varieties consistently exhibit a range of issues: lack of comprehension (10% worse compared to "standard" varieties), stereotyping (16% worse), demeaning content (22% worse), and condescending responses (12% worse). We also find that if these models are asked to imitate the writing style of prompts in non-"standard" varieties, they produce text that exhibits lower comprehension of the input and is especially prone to stereotyping. GPT-4 improves on GPT-3.5 in terms of comprehension, warmth, and friendliness, but it also results in a marked increase in stereotyping (+17%). The results suggest that GPT-3.5 Turbo and GPT-4 exhibit linguistic discrimination in ways that can exacerbate harms for speakers of non-"standard" varieties.
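An illustrative sketch of the kind of comparison reported above: per-dialect rates of annotated issues (e.g., stereotyping or demeaning content) computed relative to a "standard" baseline. The data, dialect labels, and function below are made up for illustration and are not drawn from the study.

```python
from collections import defaultdict

def issue_rates(annotations):
    """annotations: list of (dialect, issue_flag) pairs, issue_flag in {0, 1}."""
    totals, hits = defaultdict(int), defaultdict(int)
    for dialect, flag in annotations:
        totals[dialect] += 1
        hits[dialect] += flag
    return {d: hits[d] / totals[d] for d in totals}

rates = issue_rates([("SAE", 0), ("SAE", 1), ("Indian English", 1),
                     ("Indian English", 1), ("Nigerian English", 1)])
baseline = rates["SAE"]
relative = {d: r - baseline for d, r in rates.items()}  # gap vs. "standard"
```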
The Perspectivist Paradigm Shift: Assumptions and Challenges of Capturing Human Labels
Fleisig, Eve, Blodgett, Su Lin, Klein, Dan, Talat, Zeerak
Longstanding data labeling practices in machine learning involve collecting and aggregating labels from multiple annotators. But what should we do when annotators disagree? Though annotator disagreement has long been seen as a problem to minimize, new perspectivist approaches challenge this assumption by treating disagreement as a valuable source of information. In this position paper, we examine practices and assumptions surrounding the causes of disagreement--some challenged by perspectivist approaches, and some that remain to be addressed--as well as practical and normative challenges for work operating under these assumptions. We conclude with recommendations for the data labeling pipeline and avenues for future research engaging with subjectivity and disagreement.
Mapping Social Choice Theory to RLHF
Dai, Jessica, Fleisig, Eve
Recent work on the limitations of using reinforcement learning from human feedback (RLHF) to incorporate human preferences into model behavior often raises social choice theory as a reference point. Social choice theory's analysis of settings such as voting mechanisms provides technical infrastructure that can inform how to aggregate human preferences amid disagreement. We analyze the problem settings of social choice and RLHF, identify key differences between them, and discuss how these differences may affect the RLHF interpretation of well-known technical results in social choice.
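For readers unfamiliar with the social-choice machinery referenced above, here is a worked example of one classic aggregation rule, the Borda count, applied to annotators' rankings of candidate model responses. This is illustrative background only; the paper does not prescribe this rule for RLHF.

```python
def borda(rankings):
    """rankings: list of rankings, each a list of candidates, best first."""
    scores = {}
    for ranking in rankings:
        n = len(ranking)
        for pos, cand in enumerate(ranking):
            scores[cand] = scores.get(cand, 0) + (n - 1 - pos)
    return max(scores, key=scores.get), scores

winner, scores = borda([["A", "B", "C"], ["B", "A", "C"], ["B", "C", "A"]])
# winner == "B": annotators disagree, but the rule still selects one response.
```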
Centering the Margins: Outlier-Based Identification of Harmed Populations in Toxicity Detection
Raman, Vyoma, Fleisig, Eve, Klein, Dan
The impact of AI models on marginalized communities has traditionally been measured by identifying performance differences between specified demographic subgroups. Though this approach aims to center vulnerable groups, it risks obscuring patterns of harm faced by intersectional subgroups or shared across multiple groups. To address this, we draw on theories of marginalization from disability studies and related disciplines, which state that people farther from the norm face greater adversity, to consider the "margins" in the domain of toxicity detection. We operationalize the "margins" of a dataset by employing outlier detection to identify text about people with demographic attributes distant from the "norm". We find that model performance is consistently worse for demographic outliers, with mean squared error (MSE) up to 70.4% higher for outliers than non-outliers across toxicity types. It is also worse for text outliers, with an MSE up to 68.4% higher for outliers than non-outliers. We also find text and demographic outliers to be particularly susceptible to errors in the classification of severe toxicity and identity attacks. Compared to analysis of disparities using traditional demographic breakdowns, we find that our outlier analysis frequently surfaces greater harms faced by a larger, more intersectional group, which suggests that outlier analysis is particularly beneficial for identifying harms against those groups.
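A hypothetical sketch of this style of analysis: flag outliers in a feature space (demographic or text features), then compare model error for outliers against non-outliers. The specific detector below (scikit-learn's IsolationForest) and the synthetic data are assumptions for illustration, not the paper's setup.

```python
import numpy as np
from sklearn.ensemble import IsolationForest

rng = np.random.default_rng(0)
features = rng.normal(size=(500, 8))   # stand-in demographic/text features
errors = rng.normal(size=500) ** 2     # squared error of a toxicity model

flags = IsolationForest(random_state=0).fit_predict(features)  # -1 = outlier
mse_outliers = errors[flags == -1].mean()
mse_inliers = errors[flags == 1].mean()
print(f"outlier MSE / inlier MSE = {mse_outliers / mse_inliers:.2f}")
```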
When the Majority is Wrong: Modeling Annotator Disagreement for Subjective Tasks
Fleisig, Eve, Abebe, Rediet, Klein, Dan
Though majority vote among annotators is typically used for ground truth labels in natural language processing, annotator disagreement in tasks such as hate speech detection may reflect differences in opinion across groups, not noise. Thus, a crucial problem in hate speech detection is determining whether a statement is offensive to the demographic group that it targets, when that group may constitute a small fraction of the annotator pool. We construct a model that predicts individual annotator ratings on potentially offensive text and combines this information with the predicted target group of the text to model the opinions of target group members. We show gains across a range of metrics, including raising performance over the baseline by 22% at predicting individual annotators' ratings and by 33% at predicting variance among annotators, which provides a metric for model uncertainty downstream. We find that annotator ratings can be predicted using their demographic information and opinions on online content, without the need to track identifying annotator IDs that link each annotator to their ratings. We also find that use of non-invasive survey questions on annotators' online experiences helps to maximize privacy and minimize unnecessary collection of demographic information when predicting annotators' opinions.
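A minimal sketch of the idea described above (not the paper's exact model): given per-annotator predicted offensiveness ratings and a predicted target group for the text, estimate the target group's opinion by averaging the predictions for annotators who belong to that group. The function and toy data are illustrative assumptions.

```python
def target_group_rating(predicted_ratings, annotator_groups, target_group):
    """predicted_ratings: {annotator_id: rating};
    annotator_groups: {annotator_id: set of demographic groups}."""
    in_group = [r for a, r in predicted_ratings.items()
                if target_group in annotator_groups.get(a, set())]
    return sum(in_group) / len(in_group) if in_group else None

pred = {"a1": 0.9, "a2": 0.2, "a3": 0.7}
groups = {"a1": {"women"}, "a2": {"men"}, "a3": {"women"}}
print(target_group_rating(pred, groups, "women"))  # 0.8 vs. overall mean 0.6
```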
Incorporating Worker Perspectives into MTurk Annotation Practices for NLP
Huang, Olivia, Fleisig, Eve, Klein, Dan
Current practices regarding data collection for natural language processing on Amazon Mechanical Turk (MTurk) often rely on a combination of studies on data quality and heuristics shared among NLP researchers. However, without considering the perspectives of MTurk workers, these approaches are susceptible to issues regarding workers' rights and poor response quality. We conducted a critical literature review and a survey of MTurk workers aimed at addressing open questions regarding best practices for fair payment, worker privacy, data quality, and considering worker incentives. We found that worker preferences are often at odds with received wisdom among NLP researchers. Surveyed workers preferred reliable, reasonable payments over uncertain, very high payments; reported frequently lying on demographic questions; and expressed frustration at having work rejected with no explanation. We also found that workers view some quality control methods, such as requiring minimum response times or Master's qualifications, as biased and largely ineffective. Based on the survey results, we provide recommendations on how future NLP studies may better account for MTurk workers' experiences in order to respect workers' rights and improve data quality.
Ghostbuster: Detecting Text Ghostwritten by Large Language Models
Verma, Vivek, Fleisig, Eve, Tomlin, Nicholas, Klein, Dan
We introduce Ghostbuster, a state-of-the-art system for detecting AI-generated text. Our method works by passing documents through a series of weaker language models, running a structured search over possible combinations of their features, and then training a classifier on the selected features to predict whether documents are AI-generated. Crucially, Ghostbuster does not require access to token probabilities from the target model, making it useful for detecting text generated by black-box models or unknown model versions. In conjunction with our model, we release three new datasets of human- and AI-generated text as detection benchmarks in the domains of student essays, creative writing, and news articles. We compare Ghostbuster to a variety of existing detectors, including DetectGPT and GPTZero, as well as a new RoBERTa baseline. Ghostbuster achieves 99.0 F1 when evaluated across domains, which is 5.9 F1 higher than the best preexisting model. It also outperforms all previous approaches in generalization across writing domains (+7.5 F1), prompting strategies (+2.1 F1), and language models (+4.4 F1). We also analyze the robustness of our system to a variety of perturbations and paraphrasing attacks and evaluate its performance on documents written by non-native English speakers.

Language models such as ChatGPT are capable of producing a wide range of fluent text that closely approximates human language use. However, the proliferation of these models has raised concerns about the authenticity and trustworthiness of text across a variety of domains. For example, fears that students are submitting assignments ghostwritten by language models have led many schools to adapt by restricting the use of ChatGPT in classrooms and homework assignments (Heaven, 2023). Meanwhile, because language models are prone to factual errors and hallucination, readers may desire to know if such tools have been used to ghostwrite news articles or other informative text when deciding whether or not to trust a source.
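A simplified sketch of a Ghostbuster-style pipeline: obtain per-token probabilities from weak language models, combine them with simple vector operations into document-level features, and train a classifier on those features. The real system searches over feature combinations; below the features are fixed, the "weak model" probabilities are random placeholders, and all names are illustrative assumptions.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def doc_features(p_weak1, p_weak2):
    """p_weak1, p_weak2: per-token probabilities from two weak LMs."""
    ratio = np.asarray(p_weak1) / (np.asarray(p_weak2) + 1e-9)
    return [np.mean(p_weak1), np.mean(p_weak2), np.var(ratio), np.min(p_weak1)]

rng = np.random.default_rng(0)
X = np.array([doc_features(rng.uniform(0, 1, 100), rng.uniform(0, 1, 100))
              for _ in range(200)])
y = rng.integers(0, 2, 200)  # 1 = AI-generated (placeholder labels)
clf = LogisticRegression().fit(X, y)  # classifier over weak-model features
```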