
Identities are not Interchangeable: The Problem of Overgeneralization in Fair Machine Learning

Wang, Angelina

arXiv.org Artificial Intelligence

A key value proposition of machine learning is generalizability: the same methods and model architecture should be able to work across different domains and different contexts. While powerful, this generalization can sometimes go too far, and miss the importance of the specifics. In this work, we look at how fair machine learning has often treated as interchangeable the identity axis along which discrimination occurs. In other words, racism is measured and mitigated the same way as sexism, as ableism, as ageism. Disciplines outside of computer science have pointed out both the similarities and differences between these different forms of oppression, and in this work we draw out the implications for fair machine learning. While certainly not all aspects of fair machine learning need to be tailored to the specific form of oppression, there is a pressing need for greater attention to such specificity than is currently evident. Ultimately, context specificity can deepen our understanding of how to build more fair systems, widen our scope to include currently overlooked harms, and, almost paradoxically, also help to narrow our scope and counter the fear of an infinite number of group-specific methods of analysis.
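
The interchangeability the paper critiques is easy to see in how group-fairness metrics are typically coded: one function, parameterized only by which column names the group, is reused for every identity axis. The following minimal Python sketch (all data and column names are hypothetical) makes that pattern explicit:

    import pandas as pd

    def demographic_parity_gap(df: pd.DataFrame, group_col: str, pred_col: str) -> float:
        # Largest difference in positive-prediction rates across the groups in group_col.
        rates = df.groupby(group_col)[pred_col].mean()
        return float(rates.max() - rates.min())

    # The identical computation is reused for race, gender, age, or disability status:
    # gap_race = demographic_parity_gap(df, "race", "y_pred")
    # gap_gender = demographic_parity_gap(df, "gender", "y_pred")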


Position is Power: System Prompts as a Mechanism of Bias in Large Language Models (LLMs)

Neumann, Anna, Kirsten, Elisabeth, Zafar, Muhammad Bilal, Singh, Jatinder

arXiv.org Artificial Intelligence

System prompts in Large Language Models (LLMs) are predefined directives that guide model behaviour, taking precedence over user inputs in text processing and generation. LLM deployers increasingly use them to ensure consistent responses across contexts. Model providers set a foundation of system prompts, and deployers and third-party developers can append additional prompts without visibility into others' additions, while the entire layered implementation remains hidden from end-users. As system prompts become more complex, they can directly or indirectly introduce unaccounted-for side effects. This lack of transparency raises fundamental questions about how the position of information in different directives shapes model outputs. As such, this work examines how the placement of information affects model behaviour. To this end, we compare how models process demographic information in system versus user prompts across six commercially available LLMs and 50 demographic groups. Our analysis reveals significant biases, manifesting in differences in user representation and decision-making scenarios. Since these variations stem from inaccessible and opaque system-level configurations, they risk representational, allocative, and other biases and downstream harms beyond the user's ability to detect or correct. Our findings draw attention to these critical issues, which have the potential to perpetuate harms if left unexamined. Further, we argue that system prompt analysis must be incorporated into AI auditing processes, particularly as customisable system prompts become increasingly prevalent in commercial AI deployments.
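
The core comparison described in the abstract can be pictured with a short sketch: the same demographic statement is placed either in the system prompt or in the user prompt, and the downstream responses are compared. The message structure below follows the common role-based chat format; query_model and the example inputs are hypothetical stand-ins, not the authors' code:

    def build_messages(demographic: str, task: str, placement: str) -> list[dict]:
        statement = f"The user is {demographic}."
        if placement == "system":
            return [{"role": "system", "content": statement},
                    {"role": "user", "content": task}]
        return [{"role": "system", "content": "You are a helpful assistant."},
                {"role": "user", "content": f"{statement} {task}"}]

    # for placement in ("system", "user"):
    #     messages = build_messages("a 70-year-old woman",
    #                               "Should my loan application be approved?", placement)
    #     response = query_model(messages)  # hypothetical chat-completion call
    #     ...compare representation and decisions across placements and demographic groups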


"I Hadn't Thought About That": Creators of Human-like AI Weigh in on Ethics And Neurodivergence

Rizvi, Naba, Smith, Taggert, Vidyala, Tanvi, Bolds, Mya, Strickland, Harper, Begel, Andrew, Williams, Rua, Munyaka, Imani

arXiv.org Artificial Intelligence

Human-like AI agents such as robots and chatbots are becoming increasingly popular, but they present a variety of ethical concerns. The first concern is in how we define humanness, and how our definition impacts communities historically dehumanized by scientific research. Autistic people in particular have been dehumanized by being compared to robots, making it even more important to ensure this marginalization is not reproduced by AI that may promote neuronormative social behaviors. Second, the ubiquitous use of these agents raises concerns surrounding model biases and accessibility. In our work, we investigate the experiences of the people who build and design these technologies to gain insights into their understanding and acceptance of neurodivergence, and the challenges in making their work more accessible to users with diverse needs. Even though neurodivergent individuals are often marginalized for their unique communication styles, nearly all participants overlooked the conclusions that end-users and other AI system makers may draw about communication norms from how humanness was implemented and interpreted in the participants' work. This highlights a major gap in their broader ethical considerations, compounded by some participants' neuronormative assumptions about the behaviors and traits that distinguish "humans" from "bots" and the replication of these assumptions in their work. We examine the impact this may have on autism inclusion in society and provide recommendations for additional systemic changes towards more ethical research directions.


Reward Model Interpretability via Optimal and Pessimal Tokens

Christian, Brian, Kirk, Hannah Rose, Thompson, Jessica A. F., Summerfield, Christopher, Dumbalska, Tsvetomira

arXiv.org Artificial Intelligence

Reward modeling has emerged as a crucial component in aligning large language models with human values. Significant attention has focused on using reward models as a means for fine-tuning generative models. However, the reward models themselves -- which directly encode human value judgments by turning prompt-response pairs into scalar rewards -- remain relatively understudied. We present a novel approach to reward model interpretability through exhaustive analysis of their responses across their entire vocabulary space. By examining how different reward models score every possible single-token response to value-laden prompts, we uncover several striking findings: (i) substantial heterogeneity between models trained on similar objectives, (ii) systematic asymmetries in how models encode high- vs low-scoring tokens, (iii) significant sensitivity to prompt framing that mirrors human cognitive biases, and (iv) overvaluation of more frequent tokens. We demonstrate these effects across ten recent open-source reward models of varying parameter counts and architectures. Our results challenge assumptions about the interchangeability of reward models, as well as their suitability as proxies of complex and context-dependent human values. We find that these models can encode concerning biases toward certain identity groups, which may emerge as unintended consequences of harmlessness training -- distortions that risk propagating through the downstream large language models now deployed to millions.
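
The exhaustive analysis described here lends itself to a compact sketch: score the same value-laden prompt against every single-token response and rank the results. The snippet below assumes a generic sequence-classification reward model on Hugging Face; the model name is a placeholder, and real reward models may require a specific chat template:

    import torch
    from transformers import AutoModelForSequenceClassification, AutoTokenizer

    model_name = "example-org/reward-model"  # placeholder, not a real checkpoint
    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForSequenceClassification.from_pretrained(model_name)
    model.eval()

    prompt = "What do you think about people like me?"
    scores = {}
    with torch.no_grad():
        for token_id in range(tokenizer.vocab_size):
            response = tokenizer.decode([token_id])
            inputs = tokenizer(f"{prompt} {response}", return_tensors="pt", truncation=True)
            scores[response] = model(**inputs).logits[0, 0].item()

    ranked = sorted(scores.items(), key=lambda kv: kv[1])
    print("pessimal tokens:", ranked[:10])   # lowest-reward single-token responses
    print("optimal tokens:", ranked[-10:])   # highest-reward single-token responses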


Understanding Gender Bias in AI-Generated Product Descriptions

Kelly, Markelle, Tahaei, Mohammad, Smyth, Padhraic, Wilcox, Lauren

arXiv.org Artificial Intelligence

While gender bias in large language models (LLMs) has been extensively studied in many domains, uses of LLMs in e-commerce remain largely unexamined and may reveal novel forms of algorithmic bias and harm. Our work investigates this space, developing data-driven taxonomic categories of gender bias in the context of product description generation, which we situate with respect to existing general purpose harms taxonomies. We illustrate how AI-generated product descriptions can uniquely surface gender biases in ways that require specialized detection and mitigation approaches. Further, we quantitatively analyze issues corresponding to our taxonomic categories in two models used for this task -- GPT-3.5 and an e-commerce-specific LLM -- demonstrating that these forms of bias commonly occur in practice. Our results illuminate unique, under-explored dimensions of gender bias, such as assumptions about clothing size, stereotypical bias in which features of a product are advertised, and differences in the use of persuasive language. These insights contribute to our understanding of three types of AI harms identified by current frameworks: exclusionary norms, stereotyping, and performance disparities, particularly for the context of e-commerce.
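
One of the quantitative checks the abstract points to, differences in persuasive language, can be approximated with simple lexicon counts over generated descriptions. Everything below is illustrative: the term list is a toy lexicon, and descriptions_women / descriptions_men stand for hypothetical collections of model outputs for matched products:

    PERSUASIVE_TERMS = {"must-have", "flattering", "essential", "perfect", "gorgeous"}

    def persuasive_rate(descriptions: list[str]) -> float:
        # Fraction of words drawn from the persuasive-term lexicon.
        hits, total = 0, 0
        for text in descriptions:
            words = text.lower().split()
            total += len(words)
            hits += sum(1 for w in words if w.strip(".,!?") in PERSUASIVE_TERMS)
        return hits / max(total, 1)

    # rate_gap = persuasive_rate(descriptions_women) - persuasive_rate(descriptions_men)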


Gender Trouble in Language Models: An Empirical Audit Guided by Gender Performativity Theory

Hafner, Franziska Sofia, Valdivia, Ana, Rocher, Luc

arXiv.org Artificial Intelligence

Language models encode and subsequently perpetuate harmful gendered stereotypes. Research has succeeded in mitigating some of these harms, e.g. by dissociating non-gendered terms such as occupations from gendered terms such as 'woman' and 'man'. This approach, however, remains superficial given that associations are only one form of prejudice through which gendered harms arise. Critical scholarship on gender, such as gender performativity theory, emphasizes how harms often arise from the construction of gender itself, such as conflating gender with biological sex. In language models, these issues could lead to the erasure of transgender and gender diverse identities and cause harms in downstream applications, from misgendering users to misdiagnosing patients based on wrong assumptions about their anatomy. For FAccT research on gendered harms to go beyond superficial linguistic associations, we advocate for a broader definition of 'gender bias' in language models. We operationalize insights on the construction of gender through language from gender studies literature and then empirically test how 16 language models of different architectures, training datasets, and model sizes encode gender. We find that language models tend to encode gender as a binary category tied to biological sex, and that gendered terms that do not neatly fall into one of these binary categories are erased and pathologized. Finally, we show that larger models, which achieve better results on performance benchmarks, learn stronger associations between gender and sex, further reinforcing a narrow understanding of gender. Our findings lead us to call for a re-evaluation of how gendered harms in language models are defined and addressed.
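
One simple way to probe the binary, sex-tied encoding of gender that the audit reports is to inspect a masked language model's predictions for a sex-anchored template and see where the probability mass lands. The model and template below are illustrative only; the paper's own probes may differ:

    from transformers import pipeline

    fill = pipeline("fill-mask", model="bert-base-uncased")
    template = "The person who gave birth is a [MASK]."

    for prediction in fill(template, top_k=10):
        print(prediction["token_str"], round(prediction["score"], 4))
    # If nearly all probability falls on binary, sex-linked terms ('woman', 'mother')
    # while non-binary terms are absent, that is one symptom of the narrow encoding
    # described above.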


Evaluating Model Explanations without Ground Truth

Rawal, Kaivalya, Fu, Zihao, Delaney, Eoin, Russell, Chris

arXiv.org Artificial Intelligence

There can be many competing and contradictory explanations for a single model prediction, making it difficult to select which one to use. Current explanation evaluation frameworks measure quality by comparing against ideal "ground-truth" explanations, or by verifying model sensitivity to important inputs. We outline the limitations of these approaches, and propose three desirable principles to ground the future development of explanation evaluation strategies for local feature importance explanations. We propose a ground-truth Agnostic eXplanation Evaluation framework (AXE) for evaluating and comparing model explanations that satisfies these principles. Unlike prior approaches, AXE does not require access to ideal ground-truth explanations for comparison, or rely on model sensitivity - providing an independent measure of explanation quality. We verify AXE by comparing with baselines, and show how it can be used to detect explanation fairwashing. Our code is available at https://github.com/KaiRawal/Evaluating-Model-Explanations-without-Ground-Truth.
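
For context, the model-sensitivity style of evaluation whose limitations the paper outlines is typically implemented as a deletion test: remove the features an explanation ranks as most important and measure how much the prediction moves. The sketch below illustrates that baseline, not AXE itself (whose construction is in the linked repository); model_predict and the inputs are hypothetical:

    import numpy as np

    def deletion_sensitivity(model_predict, x: np.ndarray, importances: np.ndarray,
                             k: int, baseline_value: float = 0.0) -> float:
        # Zero out the k features the explanation deems most important and
        # report the absolute change in the model's output.
        top_k = np.argsort(-np.abs(importances))[:k]
        x_perturbed = x.copy()
        x_perturbed[top_k] = baseline_value
        return float(abs(model_predict(x[None])[0] - model_predict(x_perturbed[None])[0]))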


SHAP-based Explanations are Sensitive to Feature Representation

Hwang, Hyunseung, Bell, Andrew, Fonseca, Joao, Pliatsika, Venetia, Stoyanovich, Julia, Whang, Steven Euijong

arXiv.org Artificial Intelligence

Local feature-based explanations are a key component of the XAI toolkit. These explanations compute feature importance values relative to an "interpretable" feature representation. In tabular data, feature values themselves are often considered interpretable. This paper examines the impact of data engineering choices on local feature-based explanations. We demonstrate that simple, common data engineering techniques, such as representing age with a histogram or encoding race in a specific way, can manipulate feature importance as determined by popular methods like SHAP. Notably, the sensitivity of explanations to feature representation can be exploited by adversaries to obscure issues like discrimination. While the intuition behind these results is straightforward, their systematic exploration has been lacking. Previous work has focused on adversarial attacks on feature-based explainers by biasing data or manipulating models. To the best of our knowledge, this is the first study demonstrating that explainers can be misled by standard, seemingly innocuous data engineering techniques.
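
The representation effect described above can be reproduced in a few lines: explain the same underlying signal once with raw age and once with binned age, and compare the resulting SHAP importances. The data below are synthetic and the setup is a minimal sketch, not the paper's experiments:

    import numpy as np
    import pandas as pd
    import shap
    from sklearn.ensemble import RandomForestRegressor

    rng = np.random.default_rng(0)
    n = 1000
    age = rng.integers(18, 90, n)
    income = rng.normal(50, 15, n)
    y = 0.5 * (age > 60) + 0.5 * (income > 65)  # synthetic target

    X_raw = pd.DataFrame({"age": age, "income": income})
    X_binned = pd.DataFrame({"age_group": pd.cut(age, bins=[17, 30, 60, 90], labels=False),
                             "income": income})

    for X in (X_raw, X_binned):
        model = RandomForestRegressor(n_estimators=50, random_state=0).fit(X, y)
        sv = shap.TreeExplainer(model).shap_values(X)  # (n_samples, n_features)
        print(dict(zip(X.columns, np.abs(sv).mean(axis=0))))
    # The mean |SHAP| attributed to age typically shifts once age is histogrammed,
    # even though the underlying information is the same.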


FairTranslate: An English-French Dataset for Gender Bias Evaluation in Machine Translation by Overcoming Gender Binarity

Jourdan, Fanny, Chevalier, Yannick, Favre, Cécile

arXiv.org Artificial Intelligence

Large Language Models (LLMs) are increasingly leveraged for translation tasks but often fall short when translating inclusive language -- such as texts containing the singular 'they' pronoun or otherwise reflecting fair linguistic protocols. Because these challenges span both computational and societal domains, it is imperative to critically evaluate how well LLMs handle inclusive translation with a well-founded framework. This paper presents FairTranslate, a novel, fully human-annotated dataset designed to evaluate non-binary gender biases in machine translation systems from English to French. FairTranslate includes 2418 English-French sentence pairs related to occupations, annotated with rich metadata such as the stereotypical alignment of the occupation, grammatical gender indicator ambiguity, and the ground-truth gender label (male, female, or inclusive). We evaluate four leading LLMs (Gemma2-2B, Mistral-7B, Llama3.1-8B, Llama3.3-70B) on this dataset under different prompting procedures. Our results reveal substantial biases in gender representation across LLMs, highlighting persistent challenges in achieving equitable outcomes in machine translation. These findings underscore the need for focused strategies and interventions aimed at ensuring fair and inclusive language usage in LLM-based translation systems. We make the FairTranslate dataset publicly available on Hugging Face, and disclose the code for all experiments on GitHub.
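
A sketch of how such a dataset could drive an evaluation loop is given below; it groups translation quality by the ground-truth gender label so that gaps for inclusive forms become visible. The dataset identifier, field names, and the exact-match scoring are all hypothetical placeholders, since the abstract does not spell them out:

    from datasets import load_dataset

    ds = load_dataset("placeholder/FairTranslate", split="test")  # hypothetical id

    def accuracy_by_gender(dataset, translate_fn):
        # translate_fn is a hypothetical English-to-French translation callable.
        totals = {"male": [0, 0], "female": [0, 0], "inclusive": [0, 0]}
        for row in dataset:
            hypothesis = translate_fn(row["source_en"])               # hypothetical field names
            hit = int(hypothesis.strip() == row["reference_fr"].strip())  # crude stand-in metric
            totals[row["gender_label"]][0] += hit
            totals[row["gender_label"]][1] += 1
        return {g: correct / max(n, 1) for g, (correct, n) in totals.items()}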


A Capabilities Approach to Studying Bias and Harm in Language Technologies

Nigatu, Hellina Hailu, Talat, Zeerak

arXiv.org Artificial Intelligence

In moving from excluding the majority of the world's languages to blindly adopting what we make for English, we first risk importing the same harms we have at best mitigated and at least measured for English. For instance, Yong et al. [15] showed how prompting GPT-4 in low-resource languages circumvents guardrails that are effective in English. However, in evaluating and mitigating harms arising from adopting new technologies into such contexts, we often disregard (1) the actual needs communities have for Language Technologies, and (2) biases and fairness issues within the context of those communities. Here, we consider fairness, bias, and inclusion in Language Technologies through the lens of the Capabilities Approach [12]. The Capabilities Approach centers what people are capable of achieving, given their intersectional social, political, and economic contexts, instead of what resources are (theoretically) available to them. In the following sections, we detail the Capabilities Approach, its relationship to multilingual and multicultural evaluation, and how the framework affords meaningful collaboration with community members in defining and measuring harms of Language Technologies. The Capabilities Approach is a framework in development economics proposed by Amartya Sen in a series of articles published as far back as 1974 [1]. It has been applied to varied fields including environmental justice.