potential bias
Identifying Bias in Machine-generated Text Detection
Stowe, Kevin, Afanaseva, Svetlana, Raimundo, Rodolfo, Sun, Yitao, Patil, Kailash
The meteoric rise in text generation capability has been accompanied by parallel growth in interest in machine-generated text detection: the task of identifying whether a given text was generated by a model or written by a person. While detection models show strong performance, they have the capacity to cause significant negative impacts. We explore potential biases in English machine-generated text detection systems. We curate a dataset of student essays and assess 16 different detection systems for bias across four attributes: gender, race/ethnicity, English-language learner (ELL) status, and economic status. We evaluate these attributes using regression-based models to determine the significance and power of the effects, and we perform subgroup analyses. We find that while biases are generally inconsistent across systems, several key issues emerge: some models tend to classify disadvantaged groups as machine-generated, ELL essays are more likely to be classified as machine-generated, economically disadvantaged students' essays are less likely to be classified as machine-generated, and non-White ELL essays are disproportionately classified as machine-generated relative to their White counterparts. Finally, we perform human annotation and find that while humans perform poorly at the detection task overall, they show no significant biases on the studied attributes.
- Asia > Middle East > UAE > Abu Dhabi Emirate > Abu Dhabi (0.14)
- Asia > Thailand > Bangkok > Bangkok (0.04)
- North America > United States > New Mexico > Bernalillo County > Albuquerque (0.04)
- (4 more...)
- Research Report > New Finding (1.00)
- Research Report > Experimental Study (1.00)
- Overview (0.93)
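As a concrete illustration of the regression-based analysis this abstract describes, the sketch below fits a logistic regression of a detector's binary decision on demographic attributes and reads bias off the coefficient significance. All data, column names, and effect sizes are synthetic stand-ins, not the paper's dataset or model specification.

```python
# Hedged sketch of a regression-based bias analysis: regress a detector's
# decision on demographic attributes and inspect coefficient significance.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(0)
n = 500
df = pd.DataFrame({
    "ell_status": rng.integers(0, 2, n),          # English-language learner
    "econ_disadvantaged": rng.integers(0, 2, n),
    "gender": rng.choice(["F", "M"], n),
})
# Simulate a detector that over-flags ELL essays (an illustrative effect,
# chosen to mirror the finding reported in the abstract).
logit = -0.5 + 0.8 * df["ell_status"] - 0.4 * df["econ_disadvantaged"]
df["flagged_as_machine"] = (rng.random(n) < 1 / (1 + np.exp(-logit))).astype(int)

# A significant positive coefficient on ell_status would indicate ELL essays
# are more likely to be classified as machine-generated.
model = smf.logit(
    "flagged_as_machine ~ C(gender) + ell_status + econ_disadvantaged", data=df
).fit(disp=0)
print(model.summary())
```

Subgroup analysis would then repeat the comparison within slices (for example, ELL essays split by race/ethnicity) rather than across the pooled sample.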
Evaluating the Bias in LLMs for Surveying Opinion and Decision Making in Healthcare
Khaokaew, Yonchanok, Salim, Flora D., Züfle, Andreas, Xue, Hao, Anderson, Taylor, MacIntyre, C. Raina, Scotch, Matthew, Heslop, David J.
Generative agents have been increasingly used to simulate human behaviour in silico, driven by large language models (LLMs). These simulacra serve as sandboxes for studying human behaviour without compromising privacy or safety. However, it remains unclear whether such agents can truly represent real individuals. This work compares survey data from the Understanding America Study (UAS) on healthcare decision-making with simulated responses from generative agents. Using demographic-based prompt engineering, we create digital twins of survey respondents and analyse how well different LLMs reproduce real-world behaviours. Our findings show that some LLMs fail to reflect realistic decision-making, such as predicting universal vaccine acceptance. However, Llama 3 captures variations across race and income more accurately, but also introduces biases not present in the UAS data. This study highlights the potential of generative agents for behavioural research while underscoring the risks of bias from both LLMs and prompting strategies.
- North America > United States > California (0.14)
- Oceania > Australia > New South Wales (0.04)
- North America > United States > Arizona (0.04)
- Asia > China (0.04)
- Research Report > New Finding (1.00)
- Questionnaire & Opinion Survey (1.00)
- Health & Medicine > Therapeutic Area > Vaccines (1.00)
- Health & Medicine > Therapeutic Area > Immunology (1.00)
- Government > Regional Government > North America Government > United States Government (0.68)
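The demographic-based prompt engineering described in the abstract above might look roughly like the sketch below: a persona prompt assembled from survey demographics, paired with a survey item. The template, fields, and answer format are illustrative assumptions; the paper's actual prompts and LLM interface are not given in the abstract.

```python
# Illustrative sketch of demographic-based prompt engineering for "digital
# twin" survey agents. The persona template and fields are assumptions.
PERSONA_TEMPLATE = (
    "You are answering as a survey respondent with this profile:\n"
    "Age: {age}. Gender: {gender}. Race/ethnicity: {race}. "
    "Household income: {income}.\n"
    "Answer exactly as this person would."
)

def build_messages(profile: dict, question: str) -> list[dict]:
    """Compose a chat-style prompt pairing a persona with a survey item."""
    return [
        {"role": "system", "content": PERSONA_TEMPLATE.format(**profile)},
        {"role": "user", "content": question},
    ]

respondent = {"age": 52, "gender": "female", "race": "Black", "income": "$40k-60k"}
question = "Would you accept a newly approved vaccine? Answer Yes or No."
messages = build_messages(respondent, question)
# These messages would be sent to each LLM under study (e.g., Llama 3), and
# the distribution of simulated answers compared with the real UAS responses.
print(messages)
```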
Value Profiles for Encoding Human Variation
Sorensen, Taylor, Mishra, Pushkar, Patel, Roma, Tessler, Michael Henry, Bakker, Michiel, Evans, Georgina, Gabriel, Iason, Goodman, Noah, Rieser, Verena
Modelling human variation in rating tasks is crucial for enabling AI systems for personalization, pluralistic model alignment, and computational social science. We propose representing individuals using value profiles -- natural language descriptions of underlying values compressed from in-context demonstrations -- along with a steerable decoder model to estimate ratings conditioned on a value profile or other rater information. To measure the predictive information in rater representations, we introduce an information-theoretic methodology. We find that demonstrations contain the most information, followed by value profiles and then demographics. However, value profiles offer advantages in terms of scrutability, interpretability, and steerability due to their compressed natural language format. Value profiles effectively compress the useful information from demonstrations (>70% information preservation). Furthermore, clustering value profiles to identify similarly behaving individuals better explains rater variation than the most predictive demographic groupings. Going beyond test set performance, we show that the decoder models interpretably change ratings according to semantic profile differences, are well-calibrated, and can help explain instance-level disagreement by simulating an annotator population. These results demonstrate that value profiles offer novel, predictive ways to describe individual variation beyond demographics or group information.
- North America > United States > Washington > King County > Seattle (0.14)
- Asia > Middle East > Jordan (0.04)
- North America > United States > New York > New York County > New York City (0.04)
- (12 more...)
- Social Sector (1.00)
- Law > Statutes (1.00)
- Law > Criminal Law (1.00)
- (12 more...)
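A minimal sketch of the two-stage value-profile idea described in the abstract above: compress a rater's in-context demonstrations into a short natural-language profile, then condition a decoder on that profile to predict new ratings. `call_llm`, the prompts, and the 1-5 scale are hypothetical stand-ins, not the paper's models.

```python
# Two-stage value-profile sketch; `call_llm` is a hypothetical stand-in
# for whatever compressor/decoder models are actually used.
def call_llm(prompt: str) -> str:
    raise NotImplementedError("plug in a compressor/decoder model here")

demonstrations = [
    ("Comment praising strict content moderation", 5),
    ("Comment mocking a public figure", 1),
]

def compress_to_profile(demos) -> str:
    """Stage 1: in-context demonstrations -> compressed value profile."""
    shown = "\n".join(f"item: {text!r} -> rating: {r}" for text, r in demos)
    return call_llm(
        "In 2-3 sentences, describe the underlying values of a rater who "
        f"gave these ratings:\n{shown}"
    )

def decode_rating(profile: str, item: str) -> str:
    """Stage 2: (value profile, new item) -> predicted rating."""
    return call_llm(
        f"Rater values: {profile}\n"
        f"Predict this rater's 1-5 rating for: {item!r}. Answer with one digit."
    )
```

The compression step is what buys the scrutability the abstract mentions: the profile is readable text, so a reviewer can inspect or edit what the decoder conditions on.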
Enhancing Phishing Detection through Feature Importance Analysis and Explainable AI: A Comparative Study of CatBoost, XGBoost, and EBM Models
Fajar, Abdullah, Yazid, Setiadi, Budi, Indra
Phishing attacks remain a persistent threat to online security, demanding robust detection methods. This study investigates the use of machine learning to identify phishing URLs, emphasizing the crucial role of feature selection and model interpretability for improved performance. Employing Recursive Feature Elimination, the research pinpointed key features like "length_url," "time_domain_activation," and "Page_rank" as strong indicators of phishing attempts. The study evaluated various algorithms, including CatBoost, XGBoost, and Explainable Boosting Machine, assessing their robustness and scalability. XGBoost emerged as highly efficient in terms of runtime, making it well-suited for large datasets. CatBoost, on the other hand, demonstrated resilience by maintaining high accuracy even with reduced features. To enhance transparency and trustworthiness, Explainable AI techniques, such as SHAP, were employed to provide insights into feature importance. The study's findings highlight that effective feature selection and model interpretability can significantly bolster phishing detection systems, paving the way for more efficient and adaptable defenses against evolving cyber threats.
- Information Technology > Artificial Intelligence > Machine Learning > Ensemble Learning (0.97)
- Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.95)
- Information Technology > Artificial Intelligence > Issues > Social & Ethical Issues (0.85)
- (3 more...)
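The pipeline named in the abstract above (Recursive Feature Elimination for feature selection, a gradient-boosted classifier, SHAP for interpretability) might look roughly like the sketch below. The data is synthetic; only the three quoted feature names come from the abstract, and everything else is an assumption.

```python
# Rough sketch: RFE feature selection -> XGBoost -> SHAP importances.
import numpy as np
import pandas as pd
import shap
from sklearn.feature_selection import RFE
from xgboost import XGBClassifier

rng = np.random.default_rng(0)
X = pd.DataFrame({
    "length_url": rng.integers(10, 200, 1000),
    "time_domain_activation": rng.integers(0, 5000, 1000),
    "Page_rank": rng.random(1000),
    "nb_dots": rng.integers(0, 10, 1000),
})
y = (X["length_url"] > 100).astype(int)  # toy label rule, not real phishing data

# RFE ranks features by repeatedly dropping the least important one.
selector = RFE(XGBClassifier(n_estimators=50), n_features_to_select=3).fit(X, y)
selected = X.columns[selector.support_]
model = XGBClassifier(n_estimators=50).fit(X[selected], y)

# SHAP exposes per-feature contributions to each prediction.
shap_values = shap.TreeExplainer(model).shap_values(X[selected])
print(dict(zip(selected, np.abs(shap_values).mean(axis=0))))
```

CatBoost or an Explainable Boosting Machine would slot into the same pipeline in place of XGBClassifier.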
Revealing Hidden Bias in AI: Lessons from Large Language Models
Beatty, Django, Masanthia, Kritsada, Kaphol, Teepakorn, Sethi, Niphan
As large language models (LLMs) become integral to recruitment processes, concerns about AI-induced bias have intensified. This study examines biases in candidate interview reports generated by Claude 3.5 Sonnet, GPT-4o, Gemini 1.5, and Llama 3.1 405B, focusing on characteristics such as gender, race, and age. We evaluate the effectiveness of LLM-based anonymization in reducing these biases. Findings indicate that while anonymization reduces certain biases, particularly gender bias, the degree of effectiveness varies across models and bias types. Notably, Llama 3.1 405B exhibited the lowest overall bias. Moreover, our methodology of comparing anonymized and non-anonymized data reveals a novel approach to assessing inherent biases in LLMs beyond recruitment applications. This study underscores the importance of careful LLM selection and suggests best practices for minimizing bias in AI applications, promoting fairness and inclusivity.
- North America > United States > California (0.04)
- Asia > Thailand (0.04)
- Asia > Bangladesh (0.04)
- Research Report > New Finding (1.00)
- Research Report > Experimental Study (1.00)
- Law (1.00)
- Information Technology > Security & Privacy (1.00)
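The anonymized-versus-non-anonymized comparison described in the abstract above could be sketched as follows: redact demographic cues from a candidate report, score both versions with each LLM, and treat systematic score gaps as evidence of bias. `call_llm`, the redaction rules, and the scoring prompt are all hypothetical.

```python
# Hypothetical sketch of the anonymization-based bias probe.
import re

def call_llm(prompt: str) -> str:
    raise NotImplementedError("plug in Claude / GPT-4o / Gemini / Llama here")

def anonymize(report: str) -> str:
    """Naive redaction of gendered pronouns and stated ages."""
    report = re.sub(r"\b(he|she|him|his|her|hers)\b", "they", report, flags=re.I)
    return re.sub(r"\b\d{2}[- ]?years?[- ]?old\b", "[age redacted]", report, flags=re.I)

def score(report: str) -> str:
    return call_llm(f"Rate this candidate 1-10 for the role. Report:\n{report}")

# Bias probe: averaged over many candidates, a consistent difference between
# score(report) and score(anonymize(report)) suggests the model is keying on
# the demographic cues themselves rather than on qualifications.
```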
Say My Name: a Model's Bias Discovery Framework
Ciranni, Massimiliano, Molinaro, Luca, Barbano, Carlo Alberto, Fiandrotti, Attilio, Murino, Vittorio, Pastore, Vito Paolo, Tartaglione, Enzo
In the last few years, the broad applicability of deep learning to downstream tasks and its end-to-end training capabilities have raised growing concerns about potential biases toward specific, non-representative patterns. Many works on unsupervised debiasing leverage the tendency of deep models to learn "easier" samples, for example by clustering the latent space to obtain bias pseudo-labels. However, interpreting such pseudo-labels is not trivial, especially for a non-expert end user, as they do not provide semantic information about the bias features. To address this issue, we introduce "Say My Name" (SaMyNa), the first tool to identify biases within deep models semantically. Unlike existing methods, our approach focuses on biases actually learned by the model. Our text-based pipeline enhances explainability and supports debiasing efforts: applicable during either training or post-hoc validation, it can disentangle task-related information and serves as a tool for analyzing biases. Evaluation on traditional benchmarks demonstrates its effectiveness both in detecting biases and in ruling them out, showcasing its broad applicability for model diagnosis.
- North America > United States > New Mexico > Bernalillo County > Albuquerque (0.04)
- Europe > Italy > Piedmont > Turin Province > Turin (0.04)
- Europe > Italy > Liguria > Genoa (0.04)
- Europe > France > Île-de-France > Paris > Paris (0.04)
- Information Technology > Security & Privacy (0.68)
- Law (0.68)
- Transportation > Ground > Road (0.46)
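For context, the baseline the abstract above contrasts with, clustering a model's latent space to obtain bias pseudo-labels, looks roughly like the sketch below; SaMyNa's own text-based pipeline is not detailed in the abstract, so this only illustrates the approach it improves upon.

```python
# Clustering-based bias pseudo-labelling (the baseline SaMyNa critiques).
import numpy as np
from sklearn.cluster import KMeans

latents = np.random.default_rng(0).normal(size=(1000, 128))  # stand-in embeddings
pseudo_labels = KMeans(n_clusters=2, n_init=10).fit_predict(latents)
# The resulting integer labels carry no semantics -- precisely the gap SaMyNa
# targets by producing natural-language descriptions of the bias instead.
print(np.bincount(pseudo_labels))
```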
ImagiNet: A Multi-Content Dataset for Generalizable Synthetic Image Detection via Contrastive Learning
Boychev, Delyan, Cholakov, Radostin
Generative models, such as diffusion models (DMs), variational autoencoders (VAEs), and generative adversarial networks (GANs), produce images with a level of authenticity that makes them nearly indistinguishable from real photos and artwork. While this capability is beneficial for many industries, the difficulty of identifying synthetic images leaves online media platforms vulnerable to impersonation and misinformation attempts. To support the development of defensive methods, we introduce ImagiNet, a high-resolution and balanced dataset for synthetic image detection, designed to mitigate potential biases in existing resources. It contains 200K examples, spanning four content categories: photos, paintings, faces, and uncategorized. Synthetic images are produced with open-source and proprietary generators, whereas real counterparts of the same content type are collected from public datasets. The structure of ImagiNet allows for a two-track evaluation system: i) classification as real or synthetic and ii) identification of the generative model. To establish a baseline, we train a ResNet-50 model using a self-supervised contrastive objective (SelfCon) for each track. The model demonstrates state-of-the-art performance and high inference speed across established benchmarks, achieving an AUC of up to 0.99 and balanced accuracy ranging from 86% to 95%, even under social network conditions that involve compression and resizing.
- North America > United States > California > Santa Clara County > Palo Alto (0.04)
- Europe > Bulgaria > Plovdiv Province > Plovdiv (0.04)
- Asia > Japan > Honshū > Chūbu > Nagano Prefecture > Nagano (0.04)
- Information Technology > Services (0.34)
- Media > News (0.34)
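The baseline described in the abstract above (a ResNet-50 trained with a contrastive objective per track) might be set up roughly as below. The paper uses the SelfCon objective, which has its own formulation; this simplified supervised-contrastive loss and the toy batch only illustrate the general shape of such training.

```python
# Simplified supervised-contrastive training sketch for real-vs-synthetic
# detection. NOTE: the paper uses SelfCon, a different objective; this
# SupCon-style loss and the toy batch are illustrative assumptions only.
import torch
import torch.nn.functional as F
from torchvision.models import resnet50

def supcon_loss(features, labels, temperature=0.1):
    """Pull same-label embeddings together, push different-label ones apart."""
    z = F.normalize(features, dim=1)
    sim = z @ z.T / temperature
    pos_mask = (labels[:, None] == labels[None, :]).float()
    pos_mask.fill_diagonal_(0)                     # self-pairs are not positives
    logits = sim - sim.max(dim=1, keepdim=True).values.detach()   # stability
    exp = torch.exp(logits) * (1 - torch.eye(len(z)))  # drop self from denominator
    log_prob = logits - torch.log(exp.sum(dim=1, keepdim=True))
    return -(pos_mask * log_prob).sum(1).div(pos_mask.sum(1).clamp(min=1)).mean()

encoder = resnet50(num_classes=128)       # final layer reused as a projection head
images = torch.randn(16, 3, 224, 224)     # toy batch
labels = torch.randint(0, 2, (16,))       # 0 = real, 1 = synthetic
loss = supcon_loss(encoder(images), labels)
loss.backward()
```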
Reducing Biases towards Minoritized Populations in Medical Curricular Content via Artificial Intelligence for Fairer Health Outcomes
Salavati, Chiman, Song, Shannon, Diaz, Willmar Sosa, Hale, Scott A., Montenegro, Roberto E., Murai, Fabricio, Dori-Hacohen, Shiri
Biased information (recently termed bisinformation) continues to be taught in medical curricula, often long after it has been debunked. In this paper, we introduce BRICC, a first-in-class initiative that seeks to mitigate medical bisinformation using machine learning to systematically identify and flag text with potential biases for subsequent review in an expert-in-the-loop fashion, greatly accelerating an otherwise labor-intensive process. A gold-standard BRICC dataset was developed over several years and contains over 12K pages of instructional materials. Medical experts meticulously annotated these documents for bias according to comprehensive coding guidelines, covering gender, sex, age, geography, ethnicity, and race. Using this labeled dataset, we trained, validated, and tested medical bias classifiers with three approaches: binary classifiers (both type-specific and a general bias classifier); an ensemble combining independently trained, bias type-specific classifiers; and a multitask learning (MTL) model tasked with predicting both general and type-specific biases. While MTL led to some improvement on race bias detection in terms of F1-score, it did not outperform binary classifiers trained specifically on each task. On general bias detection, the binary classifier achieves an AUC of up to 0.923, a 27.8% improvement over the baseline. This work lays the foundations for debiasing medical curricula by exploring a novel dataset and evaluating different model training strategies, offering new pathways for more nuanced and effective mitigation of bisinformation.
- Europe > Montenegro (0.04)
- South America > Brazil (0.04)
- North America > United States > North Carolina (0.04)
- (2 more...)
- Instructional Material > Course Syllabus & Notes (0.48)
- Instructional Material > Online (0.34)
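The multitask variant described in the abstract above, which predicts general and type-specific biases jointly, might be structured like the sketch below: a shared encoder with two heads. The encoder, dimensions, and heads are illustrative assumptions; the abstract does not specify the architecture.

```python
# Hedged sketch of a multitask bias classifier: shared encoder, one head for
# general bias, one for the six annotated bias types.
import torch
import torch.nn as nn

class MTLBiasClassifier(nn.Module):
    def __init__(self, vocab_size=30522, dim=256, n_bias_types=6):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, dim)  # stand-in; a pretrained LM is more likely
        self.lstm = nn.LSTM(dim, dim, batch_first=True)
        self.general_head = nn.Linear(dim, 1)           # biased vs. not biased
        self.type_head = nn.Linear(dim, n_bias_types)   # gender, sex, age, geography, ethnicity, race

    def forward(self, token_ids):
        _, (h, _) = self.lstm(self.embed(token_ids))
        pooled = h[-1]                                  # last hidden state as document vector
        return self.general_head(pooled), self.type_head(pooled)

model = MTLBiasClassifier()
general_logit, type_logits = model(torch.randint(0, 30522, (4, 64)))
# Training would sum a binary cross-entropy loss on the general head with a
# multi-label BCE loss on the type head, letting the shared encoder trade
# off both tasks at once.
```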
Battling Bias: AI's Fight for Fairness
As artificial intelligence (AI) continues to play a significant role in various industries and aspects of daily life, the issue of bias in AI algorithms has become increasingly prevalent. Biased AI systems can perpetuate existing social inequalities and lead to unfair treatment, creating a critical need for addressing and mitigating discrimination in machine learning applications. Bias in AI can originate from several sources, such as biased training data, lack of diversity in AI development teams, and biased algorithms themselves. Training data is the foundation of any AI system, and if the data used to train an algorithm contains biases, those biases will be passed on to the AI system. For example, biased facial recognition systems have been found to misidentify people of color at a higher rate than white individuals, leading to wrongful arrests and other consequences.
- Law (0.73)
- Information Technology > Security & Privacy (0.32)
What is Machine Learning and Artificial Intelligence?
Machine Learning (ML) and Artificial Intelligence (AI) are two of the most popular and rapidly developing technologies used in many industries today. They have been around for decades, but their importance has grown exponentially in the last few years due to advances in technology and the increasing need for automation and data analysis. In this article, we will explore what ML and AI are, how they differ, and how they can be used in various fields. Machine Learning is a type of artificial intelligence that enables computer systems to learn from data and make decisions without being explicitly programmed.