Do covariates explain why these groups differ? The choice of reference group can reverse conclusions in the Oaxaca-Blinder decomposition
Quintero, Manuel, Shreekumar, Advik, Stephenson, William T., Broderick, Tamara
Scientists often want to explain why an outcome is different in two groups. For instance, differences in patient mortality rates across two hospitals could be due to differences in the patients themselves (covariates) or differences in medical care (outcomes given covariates). The Oaxaca-Blinder decomposition (OBD) is a standard tool to tease apart these factors. It is well known that the OBD requires choosing one of the groups as a reference, and the numerical answer can vary with the reference. To the best of our knowledge, there has not been a systematic investigation into whether the choice of OBD reference can yield different substantive conclusions and how common this issue is. In the present paper, we give existence proofs in real and simulated data that different OBD reference choices can yield substantively different conclusions and that these differences are not entirely driven by model misspecification or small data. We prove that substantively different conclusions occur in up to half of the parameter space, but find these discrepancies rare in the real-data analyses we study. We explain this empirical rarity by examining how realistic data-generating processes can be biased towards parameters that do not change conclusions under the OBD.
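For orientation, the standard two-fold form of the decomposition makes the reference dependence concrete. The sketch below uses generic notation (groups A and B, covariate means, group-specific least-squares coefficients) rather than the paper's own:

```latex
% Two-fold Oaxaca-Blinder decomposition of the mean outcome gap between
% groups A and B, with group-specific linear fits \bar{Y}_g = \bar{X}_g'\hat{\beta}_g.
\begin{align*}
\bar{Y}_A - \bar{Y}_B
  &= \underbrace{(\bar{X}_A - \bar{X}_B)'\hat{\beta}_A}_{\text{explained (A as reference)}}
   + \underbrace{\bar{X}_B'\,(\hat{\beta}_A - \hat{\beta}_B)}_{\text{unexplained}} \\[4pt]
  &= \underbrace{(\bar{X}_A - \bar{X}_B)'\hat{\beta}_B}_{\text{explained (B as reference)}}
   + \underbrace{\bar{X}_A'\,(\hat{\beta}_A - \hat{\beta}_B)}_{\text{unexplained}} .
\end{align*}
```

Both lines sum to the same total gap, but they split it differently into explained and unexplained components, which is how switching the reference group can change which factor appears to drive the difference.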
- North America > Mexico > Oaxaca (0.26)
- North America > United States > Massachusetts > Middlesex County > Cambridge (0.14)
- North America > United States > Michigan (0.04)
- (7 more...)
- Information Technology > Artificial Intelligence (0.94)
- Information Technology > Data Science (0.88)
Penalized Fair Regression for Multiple Groups in Chronic Kidney Disease
Nakamoto, Carter H., Chen, Lucia Lushi, Foryciarz, Agata, Rose, Sherri
Fair regression methods have the potential to mitigate societal bias concerns in health care, but there has been little work on penalized fair regression when multiple groups experience such bias. We propose a general regression framework that addresses this gap with unfairness penalties for multiple groups. Our approach is demonstrated for binary outcomes with true positive rate disparity penalties. It can be efficiently implemented through reduction to a cost-sensitive classification problem. We additionally introduce novel score functions for automatically selecting penalty weights. Our penalized fair regression methods are empirically studied in simulations, where they achieve a fairness-accuracy frontier beyond that of existing comparison methods. Finally, we apply these methods to a national multi-site primary care study of chronic kidney disease to develop a fair classifier for end-stage renal disease. There we find substantial improvements in fairness for multiple race and ethnicity groups who experience societal bias in the health care system without any appreciable loss in overall fit.
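One way to read the multi-group unfairness penalties is as a penalized empirical-risk objective; the form below is a generic sketch under that reading, with the group set \(\mathcal{G}\) and per-group weights \(\lambda_g\) as illustrative notation rather than the paper's own:

```latex
% Prediction loss plus one true-positive-rate disparity penalty per
% group g, each weighted by its own lambda_g.
\[
\min_{f}\;\; \frac{1}{n}\sum_{i=1}^{n} \ell\big(y_i, f(x_i)\big)
\;+\; \sum_{g \in \mathcal{G}} \lambda_g\,
\Big|\, \widehat{\mathrm{TPR}}_g(f) - \widehat{\mathrm{TPR}}(f) \,\Big|
\]
```

Objectives of this shape are what typically admit a reduction to cost-sensitive classification, since the disparity terms can be folded into instance-level weights.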
- North America > United States > California > Santa Clara County > Stanford (0.04)
- North America > United States > California > Santa Clara County > Palo Alto (0.04)
- North America > United States > Alaska (0.04)
- (2 more...)
- Health & Medicine > Therapeutic Area > Nephrology (1.00)
- Health & Medicine > Therapeutic Area > Endocrinology > Diabetes (0.68)
Empowering Clinical Trial Design through AI: A Randomized Evaluation of PowerGPT
Lu, Yiwen, Li, Lu, Zhang, Dazheng, Jian, Xinyao, Wang, Tingyin, Chen, Siqi, Lei, Yuqing, Tong, Jiayi, Xi, Zhaohan, Chu, Haitao, Luo, Chongliang, Ogdie, Alexis, Athey, Brian, Turan, Alparslan, Abramoff, Michael, Cappelleri, Joseph C, Xu, Hua, Lu, Yun, Berlin, Jesse, Sessler, Daniel I., Asch, David A., Jiang, Xiaoqian, Chen, Yong
Sample size calculations for power analysis are critical for clinical research and trial design, yet their complexity and reliance on statistical expertise create barriers for many researchers. We introduce PowerGPT, an AI-powered system integrating large language models (LLMs) with statistical engines to automate test selection and sample size estimation in trial design. In a randomized trial to evaluate its effectiveness, PowerGPT significantly improved task completion rates (99.3% vs. 88.9% for test selection, 99.3% vs. 77.8% for sample size calculation) and accuracy (94.1% vs. 55.4% in sample size estimation, p < 0.001), while reducing average completion time (4.0 vs. 9.3 minutes, p < 0.001). These gains were consistent across various statistical tests, benefited statisticians and non-statisticians alike, and helped bridge expertise gaps. Already deployed across multiple institutions, PowerGPT represents a scalable AI-driven approach that enhances accessibility, efficiency, and accuracy in statistical power analysis for clinical research.
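For context, the kind of computation being automated can be sketched with an off-the-shelf power analysis; the snippet below is a generic illustration using statsmodels, not PowerGPT's interface, and the effect size, alpha, and power are made-up inputs:

```python
# Sketch: sample-size calculation for a two-arm trial with a continuous
# endpoint and a two-sample t-test, the kind of task PowerGPT automates.
from statsmodels.stats.power import TTestIndPower

analysis = TTestIndPower()
n_per_arm = analysis.solve_power(
    effect_size=0.5,   # assumed standardized mean difference (Cohen's d)
    alpha=0.05,        # two-sided type I error rate
    power=0.80,        # target statistical power
    ratio=1.0,         # equal allocation between arms
)
print(f"Required sample size per arm: {n_per_arm:.0f}")
```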
- North America > United States > Pennsylvania (0.30)
- North America > United States > Texas (0.28)
- North America > United States > Iowa (0.28)
- (3 more...)
- Research Report > New Finding (1.00)
- Research Report > Experimental Study (1.00)
Discrimination by LLMs: Cross-lingual Bias Assessment and Mitigation in Decision-Making and Summarisation
Huijzer, Willem, Chen, Jieying
The rapid integration of Large Language Models (LLMs) into various domains raises concerns about societal inequalities and information bias. This study examines biases in LLMs related to background, gender, and age, with a focus on their impact on decision-making and summarisation tasks. Additionally, the research examines the cross-lingual propagation of these biases and evaluates the effectiveness of prompt-instructed mitigation strategies. Using an adapted version of the dataset by Tamkin et al. (2023) translated into Dutch, we created 151,200 unique prompts for the decision task and 176,400 for the summarisation task. Various demographic variables, instructions, salience levels, and languages were tested on GPT-3.5 and GPT-4o. Our analysis revealed that both models were significantly biased during decision-making, favouring the female gender, younger ages, and certain backgrounds, such as African-American. In contrast, the summarisation task showed minimal evidence of bias, though significant age-related differences emerged for GPT-3.5 in English. Cross-lingual analysis showed that bias patterns were broadly similar between English and Dutch, though notable differences were observed across specific demographic categories. The newly proposed mitigation instructions, while unable to eliminate biases completely, demonstrated potential in reducing them. The most effective instruction achieved a 27% mean reduction in the gap between the most and least favourable demographics. Notably, in contrast to GPT-3.5, GPT-4o displayed reduced biases for all prompts in English, indicating particular potential for prompt-based mitigation in newer models. This research underscores the importance of cautious adoption of LLMs and context-specific bias testing, highlighting the need for continued development of effective mitigation strategies to ensure responsible deployment of AI.
- Europe (1.00)
- North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
- Research Report > Experimental Study (1.00)
- Research Report > New Finding (0.93)
- Government (1.00)
- Banking & Finance (0.93)
Trust in Disinformation Narratives: a Trust in the News Experiment
Song, Hanbyul, Silva, Miguel F. Santos, Suau, Jaume, Espinosa-Anke, Luis
Understanding why people trust or distrust one another, institutions, or information is a complex task that has led scholars from various fields of study to employ diverse epistemological and methodological approaches. Despite the challenges, it is generally agreed that the antecedents of trust (and distrust) encompass a multitude of emotional and cognitive factors, including a general disposition to trust and an assessment of trustworthiness factors. In an era marked by increasing political polarization, cultural backlash, widespread disinformation and fake news, and the use of AI software to produce news content, the need to study trust in the news has gained significant traction. This study presents the findings of a trust in the news experiment designed in collaboration with Spanish and UK journalists, fact-checkers, and the CardiffNLP Natural Language Processing research group. The purpose of this experiment, conducted in June 2023, was to examine the extent to which people trust a set of fake news articles based on previously identified disinformation narratives related to gender, climate change, and COVID-19. The online experiment participants (801 in Spain and 800 in the UK) were asked to read three fake news items and rate their level of trust on a scale from 1 (not true) to 8 (true). The pieces used a combination of factors, including stance (favourable, neutral, or against the narrative), presence of toxic expressions, clickbait titles, and sources of information to test which elements influenced people's responses the most. Half of the pieces were produced by humans and the other half by ChatGPT. The results show that the topic of news articles, stance, people's age, gender, and political ideologies significantly affected their levels of trust in the news, while authorship (human or ChatGPT) did not have a significant impact.
- Europe > Spain (0.25)
- Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
- Media > News (1.00)
- Health & Medicine > Therapeutic Area > Immunology (0.52)
- Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (0.35)
Finding Words Associated with DIF: Predicting Differential Item Functioning using LLMs and Explainable AI
We fine-tuned and compared several encoder-based Transformer large language models (LLM) to predict differential item functioning (DIF) from the item text. We then applied explainable artificial intelligence (XAI) methods to these models to identify specific words associated with DIF. The data included 42,180 items designed for English language arts and mathematics summative state assessments among students in grades 3 to 11. Prediction $R^2$ ranged from .04 to .32 among eight focal and reference group pairs. Our findings suggest that many words associated with DIF reflect minor sub-domains included in the test blueprint by design, rather than construct-irrelevant item content that should be removed from assessments. This may explain why qualitative reviews of DIF items often yield confusing or inconclusive results. Our approach can be used to screen words associated with DIF during the item-writing process for immediate revision, or help review traditional DIF analysis results by highlighting key words in the text. Extensions of this research can enhance the fairness of assessment programs, especially those that lack resources to build high-quality items, and among smaller subpopulations where we do not have sufficient sample sizes for traditional DIF analyses.
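A minimal sketch of the modeling setup described (an encoder fine-tuned to predict a continuous DIF statistic from item text) is given below; the model choice, toy data, and training settings are illustrative assumptions, not the authors' configuration:

```python
# Sketch: fine-tune an encoder model to regress a DIF statistic from item text.
from datasets import Dataset
from transformers import (AutoTokenizer, AutoModelForSequenceClassification,
                          DataCollatorWithPadding, Trainer, TrainingArguments)

items = Dataset.from_dict({
    "text": ["Sample reading passage item ...", "Sample math word problem ..."],
    "label": [0.12, -0.35],          # e.g., a DIF effect-size estimate per item
})

tok = AutoTokenizer.from_pretrained("roberta-base")
items = items.map(lambda b: tok(b["text"], truncation=True), batched=True)

model = AutoModelForSequenceClassification.from_pretrained(
    "roberta-base", num_labels=1, problem_type="regression")

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="dif-model", num_train_epochs=3),
    train_dataset=items,
    data_collator=DataCollatorWithPadding(tok),
)
trainer.train()
# Token-level attribution (e.g., SHAP or integrated gradients) over the
# fine-tuned model can then surface words associated with predicted DIF.
```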
Navigating Fairness in Radiology AI: Concepts, Consequences, and Crucial Considerations
Venugopal, Vasantha Kumar, Gupta, Abhishek, Takhar, Rohit, Yee, Charlene Liew Jin, Jones, Catherine, Szarf, Gilberto
Artificial Intelligence (AI) has significantly revolutionized radiology, promising improved patient outcomes and streamlined processes. However, it's critical to ensure the fairness of AI models to prevent stealthy bias and disparities from leading to unequal outcomes. This review discusses the concept of fairness in AI, focusing on bias auditing using the Aequitas toolkit, and its real-world implications in radiology, particularly in disease screening scenarios. Aequitas, an open-source bias audit toolkit, scrutinizes AI models' decisions, identifying hidden biases that may result in disparities across different demographic groups and imaging equipment brands. This toolkit operates on statistical theories, analyzing a large dataset to reveal a model's fairness. It excels in its versatility to handle various variables simultaneously, especially in a field as diverse as radiology. The review explicates essential fairness metrics: Equal and Proportional Parity, False Positive Rate Parity, False Discovery Rate Parity, False Negative Rate Parity, and False Omission Rate Parity. Each metric serves unique purposes and offers different insights. We present hypothetical scenarios to demonstrate their relevance in disease screening settings, and how disparities can lead to significant real-world impacts.
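The parity metrics listed are standard group-level error-rate comparisons against a reference group; for concreteness, one common way to state them (the notation and tolerance band are mine, not the review's):

```latex
% Confusion-matrix rates for group g, compared against a reference group r.
% A disparity ratio far from 1 (e.g., outside a band such as [0.8, 1.25])
% flags a potential fairness violation.
\begin{align*}
\mathrm{FPR}_g &= \frac{FP_g}{FP_g + TN_g}, &
\mathrm{FDR}_g &= \frac{FP_g}{FP_g + TP_g}, \\
\mathrm{FNR}_g &= \frac{FN_g}{FN_g + TP_g}, &
\mathrm{FOR}_g &= \frac{FN_g}{FN_g + TN_g}, \\
\text{disparity}_g &= \mathrm{FPR}_g / \mathrm{FPR}_r
  \quad \text{(and analogously for FDR, FNR, and FOR).}
\end{align*}
```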
- Oceania > Australia (0.14)
- Europe > United Kingdom (0.14)
- Asia > China (0.05)
- (7 more...)
- Health & Medicine > Nuclear Medicine (1.00)
- Health & Medicine > Diagnostic Medicine > Imaging (1.00)
Auditing ICU Readmission Rates in a Clinical Database: An Analysis of Risk Factors and Clinical Outcomes
This study presents a machine learning (ML) pipeline for clinical data classification in the context of a 30-day readmission problem, along with a fairness audit on subgroups based on sensitive attributes. A range of ML models are used for classification, and the fairness audit is conducted on the model predictions. The fairness audit uncovers disparities in equal opportunity, predictive parity, false positive rate parity, and false negative rate parity criteria on the MIMIC-III dataset based on attributes such as gender, ethnicity, language, and insurance group. The results identify disparities in the model's performance across different groups and highlight the need for better fairness and bias mitigation strategies. The study suggests the need for collaborative efforts among researchers, policymakers, and practitioners to address bias and fairness in artificial intelligence (AI) systems.
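A minimal sketch of such a subgroup audit, computed directly from model predictions with scikit-learn, is shown below; the column names, threshold, and chosen metrics are assumptions for illustration, not the study's pipeline:

```python
# Sketch: per-group confusion-matrix rates for auditing a binary
# 30-day readmission classifier across subgroups.
import pandas as pd
from sklearn.metrics import confusion_matrix

def group_rates(df, group_col, y_col="readmit_30d", score_col="score", thr=0.5):
    rows = []
    for g, sub in df.groupby(group_col):
        y_pred = (sub[score_col] >= thr).astype(int)
        tn, fp, fn, tp = confusion_matrix(sub[y_col], y_pred, labels=[0, 1]).ravel()
        rows.append({
            group_col: g,
            "TPR": tp / (tp + fn) if (tp + fn) else float("nan"),  # equal opportunity
            "PPV": tp / (tp + fp) if (tp + fp) else float("nan"),  # predictive parity
            "FPR": fp / (fp + tn) if (fp + tn) else float("nan"),
            "FNR": fn / (fn + tp) if (fn + tp) else float("nan"),
        })
    return pd.DataFrame(rows)

# Example usage: group_rates(predictions_df, group_col="ethnicity")
```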
- North America > United States (0.29)
- North America > Canada > Ontario > Toronto (0.04)
- Health & Medicine > Therapeutic Area (1.00)
- Health & Medicine > Health Care Providers & Services (1.00)
- Government (1.00)
- (3 more...)
Fairness implications of encoding protected categorical attributes
Mougan, Carlos, Alvarez, Jose M., Patro, Gourab K, Ruggieri, Salvatore, Staab, Steffen
Protected attributes are often presented as categorical features that need to be encoded before feeding them into a machine learning algorithm. Encoding these attributes is paramount, as it determines the way the algorithm will learn from the data. Categorical feature encoding has a direct impact on model performance and fairness. In this work, we compare the accuracy and fairness implications of the two most well-known encoders: one-hot encoding and target encoding. We distinguish between two types of induced bias that can arise while using these encodings and can lead to unfair models. The first type, irreducible bias, is due to direct group category discrimination, and the second type, reducible bias, is due to large variance in less statistically represented groups. We take a deeper look into how regularization methods for target encoding can mitigate the induced bias when encoding categorical features. Furthermore, we tackle the problem of intersectional fairness that arises when combining two protected categorical features, leading to higher cardinality. This practice is a powerful feature engineering technique used for boosting model performance. We study its implications for fairness, as it can increase both types of induced bias.
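A minimal sketch contrasting the two encoders on a protected categorical column is given below; the additive-smoothing form is one common regularization for target encoding and is an assumption here, not necessarily the paper's exact variant:

```python
# Sketch: one-hot vs. smoothed target encoding of a protected attribute.
import pandas as pd

df = pd.DataFrame({
    "group": ["a", "a", "b", "b", "b", "c"],  # protected categorical feature
    "y":     [1,   0,   1,   1,   0,   1],    # binary outcome
})

# One-hot encoding: one indicator column per category.
one_hot = pd.get_dummies(df["group"], prefix="group")

# Target encoding with additive smoothing toward the global mean: rare
# categories are pulled toward the prior, which is the regularization
# lever for reducing the variance-driven (reducible) bias.
prior, m = df["y"].mean(), 2.0                 # m: smoothing strength (assumed)
stats = df.groupby("group")["y"].agg(["mean", "count"])
encoding = (stats["count"] * stats["mean"] + m * prior) / (stats["count"] + m)
df["group_te"] = df["group"].map(encoding)

print(one_hot.join(df[["group", "group_te", "y"]]))
```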
- North America > United States > California > San Francisco County > San Francisco (0.14)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)
- Europe > United Kingdom > England > Hampshire > Southampton (0.04)
- (18 more...)
- Law (1.00)
- Government > Regional Government (0.67)
- Education > Curriculum > Subject-Specific Education (0.48)
- Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (0.68)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.47)
- Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.46)
Years Lived Alone and/or Serial Break-Ups Strongly Linked to Inflammation in Men - Neuroscience News
Summary: Men who spend several years living alone or experience serial relationship breakups are at increased risk of inflammation, a new study reports. Living alone for several years and/or experiencing serial relationship break-ups are strongly linked to raised levels of inflammatory markers in the blood, but only in men, finds a large population study published online in the Journal of Epidemiology & Community Health. Although the inflammation was classified as low grade, it was persistent, and most likely indicates a heightened risk of age-related ill health and death, suggest the researchers. Divorce and committed relationship break-ups, which are often followed by a potentially lengthy period of living alone, have been associated with a heightened risk of poor physical and mental health, lowered immunity, and death. But most previously published studies have focused on the impact of one partnership dissolution, and then usually only on marital break-ups.