AITopics | underrepresented group

Differential Privacy Has Disparate Impact on Model Accuracy

Eugene Bagdasaryan, Omid Poursaeed, Vitaly Shmatikov

Neural Information Processing Systems, Feb-15-2026, 07:41:02 GMT

Neural Information Processing Systems http://nips.cc/

accuracy, learning, underrepresented class, (16 more...)

Neural Information Processing Systems

Country: North America > Canada (0.04)

Industry: Information Technology > Security & Privacy (0.68)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.96)
Information Technology > Sensing and Signal Processing > Image Processing (0.94)
(2 more...)

Differential Privacy Has Disparate Impact on Model Accuracy

Eugene Bagdasaryan, Omid Poursaeed, Vitaly Shmatikov

Neural Information Processing Systems, Aug-20-2025, 10:51:10 GMT

The cost of differential privacy is a reduction in the model's accuracy.

accuracy, learning, underrepresented class, (16 more...)

Neural Information Processing Systems

Country: North America > Canada (0.04)

Industry: Information Technology > Security & Privacy (0.88)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.96)
Information Technology > Sensing and Signal Processing > Image Processing (0.94)
(2 more...)

Emergence of Hierarchical Emotion Organization in Large Language Models

Zhao, Bo, Okawa, Maya, Bigelow, Eric J., Yu, Rose, Ullman, Tomer, Lubana, Ekdeep Singh, Tanaka, Hidenori

arXiv.org Artificial Intelligence, Jul-16-2025

As large language models (LLMs) increasingly power conversational agents, understanding how they model users' emotional states is critical for ethical deployment. Inspired by emotion wheels -- a psychological framework that argues emotions organize hierarchically -- we analyze probabilistic dependencies between emotional states in model outputs. We find that LLMs naturally form hierarchical emotion trees that align with human psychological models, and larger models develop more complex hierarchies. We also uncover systematic biases in emotion recognition across socioeconomic personas, with compounding misclassifications for intersectional, underrepresented groups. Human studies reveal striking parallels, suggesting that LLMs internalize aspects of social perception. Beyond highlighting emergent emotional reasoning in LLMs, our results hint at the potential of using cognitively-grounded theories for developing better model evaluations.

large language model, machine learning, natural language, (19 more...)

arXiv.org Artificial Intelligence

2507.10599

Country:

North America > United States > California (0.28)
North America > Canada (0.28)

Genre: Research Report > New Finding (1.00)

Industry:

Health & Medicine > Therapeutic Area > Psychiatry/Psychology (0.68)
Education > Educational Setting > Higher Education (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.31)

Improving Equity in Health Modeling with GPT4-Turbo Generated Synthetic Data: A Comparative Study

Smolyak, Daniel, Welivita, Arshana, Bjarnadóttir, Margrét V., Agarwal, Ritu

arXiv.org Artificial Intelligence, Dec-20-2024

Objective. Demographic groups are often represented at different rates in medical datasets. These differences can create bias in machine learning algorithms, with higher levels of performance for better-represented groups. One promising solution to this problem is to generate synthetic data to mitigate potential adverse effects of non-representative datasets. Methods. We build on recent advances in LLM-based synthetic data generation to create a pipeline where the synthetic data is generated separately for each demographic group. We conduct our study using MIMIC-IV and Framingham "Offspring and OMNI-1 Cohorts" datasets. We prompt GPT4-Turbo to create group-specific data, providing training examples and the dataset context. An exploratory analysis is conducted to ascertain the quality of the generated data. We then evaluate the utility of the synthetic data for augmentation of a training dataset in a downstream machine learning task, focusing specifically on model performance metrics across groups. Results. The performance of GPT4-Turbo augmentation is generally, but not always, superior. In the majority of experiments our method outperforms standard modeling baselines; however, prompting GPT4-Turbo to produce data specific to a group provides little to no additional benefit over a prompt that does not specify the group. Conclusion. We developed a method for using LLMs out-of-the-box to synthesize group-specific data to address imbalances in demographic representation in medical datasets. As another "tool in the toolbox", this method can improve model fairness and thus health equity. More research is needed to understand the conditions under which LLM-generated synthetic data is useful for non-representative medical datasets.
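As a rough illustration of the per-group augmentation idea in this abstract, the sketch below computes how many synthetic rows each demographic group would need to match the size of the largest group. The function name `synthetic_quota` and the "top up to the largest group" rule are assumptions made for this example only; the abstract does not state how many synthetic records the authors generate per group.

```python
from collections import Counter


def synthetic_quota(group_labels):
    """Return, per group, how many synthetic rows would equalize group sizes.

    Hypothetical helper: the balancing target (size of the largest group)
    is an assumption for illustration, not the authors' stated procedure.
    """
    counts = Counter(group_labels)
    if not counts:
        return {}
    target = max(counts.values())  # size of the best-represented group
    return {group: target - n for group, n in counts.items()}


if __name__ == "__main__":
    labels = ["a"] * 5 + ["b"] * 2 + ["c"] * 3
    print(synthetic_quota(labels))
```

A quota of zero means the group is already the largest; the remaining quotas could then drive one generation request per group.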

large language model, machine learning, natural language, (22 more...)

arXiv.org Artificial Intelligence

2412.16335

Country: North America > United States > Maryland (0.46)

Genre:

Research Report > New Finding (0.68)
Research Report > Experimental Study (0.46)

Industry:

Health & Medicine > Therapeutic Area > Oncology (1.00)
Health & Medicine > Therapeutic Area > Cardiology/Vascular Diseases (1.00)
Health & Medicine > Health Care Technology (0.93)
(2 more...)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Debiasing Cardiac Imaging with Controlled Latent Diffusion Models

Skorupko, Grzegorz, Osuala, Richard, Szafranowska, Zuzanna, Kushibar, Kaisar, Aung, Nay, Petersen, Steffen E, Lekadir, Karim, Gkontra, Polyxeni

arXiv.org Artificial Intelligence, Mar-28-2024

Progress in deep learning solutions for disease diagnosis and prognosis based on cardiac magnetic resonance imaging is hindered by highly imbalanced and biased training data. To address this issue, we propose a method to alleviate imbalances inherent in datasets through the generation of synthetic data based on sensitive attributes such as sex, age, body mass index, and health condition. We adopt ControlNet based on a denoising diffusion probabilistic model to condition on text assembled from patient metadata and cardiac geometry derived from segmentation masks using a large-cohort study, specifically, the UK Biobank. We assess our method by evaluating the realism of the generated images using established quantitative metrics. Furthermore, we conduct a downstream classification task aimed at debiasing a classifier by rectifying imbalances within underrepresented groups through synthetically generated samples. Our experiments demonstrate the effectiveness of the proposed approach in mitigating dataset imbalances, such as the scarcity of younger patients or individuals with a normal BMI level suffering from heart failure. This work represents a major step towards the adoption of synthetic data for the development of fair and generalizable models for medical classification tasks. Notably, we conduct all our experiments using a single, consumer-level GPU to highlight the feasibility of our approach within resource-constrained environments.

dataset, debiasing cardiac imaging, diffusion model, (12 more...)

arXiv.org Artificial Intelligence

2403.19508

Country:

Europe > Switzerland (0.05)
Europe > United Kingdom > England > Greater London > London (0.04)
Europe > Spain > Catalonia > Barcelona Province > Barcelona (0.04)
(3 more...)

Genre: Research Report (1.00)

Industry:

Health & Medicine > Therapeutic Area > Cardiology/Vascular Diseases (1.00)
Health & Medicine > Diagnostic Medicine > Imaging (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Who Are We Missing? A Principled Approach to Characterizing the Underrepresented Population

Parikh, Harsh, Ross, Rachael, Stuart, Elizabeth, Rudolph, Kara

arXiv.org Artificial Intelligence, Jan-25-2024

Randomized controlled trials (RCTs) serve as the cornerstone for understanding causal effects, yet extending inferences to target populations presents challenges due to effect heterogeneity and underrepresentation. Our paper addresses the critical issue of identifying and characterizing underrepresented subgroups in RCTs, proposing a novel framework for refining target populations to improve generalizability. We introduce an optimization-based approach, Rashomon Set of Optimal Trees (ROOT), to characterize underrepresented groups. ROOT optimizes the target subpopulation distribution by minimizing the variance of the target average treatment effect estimate, ensuring more precise treatment effect estimations. Notably, ROOT generates interpretable characteristics of the underrepresented population, aiding researchers in effective communication. Our approach demonstrates improved precision and interpretability compared to alternatives, as illustrated with synthetic data experiments. We apply our methodology to extend inferences from the Starting Treatment with Agonist Replacement Therapies (START) trial -- investigating the effectiveness of medication for opioid use disorder -- to the real-world population represented by the Treatment Episode Dataset: Admissions (TEDS-A). By refining target populations using ROOT, our framework offers a systematic approach to enhance decision-making accuracy and inform future trials in diverse populations.

covariate, target population, treatment effect, (16 more...)

arXiv.org Artificial Intelligence

2401.14512

Country:

North America > United States > South Carolina (0.04)
North America > United States > Oregon (0.04)
North America > United States > District of Columbia (0.04)
North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.04)

Genre:

Research Report > Strength High (1.00)
Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Industry:

Health & Medicine > Therapeutic Area > Psychiatry/Psychology > Addiction Disorder (1.00)
Health & Medicine > Pharmaceuticals & Biotechnology (1.00)
Health & Medicine > Consumer Health (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.46)

Interactive robots as inclusive tools to increase diversity in higher education

Holthaus, Patrick

arXiv.org Artificial Intelligence, Mar-2-2023

There is a major lack of diversity in engineering, technology, and computing subjects in higher education. The resulting underrepresentation of some population groups contributes largely to gender and ethnicity pay gaps and social disadvantages. We aim to increase the diversity among students in such subjects by investigating the use of interactive robots as a tool that can get prospective students from different backgrounds interested in robotics as their field of study. For that, we will survey existing solutions that have proven to be successful in engaging underrepresented groups with technical subjects in educational settings. Moreover, we examine two recent outreach events at the University of Hertfordshire against inclusivity criteria. Based on that, we suggest specific activities for higher education institutions that follow an inclusive approach using interactive robots to attract prospective students at open days and other outreach events. Our suggestions provide tangible actions that can be easily implemented by higher education institutions to make technical subjects more appealing to everyone and thereby tackle inequalities in student uptake.

artificial intelligence, robot, student, (17 more...)

arXiv.org Artificial Intelligence

2303.01316

Country:

North America > United States > California > Los Angeles County > Los Angeles (0.04)
Europe > United Kingdom > England > Hertfordshire > Hatfield (0.04)
Europe > Norway (0.04)
(3 more...)

Genre:

Research Report (1.00)
Instructional Material (1.00)

Industry: Education > Educational Setting > Higher Education (1.00)

Technology: Information Technology > Artificial Intelligence > Robots (1.00)

Pushing the Accuracy-Group Robustness Frontier with Introspective Self-play

Liu, Jeremiah Zhe, Dvijotham, Krishnamurthy Dj, Lee, Jihyeon, Yuan, Quan, Strobel, Martin, Lakshminarayanan, Balaji, Ramachandran, Deepak

arXiv.org Artificial Intelligence, Feb-11-2023

Standard empirical risk minimization (ERM) training can produce deep neural network (DNN) models that are accurate on average but under-perform in under-represented population subgroups, especially when there are imbalanced group distributions in the long-tailed training data. Therefore, approaches that improve the accuracy-group robustness trade-off frontier of a DNN model (i.e., improving worst-group accuracy without sacrificing average accuracy, or vice versa) are of crucial importance. Uncertainty-based active learning (AL) can potentially improve the frontier by preferentially sampling underrepresented subgroups to create a more balanced training dataset. However, the quality of uncertainty estimates from modern DNNs tends to degrade in the presence of spurious correlations and dataset bias, compromising the effectiveness of AL for sampling tail groups. In this work, we propose Introspective Self-play (ISP), a simple approach to improve the uncertainty estimation of a deep neural network under dataset bias, by adding an auxiliary introspection task requiring a model to predict the bias for each data point in addition to the label. We show that ISP provably improves the bias-awareness of the model representation and the resulting uncertainty estimates. On two real-world tabular and language tasks, ISP serves as a simple "plug-in" for AL model training, consistently improving both the tail-group sampling rate and the final accuracy-fairness trade-off frontier of popular AL methods.
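The two quantities this abstract trades off, average accuracy and worst-group accuracy, can be illustrated with a minimal sketch. The function name `group_accuracies` and the toy data are hypothetical and are not taken from the paper; the sketch only shows how a model can look accurate on average while under-performing on a small subgroup.

```python
from collections import defaultdict


def group_accuracies(y_true, y_pred, groups):
    """Return (average accuracy, worst-group accuracy, per-group accuracies).

    Hypothetical helper for illustration; not the paper's evaluation code.
    """
    if not (len(y_true) == len(y_pred) == len(groups)):
        raise ValueError("y_true, y_pred and groups must have equal length")
    if len(y_true) == 0:
        raise ValueError("inputs must be non-empty")

    correct = defaultdict(int)  # correct predictions per group
    total = defaultdict(int)    # examples per group
    for t, p, g in zip(y_true, y_pred, groups):
        total[g] += 1
        correct[g] += int(t == p)

    per_group = {g: correct[g] / total[g] for g in total}
    average = sum(correct.values()) / len(y_true)
    worst = min(per_group.values())
    return average, worst, per_group


if __name__ == "__main__":
    # Toy data: group "a" dominates the sample, group "b" is the tail group.
    y_true = [1, 1, 1, 1, 0, 0, 0, 0, 1, 0]
    y_pred = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
    groups = ["a"] * 8 + ["b"] * 2
    print(group_accuracies(y_true, y_pred, groups))
```

In this toy case the majority group is classified perfectly, so the average stays high even though half of the tail group's examples are wrong, which is the gap that worst-group accuracy is meant to expose.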

artificial intelligence, bayesian inference, machine learning, (15 more...)

arXiv.org Artificial Intelligence

2302.05807

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Asia > Japan > Honshū > Tōhoku > Iwate Prefecture > Morioka (0.04)
North America > United States > New York > New York County > New York City (0.04)
(6 more...)

Genre:

Research Report (1.00)
Overview (1.00)

Industry: Health & Medicine (0.92)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.92)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.67)

Why Technology Alone Can't Solve AI's Bias Problem - HBS Working Knowledge

#artificialintelligence, Jan-4-2023, 13:35:27 GMT

In a cluttered online world, few can resist the convenience of an automated ranking when deciding what movie to watch on Netflix or which seafood restaurant looks promising in a Google search. But when it comes to finding a job candidate or someone to do a basic household task, there's often a human toll to letting algorithms do the work. Searches on popular recruiting sites might seem like a neutral way to find prospective candidates, but their underlying technology can reinforce biases by excluding underrepresented groups, including women. For instance, research shows that women receive fewer employment reviews on the popular online freelancing site TaskRabbit compared to men with the same experience--and this lack of reviews can lower the rankings of women in talent search algorithms. "Maybe there is a bias from people who have been traditionally hiring men," explains Himabindu Lakkaraju, an assistant professor at Harvard Business School.

artificial intelligence, information management, information technology services, (17 more...)

#artificialintelligence

Industry: Information Technology > Services (0.55)

Technology:

Information Technology > Information Management > Search (0.70)
Information Technology > Artificial Intelligence (0.50)

MIT Schwarzman College of Computing unveils Break Through Tech AI

#artificialintelligence, Jul-28-2022, 18:20:12 GMT

Aimed at driving diversity and inclusion in artificial intelligence, the MIT Stephen A. Schwarzman College of Computing is launching Break Through Tech AI, a new program to bridge the talent gap for women and underrepresented genders in AI positions in industry. Break Through Tech AI will provide skills-based training, industry-relevant portfolios, and mentoring to qualified undergraduate students in the Greater Boston area in order to position them more competitively for careers in data science, machine learning, and artificial intelligence. The free, 18-month program will also provide each student with a stipend for participation to lower the barrier for those typically unable to engage in an unpaid, extra-curricular educational opportunity. "Helping position students from diverse backgrounds to succeed in fields such as data science, machine learning, and artificial intelligence is critical for our society's future," says Daniel Huttenlocher, dean of the MIT Schwarzman College of Computing and Henry Ellis Warren Professor of Electrical Engineering and Computer Science. "We look forward to working with students from across the Greater Boston area to provide them with skills and mentorship to help them find careers in this competitive and growing industry."

mit schwarzman college, schwarzman college, tech ai, (10 more...)

#artificialintelligence

Country:

North America > United States > Massachusetts > Middlesex County > Cambridge (0.40)
North America > United States > California > Los Angeles County > Los Angeles (0.07)
North America > United States > New York (0.05)

Genre: Instructional Material (0.52)

Industry: Education > Educational Setting > Higher Education (0.36)

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.79)
