AITopics | group difference


group difference


Counterfactual Explanations for Deep Two-Sample Testing

Lai, Wei-Cheng, Simnacher, Marco, Lippert, Christoph

arXiv.org Machine Learning, Jun-12-2026

Two-sample testing is a fundamental tool for detecting distributional differences across scientific domains, but classical tests (including kernel-based tests) can be ineffective on high-dimensional structured data such as images. Recent deep two-sample tests improve sensitivity in these settings by learning informative representations, yet they provide limited insight into which data features drive rejection of the null hypothesis $H_0$. To address this issue, we propose a counterfactual explanation framework for deep two-sample testing that generates sample-level edits moving observations from a source group toward a target group while explicitly reducing the discrepancy measured by the test. Our method combines a diffusion autoencoder with a pretrained deep two-sample test model and optimizes a maximum mean discrepancy (MMD) objective in the test model's representation space to produce plausible counterfactuals. We quantify distribution-level effects through changes in the test statistic and the resulting two-sample p-values. We evaluate the method on synthetic 2D shape datasets and two MRI cohorts. Across both settings, the counterfactual transformations consistently increase p-values relative to the original samples, indicating that the edited source set becomes statistically closer to the target distribution under the test. We measure minimality using LPIPS to ensure the counterfactuals remain close to the original samples. The resulting edits provide interpretable evidence of the features associated with the detected group differences. On MRI, the localized changes are consistent with known anatomical differences between cohorts.
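The abstract's core measurement, a two-sample p-value that rises after the source samples are edited toward the target group, can be illustrated with a minimal sketch. This is not the authors' method: it assumes a plain RBF-kernel MMD with a fixed bandwidth, a permutation test, toy 2D Gaussian data, and a crude mean shift standing in for the diffusion-autoencoder counterfactuals. All function names here are hypothetical.

```python
import math
import random


def rbf(a, b, bandwidth=1.0):
    # Gaussian (RBF) kernel between two points given as tuples.
    sq = sum((ai - bi) ** 2 for ai, bi in zip(a, b))
    return math.exp(-sq / (2.0 * bandwidth ** 2))


def gram(points, bandwidth=1.0):
    # Full kernel matrix over the pooled sample, computed once.
    return [[rbf(p, q, bandwidth) for q in points] for p in points]


def mmd2(K, idx_x, idx_y):
    # Biased (V-statistic) estimate of squared MMD from a precomputed Gram matrix.
    def mean_block(rows, cols):
        return sum(K[i][j] for i in rows for j in cols) / (len(rows) * len(cols))
    return mean_block(idx_x, idx_x) + mean_block(idx_y, idx_y) - 2.0 * mean_block(idx_x, idx_y)


def permutation_p_value(x, y, n_perm=200, seed=0):
    # Permutation test: reshuffle group membership and compare statistics.
    rng = random.Random(seed)
    pooled = list(x) + list(y)
    K = gram(pooled)
    n = len(x)
    idx = list(range(len(pooled)))
    observed = mmd2(K, idx[:n], idx[n:])
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(idx)
        if mmd2(K, idx[:n], idx[n:]) >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)


def mean(points):
    return tuple(sum(c) / len(points) for c in zip(*points))


# Toy data (assumption): two 2D Gaussian groups with different means.
data_rng = random.Random(42)
source = [(data_rng.gauss(0.0, 1.0), data_rng.gauss(0.0, 1.0)) for _ in range(40)]
target = [(data_rng.gauss(1.5, 1.0), data_rng.gauss(1.5, 1.0)) for _ in range(40)]

# Stand-in "counterfactual edit" (assumption): shift every source point by the mean gap.
ms, mt = mean(source), mean(target)
edited = [(px + mt[0] - ms[0], py + mt[1] - ms[1]) for px, py in source]

p_before = permutation_p_value(source, target)
p_after = permutation_p_value(edited, target)
print("p-value before edit:", p_before)
print("p-value after edit: ", p_after)
```

A larger p-value after the edit means the test no longer separates the edited source set from the target set as clearly, which is the distribution-level effect the abstract reports.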

machine learning, natural language

arXiv.org Machine Learning

2606.04009

Country:

North America (0.69)
Europe > Germany (0.49)

Genre: Research Report > Experimental Study (0.55)

Industry: Health & Medicine (0.69)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Explanation & Argumentation (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.90)


Can I Trust My Fairness Metric? Assessing Fairness with Unlabeled Data and Bayesian Inference

Neural Information Processing Systems, Feb-19-2026, 07:25:13 GMT

We investigate the problem of reliably assessing group fairness when labeled examples are few but unlabeled examples are plentiful.
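To see why few labeled examples make a fairness metric unreliable, consider the following minimal sketch. It is only a labeled-data baseline and not the paper's approach, which also draws on unlabeled data. It assumes a uniform Beta(1, 1) prior on each group's accuracy, and the counts and the function name `posterior_accuracy_gap` are hypothetical.

```python
import random


def posterior_accuracy_gap(correct_a, n_a, correct_b, n_b, n_draws=10000, seed=0):
    # Monte Carlo draws from independent Beta posteriors over each group's accuracy,
    # assuming a uniform Beta(1, 1) prior for both groups.
    rng = random.Random(seed)
    draws = []
    for _ in range(n_draws):
        acc_a = rng.betavariate(1 + correct_a, 1 + n_a - correct_a)
        acc_b = rng.betavariate(1 + correct_b, 1 + n_b - correct_b)
        draws.append(acc_a - acc_b)
    draws.sort()
    lower = draws[int(0.025 * n_draws)]
    upper = draws[int(0.975 * n_draws) - 1]
    return sum(draws) / n_draws, (lower, upper)


# Same observed accuracies (90% vs 70%), very different amounts of labeled data.
small = posterior_accuracy_gap(9, 10, 7, 10)
large = posterior_accuracy_gap(900, 1000, 700, 1000)

print("10 labels per group:   mean gap, 95% interval =", small)
print("1000 labels per group: mean gap, 95% interval =", large)
```

With ten labels per group the credible interval for the accuracy gap is wide, so the data cannot settle whether a disparity exists. That is the situation that motivates using additional information such as unlabeled examples.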

artificial intelligence, calibration, machine learning, (18 more...)

Neural Information Processing Systems

Country:

North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.04)
Asia (0.04)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)


d83de59e10227072a9c034ce10029c39-Paper.pdf

Neural Information Processing Systems, Aug-16-2025, 17:00:18 GMT

accuracy, fairness, unlabeled data, (15 more...)

Neural Information Processing Systems

Country:

North America > United States > California > Orange County > Irvine (0.04)
North America > United States > Iowa (0.04)
North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.04)
Asia (0.04)

Genre: Research Report > New Finding (1.00)

Industry: Health & Medicine (1.00)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)


Neural Responses to Affective Sentences Reveal Signatures of Depression

Kommineni, Aditya, Jeong, Woojae, Avramidis, Kleanthis, McDaniel, Colin, Hughes, Myzelle, McGee, Thomas, Kaiser, Elsi, Lerman, Kristina, Blank, Idan A., Byrd, Dani, Habibi, Assal, Cahn, B. Rael, Kadiri, Sudarsana, Medani, Takfarinas, Leahy, Richard M., Narayanan, Shrikanth

arXiv.org Artificial Intelligence, Jun-9-2025

Depression is one of the most prevalent mental health disorders worldwide, with estimates indicating that around 5% of the world's adult population [1, 2] suffers from this condition. The primary methods for screening and monitoring depression rely on self-reported questionnaires, such as the Patient Health Questionnaire (PHQ-9) [3], Beck's Depression Inventory (BDI) [4] and Hamilton Depression Ratings Scale (HDRS) [5]. While these questionnaires are effective to varying degrees at screening patients for depression, they provide only limited information about the affected underlying neuro-cognitive processes in individuals, limiting the ability to personalize treatments. Given the heterogeneity of depressive symptomatology across patient populations [6, 7], it is crucial to elucidate the underlying neurophysiological mechanisms to support the development of more effective and individualized procedures for screening, monitoring, and treatment. Prior functional imaging studies have identified increased activity in the anterior cingulate cortex (especially the subgenual anterior cingulate) during presentation of emotional stimuli, as well as altered connectivity in prefrontal cortical areas and the default mode network, as potential differentiating markers in depressed participants [8-13].

artificial intelligence, machine learning, participant, (16 more...)

arXiv.org Artificial Intelligence

2506.06244

Country: North America > United States > California > Los Angeles County > Los Angeles (0.28)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Industry:

Health & Medicine > Therapeutic Area > Neurology (1.00)
Health & Medicine > Therapeutic Area > Psychiatry/Psychology > Mental Health (0.48)

Technology:

Information Technology > Artificial Intelligence > Cognitive Science (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.69)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis (0.46)


Copula-Linked Parallel ICA: A Method for Coupling Structural and Functional MRI brain Networks

Agcaoglu, Oktay, Silva, Rogers F., Alacam, Deniz, Plis, Sergey, Adali, Tulay, Calhoun, Vince

arXiv.org Artificial Intelligence, Nov-19-2024

Different brain imaging modalities offer unique insights into brain function and structure. Combining them enhances our understanding of neural mechanisms. Prior multimodal studies fusing functional MRI (fMRI) and structural MRI (sMRI) have shown the benefits of this approach. Since sMRI lacks temporal data, existing fusion methods often compress fMRI temporal information into summary measures, sacrificing rich temporal dynamics. Motivated by the observation that covarying networks are identified in both sMRI and resting-state fMRI, we developed a novel fusion method, copula-linked parallel ICA (CLiP-ICA), that combines deep learning frameworks, copulas, and independent component analysis (ICA). This method estimates independent sources for each modality and links the spatial sources of fMRI and sMRI using a copula-based model for more flexible integration of temporal and spatial data. We tested CLiP-ICA using data from the Alzheimer's Disease Neuroimaging Initiative (ADNI). Our results showed that CLiP-ICA effectively captures both strongly and weakly linked sMRI and fMRI networks, including the cerebellum, sensorimotor, visual, cognitive control, and default mode networks. It revealed more meaningful components and fewer artifacts, addressing the long-standing issue of optimal model order in ICA. CLiP-ICA also detected complex functional connectivity patterns across stages of cognitive decline, with cognitively normal subjects generally showing higher connectivity in sensorimotor and visual networks compared to patients with Alzheimer's disease, along with patterns suggesting potential compensatory mechanisms.

clip-ica, connectivity, modality, (14 more...)

arXiv.org Artificial Intelligence

2410.19774

Country:

North America > United States > Georgia > Fulton County > Atlanta (0.04)
North America > United States > Maryland > Baltimore County (0.04)
North America > United States > Maryland > Baltimore (0.04)

Genre: Research Report > New Finding (1.00)

Industry:

Health & Medicine > Health Care Technology (1.00)
Health & Medicine > Therapeutic Area > Neurology > Alzheimer's Disease (0.55)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)


Whither Bias Goes, I Will Go: An Integrative, Systematic Review of Algorithmic Bias Mitigation

Hickman, Louis, Huynh, Christopher, Gass, Jessica, Booth, Brandon, Kuruzovich, Jason, Tay, Louis

arXiv.org Artificial Intelligence, Oct-31-2024

Machine learning (ML) models are increasingly used for personnel assessment and selection (e.g., resume screeners, automatically scored interviews). However, concerns have been raised throughout society that ML assessments may be biased and perpetuate or exacerbate inequality. Although organizational researchers have begun investigating ML assessments from traditional psychometric and legal perspectives, there is a need to understand, clarify, and integrate fairness operationalizations and algorithmic bias mitigation methods from the computer science, data science, and organizational research literatures. We present a four-stage model of developing ML assessments and applying bias mitigation methods, including 1) generating the training data, 2) training the model, 3) testing the model, and 4) deploying the model. When introducing the four-stage model, we describe potential sources of bias and unfairness at each stage. Then, we systematically review definitions and operationalizations of algorithmic bias, legal requirements governing personnel selection from the United States and Europe, and research on algorithmic bias mitigation across multiple domains and integrate these findings into our framework. Our review provides insights for both research and practice by elucidating possible mechanisms of algorithmic bias while identifying which bias mitigation methods are legal and effective. This integrative framework also reveals gaps in the knowledge of algorithmic bias mitigation that should be addressed by future collaborative research between organizational researchers, computer scientists, and data scientists. We provide recommendations for developing and deploying ML assessments, as well as recommendations for future research into algorithmic bias and fairness.

adverse impact, group difference, training data, (14 more...)

arXiv.org Artificial Intelligence

doi: 10.1037/apl0001255

2410.19003

Country:

North America > United States > Virginia > Arlington County > Arlington (0.04)
North America > United States > New York > New York County > New York City (0.04)
North America > United States > Illinois (0.04)

Genre:

Research Report > Experimental Study (0.92)
Research Report > New Finding (0.67)

Industry:

Government > Regional Government > North America Government > United States Government (1.00)
Law > Labor & Employment Law (0.93)
Law > Civil Rights & Constitutional Law (0.68)

Technology:

Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)


To which reference class do you belong? Measuring racial fairness of reference classes with normative modeling

Rutherford, Saige, Wolfers, Thomas, Fraza, Charlotte, Harnett, Nathaniel G., Beckmann, Christian F., Ruhe, Henricus G., Marquand, Andre F.

arXiv.org Artificial Intelligence, Jul-26-2024

Reference classes in healthcare establish healthy norms, such as pediatric growth charts of height and weight, and are used to chart deviations from these norms which represent potential clinical risk. How the demographics of the reference class influence clinical interpretation of deviations is unknown. Using normative modeling, a method for building reference classes, we evaluate the fairness (racial bias) in reference models of structural brain images that are widely used in psychiatry and neurology. We test whether including race in the model creates fairer models. We predict self-reported race using the deviation scores from three different reference class normative models, to better understand bias in an integrated, multivariate sense. Across all of these tasks, we uncover racial disparities that are not easily addressed with existing data or commonly used modeling techniques. Our work suggests that deviations from the norm could be due to demographic mismatch with the reference class, and assigning clinical meaning to these deviations should be done with caution. Our approach also suggests that acquiring more representative samples is an urgent research priority.
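The point that a deviation score may reflect demographic mismatch with the reference class, and not clinical risk, can be shown with a deliberately simple sketch. Real normative models are far richer than this. The example assumes a deviation score is a plain z-score against the reference sample, and the measurement values, group labels, and the function name `deviation_scores` are hypothetical.

```python
import statistics


def deviation_scores(values, reference):
    # z-score of each value relative to the reference class mean and spread.
    mu = statistics.mean(reference)
    sigma = statistics.stdev(reference)
    return [(v - mu) / sigma for v in values]


# Hypothetical measurements (arbitrary units) from healthy people in two groups
# whose typical values differ for reasons unrelated to disease.
reference_group_a = [2.4, 2.5, 2.6, 2.5, 2.7, 2.3, 2.5, 2.6]
reference_group_b = [2.8, 2.9, 3.0, 2.9, 3.1, 2.7, 2.9, 3.0]

# Healthy individuals from group B, scored against each reference class.
healthy_b = [2.8, 2.9, 3.0]
mismatched = deviation_scores(healthy_b, reference_group_a)
matched = deviation_scores(healthy_b, reference_group_b)

print("scored against group A reference:", [round(z, 2) for z in mismatched])
print("scored against group B reference:", [round(z, 2) for z in matched])
```

The same healthy individuals look like strong outliers under the mismatched reference and unremarkable under the matched one, which is why the abstract urges caution when assigning clinical meaning to deviations.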

fairness, normative model, reference class, (13 more...)

arXiv.org Artificial Intelligence

2407.19114

Country:

Europe > Netherlands > Gelderland > Nijmegen (0.04)
Europe > Germany > Baden-Württemberg > Tübingen Region > Tübingen (0.04)
North America > United States > New York > New York County > New York City (0.04)

Genre: Research Report > New Finding (1.00)

Industry:

Health & Medicine > Therapeutic Area > Neurology (1.00)
Health & Medicine > Diagnostic Medicine > Imaging (1.00)
Health & Medicine > Therapeutic Area > Psychiatry/Psychology (0.93)

Technology:

Information Technology > Sensing and Signal Processing (1.00)
Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)


A Demographic-Conditioned Variational Autoencoder for fMRI Distribution Sampling and Removal of Confounds

Orlichenko, Anton, Qu, Gang, Zhou, Ziyu, Liu, Anqi, Deng, Hong-Wen, Ding, Zhengming, Stephen, Julia M., Wilson, Tony W., Calhoun, Vince D., Wang, Yu-Ping

arXiv.org Artificial Intelligence, May-13-2024

Objective: fMRI and derived measures such as functional connectivity (FC) have been used to predict brain age, general fluid intelligence, psychiatric disease status, and preclinical neurodegenerative disease. However, it is not always clear that all demographic confounds, such as age, sex, and race, have been removed from fMRI data. Additionally, many fMRI datasets are restricted to authorized researchers, making dissemination of these valuable data sources challenging. Methods: We create a variational autoencoder (VAE)-based model, DemoVAE, to decorrelate fMRI features from demographics and generate high-quality synthetic fMRI data based on user-supplied demographics. We train and validate our model using two large, widely used datasets, the Philadelphia Neurodevelopmental Cohort (PNC) and Bipolar and Schizophrenia Network for Intermediate Phenotypes (BSNIP). Results: We find that DemoVAE recapitulates group differences in fMRI data while capturing the full breadth of individual variations. Significantly, we also find that most clinical and computerized battery fields that are correlated with fMRI data are not correlated with DemoVAE latents. The exceptions are several fields related to schizophrenia medication and symptom severity. Conclusion: Our model generates fMRI data that captures the full distribution of FC better than traditional VAE or GAN models. We also find that most prediction using fMRI data is dependent on correlation with, and prediction of, demographics. Significance: Our DemoVAE model allows for generation of high-quality synthetic data conditioned on subject demographics as well as the removal of the confounding effects of demographics. We identify that FC-based prediction tasks are highly influenced by demographic confounds.

connectivity, demovae, fmri data, (14 more...)

arXiv.org Artificial Intelligence

2405.07977

Country:

North America > United States > Louisiana > Orleans Parish > New Orleans (0.04)
South America > Uruguay > Maldonado > Maldonado (0.04)
North America > United States > New Mexico > Bernalillo County > Albuquerque (0.04)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Industry:

Health & Medicine > Therapeutic Area > Psychiatry/Psychology (1.00)
Health & Medicine > Therapeutic Area > Neurology (1.00)
Health & Medicine > Health Care Technology (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Cognitive Science (1.00)


Wavelet based multi-scale shape features on arbitrary surfaces for cortical thickness discrimination

Kim, Won Hwa, Hatt, Charles

Neural Information Processing Systems, Mar-14-2024, 13:47:15 GMT

Hypothesis testing on signals defined on surfaces (such as the cortical surface) is a fundamental component of a variety of studies in Neuroscience. The goal here is to identify regions that exhibit changes as a function of the clinical condition under study. As the clinical questions of interest move towards identifying very early signs of diseases, the corresponding statistical differences at the group level invariably become weaker and increasingly hard to identify. Indeed, after a multiple comparisons correction is adopted (to account for correlated statistical tests over all surface points), very few regions may survive. In contrast to hypothesis tests on point-wise measurements, in this paper, we make the case for performing statistical analysis on multi-scale shape descriptors that characterize the local topological context of the signal around each surface vertex. Our descriptors are based on recent results from harmonic analysis that show how wavelet theory extends to non-Euclidean settings (i.e., irregular weighted graphs). We provide strong evidence that these descriptors successfully pick up group-wise differences, where traditional methods either fail or yield unsatisfactory results. Other than this primary application, we show how the framework allows performing cortical surface smoothing in the native space without mapping to a unit sphere.
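The remark that very few regions may survive a multiple comparisons correction can be made concrete with a small sketch. The abstract does not say which correction is used, so this example assumes Bonferroni as one common choice. The per-vertex p-values are made up for illustration and the function names are hypothetical.

```python
def uncorrected_survivors(p_values, alpha=0.05):
    # Indices of tests that pass the nominal threshold with no correction.
    return [i for i, p in enumerate(p_values) if p <= alpha]


def bonferroni_survivors(p_values, alpha=0.05):
    # Bonferroni: divide the significance level by the number of tests.
    threshold = alpha / len(p_values)
    return [i for i, p in enumerate(p_values) if p <= threshold]


# Hypothetical per-vertex p-values: five weak-to-strong effects among 100 tests.
p_values = [0.00001, 0.004, 0.01, 0.03, 0.04] + [0.2 + 0.008 * k for k in range(95)]

print("uncorrected:", uncorrected_survivors(p_values))
print("Bonferroni: ", bonferroni_survivors(p_values))
```

Of the five vertices that look significant at the nominal level, only the strongest passes the corrected threshold of 0.05 / 100. This shows how weak group-level effects disappear under point-wise testing.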

cortical surface, cortical thickness, vertex, (15 more...)

Neural Information Processing Systems

Country: North America > United States > Wisconsin > Dane County > Madison (0.15)

Genre: Research Report (0.95)

Industry: Health & Medicine > Therapeutic Area > Neurology > Alzheimer's Disease (0.50)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (0.93)
Information Technology > Artificial Intelligence > Representation & Reasoning > Mathematical & Statistical Methods (0.46)


Aligning with Whom? Large Language Models Have Gender and Racial Biases in Subjective NLP Tasks

Sun, Huaman, Pei, Jiaxin, Choi, Minje, Jurgens, David

arXiv.org Artificial Intelligence, Nov-16-2023

Human perception of language depends on personal backgrounds like gender and ethnicity. While existing studies have shown that large language models (LLMs) hold values that are closer to certain societal groups, it is unclear whether their prediction behaviors on subjective NLP tasks also exhibit a similar bias. In this study, leveraging the POPQUORN dataset which contains annotations of diverse demographic backgrounds, we conduct a series of experiments on four popular LLMs to investigate their capability to understand group differences and potential biases in their predictions for politeness and offensiveness. We find that for both tasks, model predictions are closer to the labels from White and female participants. We further explore prompting with the target demographic labels and show that including the target demographic in the prompt actually worsens the model's performance. More specifically, when being prompted to respond from the perspective of "Black" and "Asian" individuals, models show lower performance in predicting both overall scores as well as the scores from corresponding groups. Our results suggest that LLMs hold gender and racial biases for subjective NLP tasks and that demographic-infused prompts alone may be insufficient to mitigate such effects. Code and data are available at https://github.com/Jiaxin-Pei/LLM-Group-Bias.

language model, llm, prediction, (14 more...)

arXiv.org Artificial Intelligence

2311.0973

Country:

North America > Canada > Ontario > Toronto (0.15)
North America > United States > Michigan > Washtenaw County > Ann Arbor (0.14)

Genre: Research Report > New Finding (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.33)
