AITopics | representation bias

representation bias

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Why Can't I Dance in the Mall? Learning to Mitigate Scene Bias in Action Recognition

Jinwoo Choi, Chen Gao, Joseph C. E. Messou, Jia-Bin Huang

Neural Information Processing Systems | Feb-13-2026, 12:55:57 GMT

Such biases are known as representation bias [38].

artificial intelligence, incvpr, machine learning, (16 more...)

Country: North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.04)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

e8507db80464ced5658d16b49bd458b9-Supplemental-Datasets_and_Benchmarks.pdf

Neural Information Processing Systems | Feb-12-2026, 13:57:39 GMT

computer vision and pattern recognition, dataset, downstream task, (11 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.49)

How Transferable are Video Representations Based on Synthetic Data?

Neural Information Processing Systems | Feb-12-2026, 13:57:35 GMT

We posit that the gap between real and synthetic action representations can be attributed to contextual bias and static objects related to the action, instead of the temporal dynamics of the action itself.

artificial intelligence, dataset, machine learning, (17 more...)

Country: North America > United States (0.14)

Industry: Information Technology > Security & Privacy (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Vision (0.96)
Information Technology > Communications (0.69)

Class-Incremental Learning via Dual Augmentation

Neural Information Processing Systems | Dec-24-2025, 08:05:34 GMT

Deep learning systems typically suffer from catastrophic forgetting of past knowledge when they acquire new skills continually. In this paper, we emphasize two dilemmas in class-incremental learning, representation bias and classifier bias, and present a simple and novel approach that employs explicit class augmentation (classAug) and implicit semantic augmentation (semanAug) to address the two biases, respectively. On the one hand, we propose to address the representation bias by learning transferable and diverse representations. Specifically, we investigate the feature representations in incremental learning based on spectral analysis and present a simple technique called classAug, which lets the model see more classes during training so that it learns representations transferable across classes. On the other hand, to overcome the classifier bias, semanAug implicitly generates an infinite number of instances of old classes in the deep feature space, which imposes tighter constraints to maintain the decision boundary of previously learned classes. Without storing any old samples, our method can perform comparably with representative data-replay-based approaches.

class-incremental learning, dual augmentation, name change, (5 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.61)

Uncovering Representation Bias for Investment Decisions in Open-Source Large Language Models

Fabrizio Dimino, Krati Saxena, Bhaskarjit Sarmah, Stefano Pasquali

arXiv.org Artificial Intelligence | Nov-4-2025

Large Language Models are increasingly adopted in financial applications to support investment workflows. However, prior studies have seldom examined how these models reflect biases related to firm size, sector, or financial characteristics, which can significantly impact decision-making. This paper addresses this gap by focusing on representation bias in open-source Qwen models. We propose a balanced round-robin prompting method over approximately 150 U.S. equities, applying constrained decoding and token-logit aggregation to derive firm-level confidence scores across financial contexts. Using statistical tests and variance analysis, we find that firm size and valuation consistently increase model confidence, while risk factors tend to decrease it. Confidence varies significantly across sectors, with the Technology sector showing the greatest variability. When models are prompted for specific financial categories, their confidence rankings best align with fundamental data, moderately with technical signals, and least with growth indicators. These results highlight representation bias in Qwen models and motivate sector-aware calibration and category-conditioned evaluation protocols for safe and fair financial LLM deployment.

artificial intelligence, large language model, natural language, (11 more...)

arXiv: 2510.05702

Genre:

Financial News (1.00)
Research Report > New Finding (0.95)
Research Report > Experimental Study (0.71)

Industry: Banking & Finance > Trading (1.00)

Technology: Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
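
The token-logit aggregation mentioned in the abstract above can be pictured roughly as follows. This is only a minimal sketch under assumptions: the abstract does not give the aggregation rule, so the softmax restricted to the allowed answer tokens and the simple mean across prompts are illustrative choices, and the function names, tickers, and logit values are all hypothetical.

```python
import math
from collections import defaultdict


def constrained_softmax(logits):
    """Softmax restricted to the allowed answer tokens (here: tickers)."""
    peak = max(logits.values())  # subtract the max for numerical stability
    exps = {tok: math.exp(val - peak) for tok, val in logits.items()}
    total = sum(exps.values())
    return {tok: e / total for tok, e in exps.items()}


def aggregate_confidence(prompt_logits):
    """Average each ticker's constrained probability over the prompts it appears in."""
    sums, counts = defaultdict(float), defaultdict(int)
    for logits in prompt_logits:
        for tok, prob in constrained_softmax(logits).items():
            sums[tok] += prob
            counts[tok] += 1
    return {tok: sums[tok] / counts[tok] for tok in sums}


# Hypothetical logits: a balanced round-robin in which every ticker
# appears in the same number of pairwise prompts (two each).
prompts = [
    {"AAA": 2.0, "BBB": 1.0},
    {"BBB": 0.5, "CCC": 0.5},
    {"CCC": 0.0, "AAA": 1.0},
]
scores = aggregate_confidence(prompts)
```

Balancing the round-robin keeps any single pairing from dominating a firm's score. A real pipeline would read the logits from the model's constrained decoding step instead of hard-coding them.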

C2AL: Cohort-Contrastive Auxiliary Learning for Large-scale Recommendation Systems

Mertcan Cokbas, Ziteng Liu, Zeyi Tao, Elder Veliz, Qin Huang, Ellie Wen, Huayu Li, Qiang Jin, Murat Duman, Benjamin Au, Guy Lebanon, Sagar Chordia, Chengkai Zhang

arXiv.org Artificial Intelligence | Oct-6-2025

Training large-scale recommendation models under a single global objective implicitly assumes homogeneity across user populations. However, real-world data are composites of heterogeneous cohorts with distinct conditional distributions. As models increase in scale and complexity and as more data is used for training, they become dominated by central distribution patterns, neglecting head and tail regions. This imbalance limits the model's learning ability and can result in inactive attention weights or dead neurons. In this paper, we reveal how the attention mechanism can play a key role in factorization machines for shared embedding selection, and propose to address this challenge by analyzing the substructures in the dataset and exposing those with strong distributional contrast through auxiliary learning. Unlike previous research, which heuristically applies weighted labels or multi-task heads to mitigate such biases, we leverage partially conflicting auxiliary labels to regularize the shared representation. This approach customizes the learning process of attention layers to preserve mutual information with minority cohorts while improving global performance. We evaluated C2AL on massive production datasets with billions of data points each for six SOTA models. Experiments show that the factorization machine is able to capture fine-grained user-ad interactions using the proposed method, achieving up to a 0.16% reduction in normalized entropy overall and delivering gains exceeding 0.30% on targeted minority cohorts.

artificial intelligence, machine learning, natural language, (16 more...)

arXiv: 2510.02215

Country: North America > United States (0.15)

Genre: Research Report (0.84)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.69)
Information Technology > Artificial Intelligence > Representation & Reasoning > Personal Assistant Systems (0.65)
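
The abstract above reports its gains as reductions in normalized entropy (NE) without defining the metric. A minimal sketch, assuming the definition commonly used in click-prediction work (average log loss divided by the entropy of the empirical positive rate), might look like the following. The function name and the toy labels and predictions are hypothetical, and the paper may compute the metric differently.

```python
import math


def normalized_entropy(labels, preds, eps=1e-12):
    """Average log loss divided by the entropy of the empirical positive rate.

    Assumed definition; values below 1.0 mean the model beats a constant
    predictor that always outputs the base rate.
    """
    n = len(labels)
    base = sum(labels) / n
    if base in (0.0, 1.0):
        raise ValueError("labels must contain both classes")
    log_loss = -sum(
        y * math.log(max(p, eps)) + (1 - y) * math.log(max(1 - p, eps))
        for y, p in zip(labels, preds)
    ) / n
    base_entropy = -(base * math.log(base) + (1 - base) * math.log(1 - base))
    return log_loss / base_entropy


labels = [1, 0, 0, 0]                      # toy data, base rate 0.25
ne_constant = normalized_entropy(labels, [0.25] * 4)
ne_model = normalized_entropy(labels, [0.7, 0.1, 0.2, 0.1])
relative_reduction = (ne_constant - ne_model) / ne_constant
```

Under this definition, predicting the base rate for every example gives an NE of 1, so a reported reduction such as 0.16% is a relative improvement over a baseline model's NE.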

AI Teams Contend With Synthetic Data's Jekyll/Hyde Roles

Communications of the ACM | Aug-26-2025, 16:03:13 GMT

Training models with synthetic data presents both a danger and a boon to artificial intelligence (AI). While some groups have aggressively pursued the use of model-generated data to train successors for greater accuracy and generalization, others have warned about the risks posed by AI ingesting its own output. The two views are not at odds. The question is when and where things go wrong. On the negative side, a flurry of papers published since 2021 have argued that, as the datasets used to pretrain foundation models incorporate more and more auto-generated data mined from the Internet, performance degrades and the models start to "unlearn" skills.

large language model, machine learning, natural language, (16 more...)

Country:

Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.05)
Europe > Germany > Brandenburg > Potsdam (0.05)
Asia > China (0.05)

Genre: Research Report (0.35)

Industry: Education (0.70)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.50)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.49)

e8507db80464ced5658d16b49bd458b9-Supplemental-Datasets_and_Benchmarks.pdf

Neural Information Processing Systems | Aug-19-2025, 15:27:12 GMT

Interestingly, for HMDB51, the Synthetic pre-train dataset has more overlapping classes, yet the Kinetics pre-trained model still outperforms on this downstream task.

artificial intelligence, downstream task, machine learning, (13 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.49)

e8507db80464ced5658d16b49bd458b9-Paper-Datasets_and_Benchmarks.pdf

Neural Information Processing Systems | Aug-19-2025, 15:27:08 GMT

Action recognition has improved dramatically with massive-scale video datasets.

artificial intelligence, dataset, machine learning, (14 more...)

Genre: Research Report (0.67)

Industry: Information Technology > Security & Privacy (0.93)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Security & Privacy (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.46)

Representation biases: will we achieve complete understanding by analyzing representations?

Andrew Kyle Lampinen, Stephanie C. Y. Chan, Yuxuan Li, Katherine Hermann

arXiv.org Artificial Intelligence | Aug-14-2025

A common approach in neuroscience is to study neural representations as a means to understand a system, increasingly by relating the neural representations to the internal representations learned by computational models. However, recent work in machine learning (Lampinen, 2024) shows that learned feature representations may be biased to over-represent certain features and to represent others more weakly and less consistently. For example, simple (linear) features may be more strongly and more consistently represented than complex (highly nonlinear) features. These biases could pose challenges for achieving full understanding of a system through representational analysis. In this perspective, we illustrate these challenges, showing how feature representation biases can lead to strongly biased inferences from common analyses like PCA, regression, and RSA. We also present homomorphic encryption as a simple case study of the potential for strong dissociation between patterns of representation and computation. We discuss the implications of these results for representational comparisons between systems, and for neuroscience more generally.

artificial intelligence, machine learning, representation, (17 more...)

arXiv: 2507.22216

Genre: Research Report (1.00)

Industry: Health & Medicine > Therapeutic Area > Neurology (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Cognitive Science (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.68)
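
The PCA concern raised in the abstract above can be illustrated with a toy example. This is a minimal sketch under assumptions, not the authors' experiment: two binary features are equally present in the inputs, but a hypothetical representation encodes one of them ten times more strongly, and the explained-variance ratios then make the weaker feature look almost absent. The function name and the scale factors are illustrative.

```python
import math
import random


def pca_variance_ratios_2d(points):
    """Explained-variance ratios of the two principal components of 2-D data."""
    n = len(points)
    mx = sum(x for x, _ in points) / n
    my = sum(y for _, y in points) / n
    sxx = sum((x - mx) ** 2 for x, _ in points) / n
    syy = sum((y - my) ** 2 for _, y in points) / n
    sxy = sum((x - mx) * (y - my) for x, y in points) / n
    # Eigenvalues of the symmetric 2x2 covariance matrix, in closed form.
    center = (sxx + syy) / 2
    radius = math.sqrt(((sxx - syy) / 2) ** 2 + sxy ** 2)
    top, bottom = center + radius, center - radius
    return top / (top + bottom), bottom / (top + bottom)


rng = random.Random(0)
# Two independent binary features, equally present in the inputs.
features = [(rng.choice([-1, 1]), rng.choice([-1, 1])) for _ in range(1000)]
# Hypothetical representation: feature A is encoded 10x more strongly than B.
reps = [(10.0 * a, 1.0 * b) for a, b in features]
first_ratio, second_ratio = pca_variance_ratios_2d(reps)
```

With these scale factors the first component carries nearly all of the variance, so an analyst reading only the leading components could conclude that the system barely represents feature B, even though it is encoded for every input.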
