concept label
- North America > United States > Texas > Travis County > Austin (0.04)
- North America > United States > Louisiana > Orleans Parish > New Orleans (0.04)
- North America > United States > Colorado > Boulder County > Boulder (0.04)
- (3 more...)
- North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
- Europe > Italy > Tuscany > Florence (0.04)
- North America > United States > Louisiana > Orleans Parish > New Orleans (0.04)
- (9 more...)
When concept-based XAI is imprecise: Do people distinguish between generalisations and misrepresentations?
Concept-based explainable artificial intelligence (C-XAI) can let people see which representations an AI model has learned. This is particularly important when high-level semantic information (e.g., actions and relations) is used to make decisions about abstract categories (e.g., danger). In such tasks, AI models need to generalise beyond situation-specific details, and this ability can be reflected in C-XAI outputs that randomise over irrelevant features. However, it is unclear whether people appreciate such generalisation and can distinguish it from other, less desirable forms of imprecision in C-XAI outputs. Therefore, the present study investigated how the generality and relevance of C-XAI outputs affect people's evaluation of AI. In an experimental railway safety evaluation scenario, participants rated the performance of a simulated AI that classified traffic scenes involving people as dangerous or not. These classification decisions were explained via concepts in the form of similar image snippets. These snippets differed in how well they matched the classified image, either regarding a highly relevant feature (i.e., people's relation to the tracks) or a less relevant feature (i.e., people's action). Contrary to the hypotheses, concepts that generalised over less relevant features were rated lower than concepts that matched the classified image precisely. Moreover, their ratings were no better than those for systematic misrepresentations of the less relevant feature. Conversely, participants were highly sensitive to imprecision in relevant features. These findings cast doubt on the assumption that people can easily infer from C-XAI outputs whether AI models have gained a deeper understanding of complex situations.
- Europe > Germany > Saxony > Dresden (0.04)
- North America > Canada > Ontario (0.04)
- North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.04)
- (2 more...)
- Research Report > Experimental Study (0.92)
- Research Report > New Finding (0.67)
Curriculum Abductive Learning
Hu, Wen-Chao, Li, Qi-Jie, Jia, Lin-Han, Ge, Cunjing, Li, Yu-Feng, Jiang, Yuan, Zhou, Zhi-Hua
Abductive Learning (ABL) integrates machine learning with logical reasoning in a loop: a learning model predicts symbolic concept labels from raw inputs, the labels are revised through abduction using domain knowledge, and the revised labels are fed back for retraining. However, because abduction is nondeterministic, training is often unstable, especially when the knowledge base is large and complex and the resulting abduction space becomes prohibitively large. Prior works focus on improving candidate selection within this space but typically treat the knowledge base as a static black box. In this work, we propose Curriculum Abductive Learning (C-ABL), a method that explicitly leverages the internal structure of the knowledge base to address these training challenges. C-ABL partitions the knowledge base into a sequence of sub-bases that are introduced progressively during training. This keeps the abduction space small throughout training and lets the model incorporate logic in a stepwise, smooth way. Experiments across multiple tasks show that C-ABL outperforms previous ABL implementations and significantly improves training stability, convergence speed, and final accuracy, especially in settings with complex knowledge bases.
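A minimal sketch of the curriculum loop described in the abstract, assuming rules expressed as Python predicates over tuples of concept labels and a model object exposing hypothetical predict/fit methods; all names and interfaces are illustrative, not the authors' implementation:

```python
# Illustrative curriculum-ABL loop: the knowledge base arrives as a sequence of
# sub-bases, each a list of predicates over a tuple of concept labels.
# abduce, curriculum_abl, and model.predict/fit are hypothetical names.
from itertools import product

def abduce(pred_labels, knowledge_base, candidate_values):
    """Return the label assignment consistent with the active knowledge base
    that is closest (in Hamming distance) to the model's predictions."""
    consistent = [labels for labels in product(candidate_values, repeat=len(pred_labels))
                  if all(rule(labels) for rule in knowledge_base)]
    if not consistent:
        return tuple(pred_labels)  # no revision possible under the current sub-bases
    return min(consistent, key=lambda ls: sum(a != b for a, b in zip(ls, pred_labels)))

def curriculum_abl(model, inputs, sub_bases, candidate_values, epochs_per_stage=1):
    """Train stage by stage, enlarging the knowledge base one sub-base at a time
    so the abduction space grows gradually instead of all at once."""
    active_kb = []
    for sub_base in sub_bases:
        active_kb.extend(sub_base)
        for _ in range(epochs_per_stage):
            for x in inputs:
                preds = model.predict(x)                      # symbolic concept labels
                revised = abduce(preds, active_kb, candidate_values)
                model.fit(x, revised)                         # feed revised labels back
    return model
```

The brute-force enumeration in `abduce` stands in for a real abductive reasoner and is only workable for tiny label spaces; it is meant to show the loop structure, not an efficient search.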
- Europe > Austria > Vienna (0.14)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- Asia > China > Jiangsu Province > Nanjing (0.04)
- (3 more...)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.28)
- Europe > Italy (0.04)
- Europe > Austria > Upper Austria > Linz (0.04)
- Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)
- Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.46)
Addressing Concept Mislabeling in Concept Bottleneck Models Through Preference Optimization
Penaloza, Emiliano, Zhang, Tianyue H., Charlin, Laurent, Zarlenga, Mateo Espinosa
Concept Bottleneck Models (CBMs) aim to enhance the trustworthiness of AI systems by constraining their decisions to depend on a set of human-understandable concepts. However, CBMs typically assume that datasets contain accurate concept labels, an assumption that is often violated in practice and that we show can significantly degrade performance (by 25% in some cases). To address this, we introduce the Concept Preference Optimization (CPO) objective, a new loss function based on Direct Preference Optimization that effectively mitigates the negative impact of concept mislabeling on CBM performance. We provide an analysis of key properties of the CPO objective, showing that it directly optimizes for the concepts' posterior distribution, and contrast it with Binary Cross Entropy (BCE), demonstrating that CPO is inherently less sensitive to concept noise. We empirically confirm this analysis by finding that CPO consistently outperforms BCE on three real-world datasets, both with and without added label noise. We make our code available on GitHub.
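The abstract does not spell out the CPO loss itself; the sketch below only illustrates the general shape of a Direct-Preference-Optimization-style objective applied to concept predictions (preferring one candidate concept labelling over another), with a reference model included as in standard DPO. It is an assumption-laden stand-in, not the paper's objective:

```python
# Hypothetical DPO-style preference loss over binary concept vectors.
# This is NOT the exact CPO objective; it only contrasts a preference-based
# formulation with plain per-concept BCE.
import torch
import torch.nn.functional as F

def concept_log_prob(logits, concepts):
    """Log-probability of a binary concept vector under independent Bernoullis."""
    return -F.binary_cross_entropy_with_logits(logits, concepts, reduction="none").sum(-1)

def dpo_style_concept_loss(logits, ref_logits, c_preferred, c_dispreferred, beta=1.0):
    """Prefer the labelling c_preferred over c_dispreferred, measured relative to a
    frozen reference predictor (ref_logits), as in standard DPO."""
    margin = (concept_log_prob(logits, c_preferred) - concept_log_prob(ref_logits, c_preferred)
              - concept_log_prob(logits, c_dispreferred) + concept_log_prob(ref_logits, c_dispreferred))
    return -F.logsigmoid(beta * margin).mean()
```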
- North America > United States > California (0.04)
- North America > Canada > Quebec > Montreal (0.04)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- Europe > France > Île-de-France > Paris > Paris (0.04)
- Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
- Information Technology > Artificial Intelligence > Natural Language (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.93)
- Information Technology > Sensing and Signal Processing > Image Processing (0.92)
Interpretable Reward Modeling with Active Concept Bottlenecks
Laguna, Sonia, Kobalczyk, Katarzyna, Vogt, Julia E., Van der Schaar, Mihaela
We introduce Concept Bottleneck Reward Models (CB-RM), a reward modeling framework that enables interpretable preference learning through selective concept annotation. Unlike standard RLHF methods that rely on opaque reward functions, CB-RM decomposes reward prediction into human-interpretable concepts. To make this framework efficient in low-supervision settings, we formalize an active learning strategy that dynamically acquires the most informative concept labels. We propose an acquisition function based on Expected Information Gain and show that it significantly accelerates concept learning without compromising preference accuracy. Evaluated on the UltraFeedback dataset, our method outperforms baselines in interpretability and sample efficiency, marking a step towards more transparent, auditable, and human-aligned reward models.
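A sketch of one common way to estimate expected information gain when deciding which (example, concept) pair to annotate next, using a BALD-style approximation over Monte Carlo predictions (e.g., from dropout). The paper's exact acquisition function may differ; everything below is illustrative:

```python
# BALD-style EIG estimate per (example, concept): entropy of the mean prediction
# minus the mean per-sample entropy. Higher values = more informative to label.
import torch

def binary_entropy(p, eps=1e-8):
    p = p.clamp(eps, 1 - eps)
    return -(p * p.log() + (1 - p) * (1 - p).log())

def concept_eig(mc_probs):
    """mc_probs: (n_mc_samples, batch, n_concepts) predicted concept probabilities."""
    mean_p = mc_probs.mean(dim=0)
    return binary_entropy(mean_p) - binary_entropy(mc_probs).mean(dim=0)

def select_annotations(mc_probs, k):
    """Return the k (example_index, concept_index) pairs with the highest EIG."""
    eig = concept_eig(mc_probs)
    top = eig.flatten().topk(k).indices
    n_concepts = eig.shape[1]
    return [(int(i) // n_concepts, int(i) % n_concepts) for i in top]
```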
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.28)
- Europe > Switzerland > Zürich > Zürich (0.14)
- North America > Canada (0.04)
Concept Bottleneck Large Language Models
Sun, Chung-En, Oikarinen, Tuomas, Ustun, Berk, Weng, Tsui-Wei
We introduce the Concept Bottleneck Large Language Model (CB-LLM), a pioneering approach to building inherently interpretable Large Language Models (LLMs). Unlike traditional black-box LLMs, which rely on post-hoc interpretation methods that offer limited insight into neuron function, CB-LLM sets a new standard with its built-in interpretability, scalability, and ability to provide clear, accurate explanations. We investigate two essential NLP tasks: text classification and text generation. In text classification, CB-LLM narrows the performance gap with traditional black-box models while providing clear interpretability. In text generation, we show how interpretable neurons in CB-LLM can be used for concept detection and for steering text generation. Our CB-LLMs enable greater interaction between humans and LLMs across a variety of tasks, a capability notably absent in existing LLMs.
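A minimal sketch of the concept-bottleneck idea for the text-classification case: class logits are computed only from named, human-readable concept activations, and setting an activation by hand steers the prediction. The layer sizes, concept names, and interface are assumptions for illustration, not the published CB-LLM architecture:

```python
# Concept-bottleneck head on top of a (frozen) text encoder: every class decision
# flows through named concept neurons, which can be read out or overridden.
import torch
import torch.nn as nn

class ConceptBottleneckHead(nn.Module):
    def __init__(self, hidden_dim, concept_names, num_classes):
        super().__init__()
        self.concept_names = list(concept_names)
        self.to_concepts = nn.Linear(hidden_dim, len(self.concept_names))  # one neuron per concept
        self.to_classes = nn.Linear(len(self.concept_names), num_classes)  # uses concepts only

    def forward(self, hidden, intervene=None):
        concepts = torch.sigmoid(self.to_concepts(hidden))
        if intervene:  # e.g. {"mentions safety risk": 1.0} to steer the output
            concepts = concepts.clone()
            for name, value in intervene.items():
                concepts[..., self.concept_names.index(name)] = value
        return self.to_classes(concepts), concepts
```

Because the class head sees only the concept vector, inspecting or overriding individual concept activations is what gives the interpretability and steering behaviour the abstract describes.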
- Asia > Middle East > Iraq > Baghdad Governorate > Baghdad (0.04)
- North America > United States > Illinois > Cook County > Chicago (0.04)
- Europe > Spain (0.04)
- (24 more...)
- Research Report > Promising Solution (0.34)
- Overview > Innovation (0.34)
- Media (1.00)
- Banking & Finance (1.00)
- Education (0.93)
- (3 more...)