ConceptScope: Characterizing Dataset Bias via Disentangled Visual Concepts

Jun-14-2026, 17:37:53 GMT–Neural Information Processing Systems

Dataset bias, where data points are skewed toward certain concepts, is ubiquitous in machine learning datasets. Yet systematically identifying these biases is challenging without costly, fine-grained attribute annotations. We present ConceptScope, a scalable and automated framework for analyzing visual datasets by discovering and quantifying human-interpretable concepts using Sparse Autoencoders trained on representations from vision foundation models. ConceptScope categorizes concepts into target, context, and bias types based on their semantic relevance and statistical correlation to class labels, enabling class-level dataset characterization, bias identification, and robustness evaluation through concept-based subgrouping.
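
The categorization step described above can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the decision rule, so the rule below (semantically relevant concepts are "target"; the rest are "bias" if strongly correlated with the class and "context" otherwise), the thresholds, and the function names `phi_correlation` and `categorize_concept` are all assumptions made for illustration. Concept presence is assumed to be already binarized per image, for example from thresholded Sparse Autoencoder activations.

```python
from math import sqrt


def phi_correlation(concept_present, is_class):
    """Pearson correlation between two equal-length 0/1 sequences (phi coefficient)."""
    n = len(concept_present)
    mean_x = sum(concept_present) / n
    mean_y = sum(is_class) / n
    cov = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(concept_present, is_class)) / n
    # For 0/1 variables the variance equals mean * (1 - mean).
    var_x = mean_x * (1 - mean_x)
    var_y = mean_y * (1 - mean_y)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant variable carries no correlation signal
    return cov / sqrt(var_x * var_y)


def categorize_concept(relevance, correlation,
                       relevance_threshold=0.5, correlation_threshold=0.3):
    """Assign a concept to 'target', 'context', or 'bias' for one class.

    ASSUMPTION: both thresholds and the rule itself are hypothetical;
    `relevance` stands in for any semantic-relevance score in [0, 1].
    """
    if relevance >= relevance_threshold:
        return "target"
    if correlation >= correlation_threshold:
        return "bias"
    return "context"


# Toy data: six images, the first three labelled with the class of interest.
is_bird = [1, 1, 1, 0, 0, 0]
water_present = [1, 1, 1, 0, 0, 0]   # co-occurs with the class in every image
grass_present = [1, 0, 1, 1, 0, 1]   # spread evenly across both groups

water_corr = phi_correlation(water_present, is_bird)
grass_corr = phi_correlation(grass_present, is_bird)

water_type = categorize_concept(relevance=0.1, correlation=water_corr)
grass_type = categorize_concept(relevance=0.1, correlation=grass_corr)
wing_type = categorize_concept(relevance=0.9, correlation=1.0)
```

In this toy setup a background concept that appears only alongside the class is flagged as a candidate bias, while an equally irrelevant concept that is spread evenly across classes is treated as context.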

large language model, machine learning, natural language

Country:
- Europe (0.46)
- North America (0.28)

Genre:
- Research Report > Experimental Study (1.00)

Industry:
- Health & Medicine (0.67)
- Transportation > Ground
  - Rail (0.93)

Technology:
- Information Technology
  - Sensing and Signal Processing > Image Processing (1.00)
  - Data Science (1.00)
  - Artificial Intelligence
    - Vision (1.00)
    - Representation & Reasoning (1.00)
    - Natural Language > Large Language Model (1.00)
    - Machine Learning
      - Neural Networks > Deep Learning (1.00)
      - Performance Analysis > Accuracy (0.67)
