Towards scientific discovery with dictionary learning: Extracting biological concepts from microscopy foundation models
Konstantin Donhauser, Kristina Ulicna, Gemma Elyse Moran, Aditya Ravuri, Kian Kenyon-Dean, Cian Eastwood, Jason Hartford
Dictionary learning (DL) has emerged as a powerful interpretability tool for large language models. By extracting known concepts (e.g., the Golden Gate Bridge) from human-interpretable data (e.g., text), sparse DL can elucidate a model's inner workings. In this work, we ask if DL can also be used to discover unknown concepts from less human-interpretable scientific data (e.g., cell images), ultimately enabling modern approaches to scientific discovery. As a first step, we use DL algorithms to study microscopy foundation models trained on multi-cell image data, where little prior knowledge exists regarding which high-level concepts should arise. We show that sparse dictionaries indeed extract biologically meaningful concepts such as cell type and genetic perturbation type. We also propose a new DL algorithm, Iterative Codebook Feature Learning (ICFL), and combine it with a pre-processing step that uses PCA whitening from a control dataset. In our experiments, we demonstrate that both ICFL and PCA improve the selectivity of extracted features compared to TopK sparse autoencoders.
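As a rough illustration of two ingredients named in the abstract, the sketch below fits a PCA whitening transform on a control set and applies a TopK sparsity step of the kind used by the TopK sparse autoencoder baseline. It is not the authors' implementation and does not implement ICFL. The function names, the `eps` regularizer, the synthetic data, the random dictionary, and all dimensions are assumptions made for the example.

```python
import numpy as np

def fit_pca_whitening(control, eps=1e-6):
    """Fit a PCA whitening transform on control embeddings of shape (n, dim)."""
    mean = control.mean(axis=0)
    centered = control - mean
    cov = centered.T @ centered / (control.shape[0] - 1)
    eigvals, eigvecs = np.linalg.eigh(cov)  # symmetric eigendecomposition
    # Rescale each principal direction to (approximately) unit variance.
    W = eigvecs / np.sqrt(eigvals + eps)
    return mean, W

def whiten(x, mean, W):
    """Apply the control-fitted transform to any batch of embeddings."""
    return (x - mean) @ W

def topk_activation(pre, k):
    """Keep the k largest entries in each row and zero out the rest."""
    idx = np.argpartition(pre, -k, axis=1)[:, -k:]
    rows = np.arange(pre.shape[0])[:, None]
    out = np.zeros_like(pre)
    out[rows, idx] = pre[rows, idx]
    return out

# Synthetic stand-in for control embeddings (hypothetical data).
rng = np.random.default_rng(0)
mix = np.eye(8) + 0.3 * np.triu(np.ones((8, 8)), 1)  # correlates the features
control = rng.normal(size=(500, 8)) @ mix

mean, W = fit_pca_whitening(control)
white = whiten(control, mean, W)

# Random (untrained) dictionary, only to show the shape of sparse codes.
encoder = rng.normal(size=(8, 32))
codes = topk_activation(white @ encoder, k=4)
```

In a real pipeline the encoder would be learned rather than random, and the whitening statistics fitted on control samples would be reused unchanged for perturbed samples.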
Dec-19-2024
- Country:
  - Pacific Ocean > North Pacific Ocean > San Francisco Bay > Golden Gate (0.24)
  - Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
  - Europe > Switzerland > Zürich > Zürich (0.04)
  - Europe > Slovenia > Drava > Municipality of Benedikt > Benedikt (0.04)
- Genre:
  - Research Report > New Finding (0.87)