Masked Autoencoders are Scalable Learners of Cellular Morphology

Oren Kraus, Kian Kenyon-Dean, Saber Saberian, Maryam Fallah, Peter McLean, Jess Leung, Vasudev Sharma, Ayla Khan, Jia Balakrishnan, Safiye Celik, Maciej Sypetkowski, Chi Vicky Cheng, Kristen Morse, Maureen Makes, Ben Mabey, Berton Earnshaw

arXiv.org Artificial Intelligence 

Inferring biological relationships from cellular phenotypes in high-content microscopy screens presents both a significant opportunity and a challenge for biological research. Prior results have shown that deep vision models capture biological signal better than hand-crafted features. This work explores how self-supervised deep learning approaches scale when training larger models on larger microscopy datasets. Our results show that both CNN- and ViT-based masked autoencoders significantly outperform weakly supervised baselines. At the high end of our scale, a ViT-L/8 trained on over 3.5 billion unique crops sampled from 93 million microscopy images achieves relative improvements as high as 28% over our best weakly supervised baseline at inferring known biological relationships curated from public databases. Relevant code and select models released with this work can be found at: https://github.com/recursionpharma/maes_microscopy.
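To make the masked-autoencoder setup concrete, the sketch below shows the core pretraining preprocessing step: splitting an image into non-overlapping patches and randomly masking most of them, so that the encoder sees only the visible subset and the decoder is trained to reconstruct the rest. This is a minimal NumPy illustration, not the authors' released implementation; the patch size of 8 mirrors the ViT-L/8 configuration, while the 64x64 crop size, 6-channel imagery, and 75% mask ratio are illustrative assumptions.

```python
import numpy as np

def patchify(img, patch):
    """Split an (H, W, C) image into non-overlapping flattened patches
    of shape (num_patches, patch * patch * C)."""
    H, W, C = img.shape
    gh, gw = H // patch, W // patch
    x = img[:gh * patch, :gw * patch].reshape(gh, patch, gw, patch, C)
    return x.transpose(0, 2, 1, 3, 4).reshape(gh * gw, patch * patch * C)

def random_mask(patches, mask_ratio=0.75, rng=None):
    """Randomly keep (1 - mask_ratio) of the patches; the encoder only
    processes the kept patches, and the decoder reconstructs the rest."""
    rng = rng or np.random.default_rng(0)
    n = patches.shape[0]
    n_keep = int(n * (1 - mask_ratio))
    perm = rng.permutation(n)
    keep_idx, mask_idx = perm[:n_keep], perm[n_keep:]
    return patches[keep_idx], keep_idx, mask_idx

# Illustrative 6-channel microscopy crop (assumed size, not from the paper).
img = np.random.default_rng(1).random((64, 64, 6))
patches = patchify(img, patch=8)                      # 8x8 grid -> 64 patches, each 8*8*6 = 384 values
visible, keep_idx, mask_idx = random_mask(patches)    # 16 visible, 48 masked at a 75% mask ratio
print(patches.shape, visible.shape)                   # (64, 384) (16, 384)
```

In the full method, the visible patches are embedded and fed through the ViT encoder, and a lightweight decoder predicts the pixel values of the masked patches; the reconstruction loss on those masked patches is the self-supervised training signal.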