Witowski, Jan
Multi-modal AI for comprehensive breast cancer prognostication
Witowski, Jan, Zeng, Ken, Cappadona, Joseph, Elayoubi, Jailan, Chiru, Elena Diana, Chan, Nancy, Kang, Young-Joon, Howard, Frederick, Ostrovnaya, Irina, Fernandez-Granda, Carlos, Schnabel, Freya, Ozerdem, Ugur, Liu, Kangning, Steinsnyder, Zoe, Thakore, Nitya, Sadic, Mohammad, Yeung, Frank, Liu, Elisa, Hill, Theodore, Swett, Benjamin, Rigau, Danielle, Clayburn, Andrew, Speirs, Valerie, Vetter, Marcus, Sojak, Lina, Soysal, Simone Muenst, Baumhoer, Daniel, Choucair, Khalil, Zong, Yu, Daoud, Lina, Saad, Anas, Abdulsattar, Waleed, Beydoun, Rafic, Pan, Jia-Wern, Makmur, Haslina, Teo, Soo-Hwang, Pak, Linda Ma, Angel, Victor, Zilenaite-Petrulaitiene, Dovile, Laurinavicius, Arvydas, Klar, Natalie, Piening, Brian D., Bifulco, Carlo, Jun, Sun-Young, Yi, Jae Pak, Lim, Su Hyun, Brufsky, Adam, Esteva, Francisco J., Pusztai, Lajos, LeCun, Yann, Geras, Krzysztof J.
Treatment selection in breast cancer is guided by molecular subtypes and clinical characteristics. Recurrence risk assessment plays a crucial role in personalizing treatment. Current methods, including genomic assays, have limited accuracy and clinical utility, leading to suboptimal decisions for many patients. We developed a test for breast cancer patient stratification based on digital pathology and clinical characteristics using novel AI methods. Specifically, we utilized a vision transformer-based pan-cancer foundation model trained with self-supervised learning to extract features from digitized H&E-stained slides. These features were integrated with clinical data to form a multi-modal AI test predicting cancer recurrence and death. The test was developed and evaluated using data from a total of 8,161 breast cancer patients across 15 cohorts originating from seven countries. Of these, 3,502 patients from five cohorts were used exclusively for evaluation, while the remaining patients were used for training. Our test accurately predicted our primary endpoint, disease-free interval, in the five external cohorts (C-index: 0.71 [0.68-0.75], HR: 3.63 [3.02-4.37, p<0.01]). In a direct comparison (N=858), the AI test was more accurate than Oncotype DX, the standard-of-care 21-gene assay, with a C-index of 0.67 [0.61-0.74] versus 0.61 [0.49-0.73]. Additionally, the AI test added independent information to Oncotype DX in a multivariate analysis (HR: 3.11 [1.91-5.09, p<0.01]). The test demonstrated robust accuracy across all major breast cancer subtypes, including triple-negative breast cancer (TNBC; C-index: 0.71 [0.62-0.81], HR: 3.81 [2.35-6.17, p=0.02]), where no diagnostic tools are currently recommended by clinical guidelines. These results suggest that our AI test can improve accuracy, extend applicability to a wider range of patients, and enhance access to treatment selection tools.
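As a rough illustration of the kind of model this abstract describes, the sketch below fuses slide-level embeddings (standing in for pathology foundation-model features) with clinical covariates into a single risk score, trains it with a Cox partial likelihood, and evaluates it with a concordance index. The feature dimensions, layer sizes, loss, and training loop are illustrative assumptions in PyTorch, not the published test.

```python
# Minimal sketch, assuming slide-level foundation-model embeddings and tabular
# clinical covariates; all names and dimensions are hypothetical.
import torch
import torch.nn as nn


class MultiModalRiskModel(nn.Module):
    def __init__(self, slide_dim=768, clinical_dim=16, hidden_dim=128):
        super().__init__()
        self.fusion = nn.Sequential(
            nn.Linear(slide_dim + clinical_dim, hidden_dim),
            nn.ReLU(),
            nn.Dropout(0.25),
            nn.Linear(hidden_dim, 1),  # scalar log-risk per patient
        )

    def forward(self, slide_features, clinical_features):
        x = torch.cat([slide_features, clinical_features], dim=-1)
        return self.fusion(x).squeeze(-1)


def cox_partial_likelihood_loss(risk, time, event):
    """Negative Cox partial log-likelihood (Breslow approximation).

    risk:  (N,) predicted log-risk scores
    time:  (N,) follow-up times
    event: (N,) 1 if the event (e.g., recurrence) occurred, 0 if censored
    """
    order = torch.argsort(time, descending=True)  # build risk sets via cumulative sums
    risk, event = risk[order], event[order]
    log_cumsum = torch.logcumsumexp(risk, dim=0)
    return -((risk - log_cumsum) * event).sum() / event.sum().clamp(min=1)


def concordance_index(risk, time, event):
    """Fraction of comparable patient pairs ordered correctly by the risk score."""
    concordant, comparable = 0.0, 0.0
    for i in range(len(time)):
        if event[i] != 1:
            continue
        for j in range(len(time)):
            if time[j] > time[i]:  # patient i had the event earlier than patient j
                comparable += 1
                concordant += float(risk[i] > risk[j]) + 0.5 * float(risk[i] == risk[j])
    return concordant / max(comparable, 1.0)


if __name__ == "__main__":
    torch.manual_seed(0)
    n = 64
    slide = torch.randn(n, 768)      # placeholder slide embeddings
    clinical = torch.randn(n, 16)    # placeholder clinical covariates
    time = torch.rand(n) * 10        # synthetic follow-up times
    event = (torch.rand(n) < 0.4).float()

    model = MultiModalRiskModel()
    opt = torch.optim.Adam(model.parameters(), lr=1e-3)
    for _ in range(50):
        opt.zero_grad()
        loss = cox_partial_likelihood_loss(model(slide, clinical), time, event)
        loss.backward()
        opt.step()

    with torch.no_grad():
        print("C-index on synthetic data:", concordance_index(model(slide, clinical), time, event))
```

Here the C-index is the same metric reported in the abstract: the probability that, for a comparable pair of patients, the one who experienced the event earlier was assigned the higher risk.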
An efficient deep neural network to find small objects in large 3D images
Park, Jungkyu, Chłędowski, Jakub, Jastrzębski, Stanisław, Witowski, Jan, Xu, Yanqi, Du, Linda, Gaddam, Sushma, Kim, Eric, Lewin, Alana, Parikh, Ujas, Plaunova, Anastasia, Chen, Sardius, Millet, Alexandra, Park, James, Pysarenko, Kristine, Patel, Shalin, Goldberg, Julia, Wegener, Melanie, Moy, Linda, Heacock, Laura, Reig, Beatriu, Geras, Krzysztof J.
3D imaging enables accurate diagnosis by providing spatial information about organ anatomy. However, using 3D images to train AI models is computationally challenging because they consist of 10 to 100 times more pixels than their 2D counterparts. To train with high-resolution 3D images, convolutional neural networks typically resort to downsampling them or projecting them to 2D. We propose an effective alternative: a neural network that enables efficient classification of full-resolution 3D medical images. Compared to off-the-shelf convolutional neural networks, our network, the 3D Globally-Aware Multiple Instance Classifier (3D-GMIC), uses 77.98%-90.05% less GPU memory and 91.23%-96.02% less computation. While it is trained only with image-level labels, without segmentation labels, it explains its predictions by providing pixel-level saliency maps. On a dataset collected at NYU Langone Health, including 85,526 patients with full-field 2D mammography (FFDM), synthetic 2D mammography, and 3D mammography, 3D-GMIC achieves an AUC of 0.831 (95% CI: 0.769-0.887) in classifying breasts with malignant findings using 3D mammography. This is comparable to the performance of its 2D counterpart, GMIC, on FFDM (0.816, 95% CI: 0.737-0.878) and synthetic 2D (0.826, 95% CI: 0.754-0.884), demonstrating that 3D-GMIC successfully classifies large 3D images despite focusing computation on a smaller percentage of its input compared to GMIC. Therefore, 3D-GMIC identifies and utilizes extremely small regions of interest from 3D images consisting of hundreds of millions of pixels, dramatically reducing the associated computational challenges. 3D-GMIC generalizes well to BCS-DBT, an external dataset from Duke University Hospital, achieving an AUC of 0.848 (95% CI: 0.798-0.896).
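The two-stage, saliency-guided idea the abstract alludes to can be sketched roughly as follows: a lightweight global module scans the whole 3D volume to produce a coarse saliency map, the most salient locations are cropped as small full-resolution patches, and a higher-capacity local module classifies only those patches. The module architectures, patch counts, patch sizes, and fusion rule below are illustrative assumptions and do not reproduce the published 3D-GMIC network.

```python
# Minimal sketch of a saliency-guided two-stage 3D classifier; all sizes and
# the fusion rule are assumed for illustration.
import torch
import torch.nn as nn


class GlobalModule(nn.Module):
    """Lightweight 3D CNN producing a coarse saliency map over the whole volume."""

    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv3d(1, 8, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv3d(8, 16, 3, stride=2, padding=1), nn.ReLU(),
        )
        self.saliency = nn.Conv3d(16, 1, 1)  # one saliency logit per coarse location

    def forward(self, volume):  # (B, 1, D, H, W) -> (B, 1, D/4, H/4, W/4)
        return self.saliency(self.features(volume))


class LocalModule(nn.Module):
    """Higher-capacity classifier applied only to the selected 3D patches."""

    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv3d(1, 16, 3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool3d(1), nn.Flatten(), nn.Linear(16, 1),
        )

    def forward(self, patches):  # (B*K, 1, d, h, w) -> (B*K, 1)
        return self.net(patches)


def crop_top_k_patches(volume, saliency, k=4, size=(16, 32, 32), stride=4):
    """Crop full-resolution patches centered on the k most salient coarse locations."""
    b, _, D, H, W = volume.shape
    _, _, _, h2, w2 = saliency.shape
    top = saliency.view(b, -1).topk(k, dim=1).indices  # (B, K) flat coarse indices
    patches = []
    for bi in range(b):
        for f in top[bi].tolist():
            # Map the flat coarse index back to full-resolution coordinates.
            zc, yc, xc = (f // (h2 * w2)) * stride, ((f // w2) % h2) * stride, (f % w2) * stride
            z0 = min(max(zc - size[0] // 2, 0), D - size[0])
            y0 = min(max(yc - size[1] // 2, 0), H - size[1])
            x0 = min(max(xc - size[2] // 2, 0), W - size[2])
            patches.append(volume[bi, :, z0:z0 + size[0], y0:y0 + size[1], x0:x0 + size[2]])
    return torch.stack(patches)


if __name__ == "__main__":
    torch.manual_seed(0)
    volume = torch.randn(1, 1, 64, 128, 128)        # toy 3D image; real DBT is far larger
    glob, local = GlobalModule(), LocalModule()

    saliency = glob(volume)                          # coarse saliency over the full volume
    patches = crop_top_k_patches(volume, saliency)   # small full-resolution regions of interest
    local_logits = local(patches)                    # per-patch malignancy logits

    # Fuse global and patch-level evidence (a deliberately simple rule for this sketch).
    prediction = torch.sigmoid(saliency.amax() + local_logits.mean())
    print("toy malignancy probability:", float(prediction))
```

The point of the sketch is the compute pattern: the expensive module never sees the full volume, only a handful of small patches selected by the cheap saliency pass, which is what keeps memory and computation low for inputs with hundreds of millions of pixels.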