Multi-level Multiple Instance Learning with Transformer for Whole Slide Image Classification
Zhang, Ruijie, Zhang, Qiaozhe, Liu, Yingzhuang, Xin, Hao, Liu, Yan, Wang, Xinggang
arXiv.org Artificial Intelligence
A whole slide image (WSI) is a high-resolution scanned tissue image that is extensively employed in computer-assisted diagnosis (CAD). The extremely high resolution and the limited availability of region-level annotations make it challenging to apply deep learning methods to WSI-based digital diagnosis. Recently, integrating multiple instance learning (MIL) with Transformers for WSI analysis has shown very promising results. However, designing effective Transformers for this weakly supervised, high-resolution image analysis is an underexplored yet important problem. In this paper, we propose a Multi-level MIL (MMIL) scheme that introduces a hierarchical structure to MIL, enabling efficient handling of MIL tasks involving a large number of instances. Based on MMIL, we instantiate MMIL-Transformer, an efficient Transformer model with windowed exact self-attention for large-scale MIL tasks. To validate its effectiveness, we conduct a set of experiments on WSI classification tasks, where MMIL-Transformer demonstrates superior performance compared to existing state-of-the-art methods: 96.80% test AUC and 97.67% test accuracy on the CAMELYON16 dataset, and 99.04% test AUC and 94.37% test accuracy on the TCGA-NSCLC dataset. All code and pre-trained models are available at: https://github.com/hustvl/MMIL-Transformer
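To make the idea of a hierarchical MIL structure with windowed exact self-attention more concrete, the following is a minimal illustrative sketch and not the authors' implementation. It assumes a two-level hierarchy, fixed-size contiguous windows, identity query/key/value projections, and mean pooling; the names `self_attention`, `mean_pool`, and `two_level_mil` are hypothetical. Under these assumptions, a bag of n instances with window size w needs roughly n·w + (n/w)² pairwise attention scores instead of n²; for n = 1000 and w = 50 that is 50,400 rather than 1,000,000.

```python
import math
import random


def softmax(xs):
    # Numerically stable softmax over a list of floats.
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(tokens):
    # Exact scaled dot-product self-attention.
    # Assumption: identity Q/K/V projections, single head (illustration only).
    d = len(tokens[0])
    out = []
    for q in tokens:
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d) for k in tokens]
        weights = softmax(scores)
        out.append([sum(w * v[j] for w, v in zip(weights, tokens)) for j in range(d)])
    return out


def mean_pool(tokens):
    # Average a list of equal-length vectors into one vector.
    n = len(tokens)
    return [sum(t[j] for t in tokens) / n for j in range(len(tokens[0]))]


def two_level_mil(instances, window_size):
    # Level 1: split the bag into contiguous windows, run exact attention
    # inside each window, and pool each window into one summary token.
    windows = [instances[i:i + window_size]
               for i in range(0, len(instances), window_size)]
    window_tokens = [mean_pool(self_attention(w)) for w in windows]
    # Level 2: run exact attention across the window tokens and pool
    # them into a single bag-level embedding.
    bag_embedding = mean_pool(self_attention(window_tokens))
    return bag_embedding, len(windows)


# Hypothetical bag: 1000 instance features of dimension 8.
random.seed(0)
bag = [[random.gauss(0.0, 1.0) for _ in range(8)] for _ in range(1000)]
embedding, n_windows = two_level_mil(bag, window_size=50)
```

In a real model the bag-level embedding would feed a classifier head, and the projections, grouping strategy, and pooling would be learned; those details are specified in the paper and the linked repository rather than here.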
Sep-5-2023
- Country:
- Asia > China (0.14)
- South America > Peru (0.14)
- Genre:
- Research Report > Promising Solution (0.34)
- Industry:
- Health & Medicine > Therapeutic Area > Oncology (1.00)
- Technology: