SAR-RARP50: Segmentation of surgical instrumentation and Action Recognition on Robot-Assisted Radical Prostatectomy Challenge
Psychogyios, Dimitrios, Colleoni, Emanuele, Van Amsterdam, Beatrice, Li, Chih-Yang, Huang, Shu-Yu, Li, Yuchong, Jia, Fucang, Zou, Baosheng, Wang, Guotai, Liu, Yang, Boels, Maxence, Huo, Jiayu, Sparks, Rachel, Dasgupta, Prokar, Granados, Alejandro, Ourselin, Sebastien, Xu, Mengya, Wang, An, Wu, Yanan, Bai, Long, Ren, Hongliang, Yamada, Atsushi, Harai, Yuriko, Ishikawa, Yuto, Hayashi, Kazuyuki, Simoens, Jente, DeBacker, Pieter, Cisternino, Francesco, Furnari, Gabriele, Mottrie, Alex, Ferraguti, Federica, Kondo, Satoshi, Kasai, Satoshi, Hirasawa, Kousuke, Kim, Soohee, Lee, Seung Hyun, Lee, Kyu Eun, Kong, Hyoun-Joong, Fu, Kui, Li, Chao, An, Shan, Krell, Stefanie, Bodenstedt, Sebastian, Ayobi, Nicolas, Perez, Alejandra, Rodriguez, Santiago, Puentes, Juanita, Arbelaez, Pablo, Mohareri, Omid, Stoyanov, Danail
Surgical tool segmentation and action recognition are fundamental building blocks in many computer-assisted intervention applications, ranging from surgical skills assessment to decision support systems. Learning-based action recognition and segmentation approaches now outperform classical methods, but they rely on large annotated datasets. Furthermore, action recognition and tool segmentation algorithms are often trained and make predictions in isolation from each other, without exploiting potential cross-task relationships. With the EndoVis 2022 SAR-RARP50 challenge, we release the first multimodal, publicly available, in-vivo dataset for surgical action recognition and semantic instrumentation segmentation, containing 50 suturing video segments of Robot-Assisted Radical Prostatectomy (RARP). The aim of the challenge is twofold: first, to enable researchers to leverage the scale of the provided dataset and develop robust, highly accurate single-task action recognition and tool segmentation approaches in the surgical domain; second, to further explore the potential of multitask learning approaches and determine their comparative advantage over their single-task counterparts. A total of 12 teams participated in the challenge, contributing 7 action recognition methods, 9 instrument segmentation techniques, and 4 multitask approaches that integrated both action recognition and instrument segmentation. The complete SAR-RARP50 dataset is available at: https://rdr.ucl.ac.uk/projects/SARRARP50_Segmentation_of_surgical_instrumentation_and_Action_Recognition_on_Robot-Assisted_Radical_Prostatectomy_Challenge/191091
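To make the multitask setting concrete, the sketch below shows a minimal shared-encoder model with a per-pixel tool segmentation head and a clip-level action recognition head. This is an illustrative assumption, not a challenge baseline or any participant's method; the architecture, class counts, and all names are hypothetical placeholders in a PyTorch-style setup.

```python
import torch
import torch.nn as nn

class MultiTaskSurgicalNet(nn.Module):
    """Illustrative shared-encoder multitask model: tool segmentation
    plus action recognition from the same features (hypothetical)."""

    def __init__(self, num_tool_classes: int, num_actions: int):
        super().__init__()
        # Shared convolutional encoder over video frames.
        self.encoder = nn.Sequential(
            nn.Conv2d(3, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(64, 128, 3, stride=2, padding=1), nn.ReLU(),
        )
        # Segmentation head: per-pixel logits, upsampled to input size.
        self.seg_head = nn.Sequential(
            nn.Conv2d(128, num_tool_classes, 1),
            nn.Upsample(scale_factor=4, mode="bilinear", align_corners=False),
        )
        # Action head: pool spatially, classify the frame/clip.
        self.action_head = nn.Sequential(
            nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(128, num_actions)
        )

    def forward(self, frames: torch.Tensor):
        feats = self.encoder(frames)  # (B, 128, H/4, W/4)
        return self.seg_head(feats), self.action_head(feats)

# Example forward pass with placeholder class counts.
model = MultiTaskSurgicalNet(num_tool_classes=10, num_actions=8)
seg_logits, action_logits = model(torch.randn(2, 3, 256, 256))
```

Joint training would sum a per-pixel segmentation loss and an action classification loss over the two outputs, letting the shared encoder exploit the cross-task structure the challenge sets out to study.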
AdaEmbed: Semi-supervised Domain Adaptation in the Embedding Space
Mottaghi, Ali, Jamal, Mohammad Abdullah, Yeung, Serena, Mohareri, Omid
Semi-supervised domain adaptation (SSDA) is a critical hurdle in computer vision, especially given the frequent scarcity of labeled data in real-world settings. This scarcity often causes foundation models, trained on extensive datasets, to underperform when applied to new domains. AdaEmbed, our newly proposed SSDA method, offers a promising solution to these challenges. Leveraging unlabeled data, AdaEmbed transfers knowledge from a labeled source domain to an unlabeled target domain by learning a shared embedding space. By generating accurate and uniform pseudo-labels based on this embedding space, the model overcomes the limitations of conventional SSDA and significantly improves performance. The method's effectiveness is validated through extensive experiments on benchmark datasets such as DomainNet, Office-Home, and VisDA-C, where AdaEmbed consistently outperforms all baselines, setting a new state of the art for SSDA. With its straightforward implementation and high data efficiency, AdaEmbed is a robust and practical solution for real-world scenarios where labeled data is scarce. To foster further research and application in this area, we are sharing the codebase of our unified framework for semi-supervised domain adaptation.
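As a rough illustration of the pseudo-labeling idea described above (not the released AdaEmbed code), the following PyTorch-style sketch assigns pseudo-labels to unlabeled target samples via their nearest class prototype in a shared, L2-normalized embedding space. The function name, temperature, and confidence threshold are assumptions made for this example.

```python
import torch
import torch.nn.functional as F

def prototype_pseudo_labels(src_feats, src_labels, tgt_feats,
                            num_classes, temperature=0.1, threshold=0.9):
    """Pseudo-label target features by nearest source-class prototype
    in a shared embedding space (illustrative sketch only)."""
    # Work on the unit hypersphere so cosine similarity is a dot product.
    src_feats = F.normalize(src_feats, dim=1)
    tgt_feats = F.normalize(tgt_feats, dim=1)

    # One prototype per class: mean of its labeled source embeddings.
    # (Assumes every class appears at least once in src_labels.)
    prototypes = torch.stack([src_feats[src_labels == c].mean(dim=0)
                              for c in range(num_classes)])
    prototypes = F.normalize(prototypes, dim=1)

    # Similarity of each target sample to every prototype -> soft assignment.
    sims = tgt_feats @ prototypes.t()          # (N_tgt, num_classes)
    probs = (sims / temperature).softmax(dim=1)
    confidence, pseudo_labels = probs.max(dim=1)

    # Keep only confident assignments; the rest remain unlabeled.
    keep = confidence > threshold
    return pseudo_labels, keep
```

In an SSDA loop, the confidently pseudo-labeled target samples could be mixed into supervised training, with prototypes refreshed as the embedding improves; the "uniform" aspect of the paper's pseudo-labels (balancing assignments across classes) is omitted here for brevity.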