Hajek, Jonas
SurgT challenge: Benchmark of Soft-Tissue Trackers for Robotic Surgery
Cartucho, Joao, Weld, Alistair, Tukra, Samyakh, Xu, Haozheng, Matsuzaki, Hiroki, Ishikawa, Taiyo, Kwon, Minjun, Jang, Yong Eun, Kim, Kwang-Ju, Lee, Gwang, Bai, Bizhe, Kahrs, Lueder, Boecking, Lars, Allmendinger, Simeon, Muller, Leopold, Zhang, Yitong, Jin, Yueming, Bano, Sophia, Vasconcelos, Francisco, Reiter, Wolfgang, Hajek, Jonas, Silva, Bruno, Lima, Estevao, Vilaca, Joao L., Queiros, Sandro, Giannarou, Stamatia
This paper introduces the ``SurgT: Surgical Tracking'' challenge, which was organised in conjunction with MICCAI 2022. There were two purposes for the creation of this challenge: (1) the establishment of the first standardised benchmark for the research community to assess soft-tissue trackers; and (2) to encourage the development of unsupervised deep learning methods, given the lack of annotated data in surgery. A dataset of 157 stereo endoscopic videos from 20 clinical cases, along with stereo camera calibration parameters, was provided. Participants were assigned the task of developing algorithms to track the movement of soft tissues, represented by bounding boxes, in stereo endoscopic videos. At the end of the challenge, the developed methods were assessed on a previously hidden test subset. This assessment uses benchmarking metrics that were purposely developed for this challenge, to verify the efficacy of unsupervised deep learning algorithms in tracking soft tissue. The metric used for ranking the methods was the Expected Average Overlap (EAO) score, which measures the average overlap between a tracker's predicted bounding boxes and the ground truth. The challenge was won by the deep learning submission from ICVS-2Ai, with the highest EAO score of 0.617. This method employs ARFlow to estimate unsupervised dense optical flow from cropped images, using photometric and regularization losses. Second place went to Jmees, with an EAO of 0.583, which applies deep-learning-based surgical tool segmentation on top of a non-deep-learning baseline tracker, CSRT. CSRT by itself scores a similar EAO of 0.563. The results from this challenge show that, currently, non-deep-learning methods are still competitive. The dataset and benchmarking tool created for this challenge have been made publicly available at https://surgt.grand-challenge.org/.
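The ranking metric described above averages the per-frame overlap between predicted and ground-truth bounding boxes. A minimal sketch of such an overlap (intersection-over-union) computation is shown below; the function names and the (x, y, w, h) box convention are illustrative assumptions, not the challenge's official benchmarking tool, and the full EAO additionally averages over sequences of varying length.

```python
def iou(box_a, box_b):
    # Intersection-over-union of two axis-aligned boxes given as (x, y, w, h).
    ax, ay, aw, ah = box_a
    bx, by, bw, bh = box_b
    # Width/height of the intersection rectangle (clamped at zero if disjoint).
    ix = max(0.0, min(ax + aw, bx + bw) - max(ax, bx))
    iy = max(0.0, min(ay + ah, by + bh) - max(ay, by))
    inter = ix * iy
    union = aw * ah + bw * bh - inter
    return inter / union if union > 0 else 0.0

def average_overlap(predicted, ground_truth):
    # Mean IoU over a sequence of per-frame boxes (a simplified stand-in
    # for the per-sequence averaging step of an EAO-style metric).
    scores = [iou(p, g) for p, g in zip(predicted, ground_truth)]
    return sum(scores) / len(scores)
```

For example, two identical boxes score 1.0, while two partially overlapping boxes score between 0 and 1, so a tracker that drifts away from the tissue target is penalised frame by frame.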
CholecTriplet2022: Show me a tool and tell me the triplet -- an endoscopic vision challenge for surgical action triplet detection
Nwoye, Chinedu Innocent, Yu, Tong, Sharma, Saurav, Murali, Aditya, Alapatt, Deepak, Vardazaryan, Armine, Yuan, Kun, Hajek, Jonas, Reiter, Wolfgang, Yamlahi, Amine, Smidt, Finn-Henri, Zou, Xiaoyang, Zheng, Guoyan, Oliveira, Bruno, Torres, Helena R., Kondo, Satoshi, Kasai, Satoshi, Holm, Felix, Özsoy, Ege, Gui, Shuangchun, Li, Han, Raviteja, Sista, Sathish, Rachana, Poudel, Pranav, Bhattarai, Binod, Wang, Ziheng, Rui, Guo, Schellenberg, Melanie, Vilaça, João L., Czempiel, Tobias, Wang, Zhenkun, Sheet, Debdoot, Thapa, Shrawan Kumar, Berniker, Max, Godau, Patrick, Morais, Pedro, Regmi, Sudarshan, Tran, Thuy Nuong, Fonseca, Jaime, Nölke, Jan-Hinrich, Lima, Estevão, Vazquez, Eduard, Maier-Hein, Lena, Navab, Nassir, Mascagni, Pietro, Seeliger, Barbara, Gonzalez, Cristians, Mutter, Didier, Padoy, Nicolas
Formalizing surgical activities as triplets of the instruments used, actions performed, and target anatomies is becoming a gold-standard approach for surgical activity modeling. The benefit is that this formalization helps to obtain a more detailed understanding of tool-tissue interaction, which can be used to develop better artificial intelligence assistance for image-guided surgery. Earlier efforts, including the CholecTriplet challenge introduced in 2021, have put together techniques aimed at recognizing these triplets from surgical footage. Estimating the spatial locations of the triplets as well would offer more precise, intraoperative, context-aware decision support for computer-assisted intervention. This paper presents the CholecTriplet2022 challenge, which extends surgical action triplet modeling from recognition to detection. It includes weakly-supervised bounding box localization of every visible surgical instrument (or tool), as the key actors, and the modeling of each tool-activity in the form of